From haosun at openjdk.org Fri Aug 1 00:04:02 2025 From: haosun at openjdk.org (Hao Sun) Date: Fri, 1 Aug 2025 00:04:02 GMT Subject: RFR: 8361950: Update to use jtreg 8 In-Reply-To: References: Message-ID: <6CRZBpGwS69ZgWKLmIG7UOj9jPqVxvieJKEAT9lao-Y=.af3bdd20-0a45-4928-8c5b-197b377e8ba3@github.com> On Wed, 30 Jul 2025 16:26:55 GMT, Jaikiran Pai wrote: > Hello Hao, > > > Hi, I encountered two jtreg failures with this new version `8` on both AArch64 and x86_64 platforms. > > Note that these two jtreg cases can pass with jtreg `7.5.2+1` version. > > Thank you for bringing this up. I'm able to reproduce this issue with this newer version of jtreg. I'll take a look to see what's going on. Thanks for confirming, Jaikiran. I forgot to mention that I tested `jdk_all, hotspot_all, langtools_all` with jtreg `8` and only found these two test failures. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26261#issuecomment-3141675024 From dlong at openjdk.org Fri Aug 1 00:49:22 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 1 Aug 2025 00:49:22 GMT Subject: RFR: 8278874: tighten VerifyStack constraints [v9] In-Reply-To: References: Message-ID: > The VerifyStack logic in Deoptimization::unpack_frames() attempts to check the expression stack size of the interpreter frame against what GenerateOopMap computes. To do this, it needs to know if the state at the current bci represents the "before" state, meaning the bytecode will be reexecuted, or the "after" state, meaning we will advance to the next bytecode. The old code didn't know how to determine exactly what state we were in, so it checked both. This PR cleans that up, so we only have to compute the oopmap once. It also removes old SPARC support. Dean Long has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 12 additional commits since the last revision: - Merge branch 'openjdk:master' into 8278874-verifystack - more cleanup - simplify is_top_frame - readability suggestion - reviewer suggestions - Update src/hotspot/share/runtime/vframeArray.cpp Co-authored-by: Manuel Hässig - Update src/hotspot/share/runtime/vframeArray.cpp Co-authored-by: Manuel Hässig - better name for frame index - Update src/hotspot/share/runtime/deoptimization.cpp Co-authored-by: Manuel Hässig - fix optimized build - ... and 2 more: https://git.openjdk.org/jdk/compare/91c07ccd...6bfda158 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26121/files - new: https://git.openjdk.org/jdk/pull/26121/files/6257de6c..6bfda158 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26121&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26121&range=07-08 Stats: 59108 lines in 1555 files changed: 33964 ins; 16451 del; 8693 mod Patch: https://git.openjdk.org/jdk/pull/26121.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26121/head:pull/26121 PR: https://git.openjdk.org/jdk/pull/26121 From dlong at openjdk.org Fri Aug 1 02:38:59 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 1 Aug 2025 02:38:59 GMT Subject: RFR: 8278874: tighten VerifyStack constraints [v9] In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 00:49:22 GMT, Dean Long wrote: >> The VerifyStack logic in Deoptimization::unpack_frames() attempts to check the expression stack size of the interpreter frame against what GenerateOopMap computes. To do this, it needs to know if the state at the current bci represents the "before" state, meaning the bytecode will be reexecuted, or the "after" state, meaning we will advance to the next bytecode. The old code didn't know how to determine exactly what state we were in, so it checked both. This PR cleans that up, so we only have to compute the oopmap once. It also removes old SPARC support. 
> > Dean Long has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: > > - Merge branch 'openjdk:master' into 8278874-verifystack > - more cleanup > - simplify is_top_frame > - readability suggestion > - reviewer suggestions > - Update src/hotspot/share/runtime/vframeArray.cpp > > Co-authored-by: Manuel Hässig > - Update src/hotspot/share/runtime/vframeArray.cpp > > Co-authored-by: Manuel Hässig > - better name for frame index > - Update src/hotspot/share/runtime/deoptimization.cpp > > Co-authored-by: Manuel Hässig > - fix optimized build > - ... and 2 more: https://git.openjdk.org/jdk/compare/ad91eaa9...6bfda158 I'm running Graal testing now and I've hit at least one of my new asserts so far. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26121#issuecomment-3141966534 From missa at openjdk.org Fri Aug 1 02:47:00 2025 From: missa at openjdk.org (Mohamed Issa) Date: Fri, 1 Aug 2025 02:47:00 GMT Subject: RFR: 8360559: Optimize Math.sinh for x86 64 bit platforms [v3] In-Reply-To: <_c3ITcxrb66Z6Bq4dJc5uHhPFawXXfhyOhscoGicO3k=.2214b54a-c875-41be-936c-550efd649b91@github.com> References: <_c3ITcxrb66Z6Bq4dJc5uHhPFawXXfhyOhscoGicO3k=.2214b54a-c875-41be-936c-550efd649b91@github.com> Message-ID: On Thu, 31 Jul 2025 21:32:16 GMT, Mohamed Issa wrote: >> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.sinh() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future. >> >> The command to run all range specific micro-benchmarks is posted below. >> >> `make test TEST="micro:SinhPerf.SinhPerfRanges"` >> >> The results of all tests posted below were captured with an [Intel® 
Xeon 8488C](https://advisor.cloudzero.com/aws/ec2/r7i.metal-24xl) using [OpenJDK v26-b4](https://github.com/openjdk/jdk/releases/tag/jdk-26%2B4) as the baseline version. >> >> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides an average uplift of 64% when input values fall into the middle three ranges where heavy computation is required. However, very small inputs and very large inputs show drops of 74% and 66% respectively.
>>
>> | Input range(s) | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :------------------------------------: | :-------------------------------: | :--------------------------------: | :--------: |
>> | [-2^(-28), 2^(-28)] | 844160 | 216029 | 0.26x |
>> | [-22, -2^(-28)], [2^(-28), 22] | 81662 | 157351 | 1.93x |
>> | [-709.78, -22], [22, 709.78] | 119075 | 167635 | 1.41x |
>> | [-710.48, -709.78], [709.78, 710.48] | 111636 | 177125 | 1.59x |
>> | (-INF, -710.48], [710.48, INF) | 959296 | 313839 | 0.33x |
>>
>> Finally, the `jtreg:test/jdk/java/lang/Math/HyperbolicTests.java` test passed with the changes. > > Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision: > > Make sure xmm1 is initialized before potential use in L_2TAG_PACKET_3_0_2 @eme64 Available to run the additional tests? Also, a couple of the pre-submit checks had failures unrelated to the code changes. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26152#issuecomment-3141980741 From stuefe at openjdk.org Fri Aug 1 04:42:53 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 1 Aug 2025 04:42:53 GMT Subject: RFR: 8343218: Disable allocating interface and abstract classes in non-class metaspace In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 18:13:30 GMT, Coleen Phillimore wrote: > This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. > Tested with tier1-4 with the flag off, and 1-4 with the flag on. Good ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26579#pullrequestreview-3077618779 From kbarrett at openjdk.org Fri Aug 1 05:00:59 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 1 Aug 2025 05:00:59 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v13] In-Reply-To: <2HazQXRv2W1X8noflzufBpZEZtxURBLlelofJz36G18=.8f9d4c0e-9af4-4c70-a156-b57b6c9a39d4@github.com> References: <2HazQXRv2W1X8noflzufBpZEZtxURBLlelofJz36G18=.8f9d4c0e-9af4-4c70-a156-b57b6c9a39d4@github.com> Message-ID: On Wed, 30 Jul 2025 14:08:09 GMT, Anton Artemov wrote: > > Maybe we should have an `ATTRIBUTE_NODISCARD` macro with an always empty > > definition, and add it to all the functions being changed? Otherwise, once > > `[[nodiscard]]` becomes available, someone is going to have to hunt down all > > the relevant functions. Once we have C++17 we can revisit. It might take some > > preliminary call-site fixes before we can change the macro (or more likely, > > remove it and just use `[[nodiscard]]` directly). > > @kimbarrett How about to change the return type from bool to some enum class? 
They are not implicitly convertible to integers, so one should not be able to ignore the return value of a function then. And we will not need a `[[nodiscard]]` in this case. I don't see how that helps at all. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-3142176108 From kbarrett at openjdk.org Fri Aug 1 05:05:58 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 1 Aug 2025 05:05:58 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v13] In-Reply-To: References: Message-ID: On Wed, 30 Jul 2025 14:42:24 GMT, Anton Artemov wrote: >> Note that the main point here was the cast to (void) used to indicate that the return value is explicitly discarded. Does that not work with the [[nodiscard]]? > > Sorry, I completely missed that point. Need to check this. Explicitly casting to void is the canonical idiom for ignoring [[nodiscard]]. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2246914427 From kbarrett at openjdk.org Fri Aug 1 05:08:59 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 1 Aug 2025 05:08:59 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v13] In-Reply-To: References: Message-ID: On Wed, 30 Jul 2025 13:32:20 GMT, Anton Artemov wrote: > > 1. I really would like to see all the UL "X failed" log lines removed from this patch. > > @stefank I understand your concerns. I can remove the logging for failed cases, but it will demise the proposed pattern. Without consensus on how to log such cases for JVM users, which is to be done in a later phase, I suggest to put TODO comments to the places where such logging should be done. What do you think? The point of my suggested "dummy" ATTRIBUTE_NODISCARD is that it serves the purpose of such TODO comments. If you undummify that macro then you get warnings at any places that haven't addressed the use of the return value. 
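Editorial illustration, not code from the PR: a minimal sketch of the placeholder-macro idea discussed above, assuming a dummy `ATTRIBUTE_NODISCARD` that expands to nothing until `[[nodiscard]]` can be used. The `USE_NODISCARD` switch and the functions `query_memory` and `checked_query` are hypothetical names chosen for this sketch; the actual HotSpot definition may differ.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical switch, not a real HotSpot build flag: define it once
// C++17 (and therefore [[nodiscard]]) is available.
#ifdef USE_NODISCARD
#define ATTRIBUTE_NODISCARD [[nodiscard]]
#else
// "Dummy" definition: expands to nothing, but still marks the functions.
#define ATTRIBUTE_NODISCARD
#endif

// Hypothetical stand-in for an os:: query; returns false on failure.
ATTRIBUTE_NODISCARD static bool query_memory(size_t& value) {
  value = 4096;
  return true;
}

// Returns the queried value, or 0 if the query failed.
static size_t checked_query() {
  size_t mem = 0;
  // Explicit discard: the cast to void is the idiom that silences
  // [[nodiscard]] when the caller deliberately ignores the result.
  (void)query_memory(mem);
  assert(mem == 4096);
  // Normal use: test the result before trusting the value.
  return query_memory(mem) ? mem : 0;
}
```

With the empty definition every call site compiles silently; switching the macro to `[[nodiscard]]` would make the compiler flag any call that neither checks the result nor casts it to `void`, which is the TODO-like role described in the message.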
------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-3142188027 From jsikstro at openjdk.org Fri Aug 1 07:46:00 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Fri, 1 Aug 2025 07:46:00 GMT Subject: RFR: 8364248: Separate commit and reservation limit detection [v6] In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 10:20:52 GMT, Joel Sikström wrote: >> The function os::has_allocatable_memory_limit() is intended to determine whether there is a system-imposed limit on how much memory can be committed, and if so, what that limit is. On POSIX systems, limiting committable memory is typically enforced by restricting the available virtual address space, such as via RLIMIT_AS. As a result, os::has_allocatable_memory_limit() tells us both how much memory can be committed and how much virtual address space is available. On Windows, os::has_allocatable_memory_limit() always returns true, along with the size of the available virtual address space. This is misleading because it is not possible to limit how much memory can be committed via virtual address space on Windows, and also the virtual address space cannot be limited. >> >> ZGC currently uses os::has_allocatable_memory_limit() to check if the virtual address space is limited (i.e. if there's a limit on how much memory can be reserved). To make it clear that the virtual address space cannot be limited on Windows, I propose that we create a new function called os::reserve_memory_limit(), which returns the upper-bound of how much memory can be reserved. This should just be SIZE_MAX on Windows, since the virtual address space cannot be limited. >> >> Additionally, we should rename os::has_allocatable_memory_limit() to os::commit_memory_limit(), to be more in line with os::reserve_memory_limit() and read nicely next to the os::commit_memory() and os::reserve_memory() functions. 
The return type of the new os::commit_memory_limit() should also only be size_t, not bool + out-parameter, to fit better next to os::reserve_memory_limit(). >> >> As a follow-up, I think it is reasonable to re-visit the implementation of os::has_allocatable_memory_limit() (will be os::commit_memory_limit()) on Windows, since it doesn't follow any user-set limits, apart from how much virtual memory is available. Perhaps looking at limit(s) set by Job Objects could be more fruitful, and would improve the support for native Windows containers (Hyper-V). >> >> Testing: >> * Oracle's tier1-2 >> * Manual testing on Linux by limiting the virtual address space: >> >> $ ulimit -v 8388608 && java -XX:+UseZGC -Xlog:gc+init -version > > Joel Sikström has updated the pull request incrementally with one additional commit since the last revision: > > commit_memory_limit return size_t instead of bool + out-parameter I've rerun testing, both some manual limit-setting and tier1-2, and looks good. Thank you for the reviews! @albertnetymk @tstuefe @stefank ------------- PR Comment: https://git.openjdk.org/jdk/pull/26530#issuecomment-3143569133 From jsikstro at openjdk.org Fri Aug 1 07:46:01 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Fri, 1 Aug 2025 07:46:01 GMT Subject: Integrated: 8364248: Separate commit and reservation limit detection In-Reply-To: References: Message-ID: On Tue, 29 Jul 2025 11:24:48 GMT, Joel Sikström wrote: > The function os::has_allocatable_memory_limit() is intended to determine whether there is a system-imposed limit on how much memory can be committed, and if so, what that limit is. On POSIX systems, limiting committable memory is typically enforced by restricting the available virtual address space, such as via RLIMIT_AS. As a result, os::has_allocatable_memory_limit() tells us both how much memory can be committed and how much virtual address space is available. 
On Windows, os::has_allocatable_memory_limit() always returns true, along with the size of the available virtual address space. This is misleading because it is not possible to limit how much memory can be committed via virtual address space on Windows, and also the virtual address space cannot be limited. > > ZGC currently uses os::has_allocatable_memory_limit() to check if the virtual address space is limited (i.e. if there's a limit on how much memory can be reserved). To make it clear that the virtual address space cannot be limited on Windows, I propose that we create a new function called os::reserve_memory_limit(), which returns the upper-bound of how much memory can be reserved. This should just be SIZE_MAX on Windows, since the virtual address space cannot be limited. > > Additionally, we should rename os::has_allocatable_memory_limit() to os::commit_memory_limit(), to be more in line with os::reserve_memory_limit() and read nicely next to the os::commit_memory() and os::reserve_memory() functions. The return type of the new os::commit_memory_limit() should also only be size_t, not bool + out-parameter, to fit better next to os::reserve_memory_limit(). > > As a follow-up, I think it is reasonable to re-visit the implementation of os::has_allocatable_memory_limit() (will be os::commit_memory_limit()) on Windows, since it doesn't follow any user-set limits, apart from how much virtual memory is available. Perhaps looking at limit(s) set by Job Objects could be more fruitful, and would improve the support for native Windows containers (Hyper-V). > > Testing: > * Oracle's tier1-2 > * Manual testing on Linux by limiting the virtual address space: > > $ ulimit -v 8388608 && java -XX:+UseZGC -Xlog:gc+init -version This pull request has now been integrated. 
Changeset: ae11d8f4 Author: Joel Sikström URL: https://git.openjdk.org/jdk/commit/ae11d8f44689502d35cb511e9ce288ab7cc0acae Stats: 107 lines in 5 files changed: 45 ins; 38 del; 24 mod 8364248: Separate commit and reservation limit detection Reviewed-by: stuefe, ayang ------------- PR: https://git.openjdk.org/jdk/pull/26530 From duke at openjdk.org Fri Aug 1 09:11:54 2025 From: duke at openjdk.org (Anton Artemov) Date: Fri, 1 Aug 2025 09:11:54 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v22] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value types for functions which return memory are changed, namely: > > > static julong available_memory(); --> static bool available_memory(size_t& value); > static julong used_memory(); --> static bool used_memory(size_t& value); > static julong free_memory(); --> static bool free_memory(size_t& value); > static jlong total_swap_space(); --> static bool total_swap_space(size_t& value); > static jlong free_swap_space(); --> static bool free_swap_space(size_t& value); > static julong physical_memory(); --> static size_t physical_memory(size_t& value); > > > The return boolean value, where available, indicates success, whereas the actual value is assigned to the input argument. The following recommended usage pattern is introduced: where applicable, an unsuccessful call is logged. > > `ATTRIBUTE_NODISCARD` macro is added as a placeholder for `[[nodiscard]]`, which will be available with C++17. > > Tested in GHA and Tiers 1-5. 
Anton Artemov has updated the pull request incrementally with two additional commits since the last revision: - 8357086: Removed extra line - 8357086: Made physical_memory() return size_t, added dummy ATTRIBUTE_NODISCARD ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25450/files - new: https://git.openjdk.org/jdk/pull/25450/files/22c752c2..ab62a520 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=21 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=20-21 Stats: 50 lines in 14 files changed: 5 ins; 15 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From duke at openjdk.org Fri Aug 1 09:11:55 2025 From: duke at openjdk.org (Anton Artemov) Date: Fri, 1 Aug 2025 09:11:55 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v18] In-Reply-To: References: <_JFKdc6j9FxrcIvMknk5I3aYHKU_FzJtxiCWrgM1IQc=.38cdd378-f156-43d0-a4ee-e81afdc16f90@github.com> <24xIFc0iPvkbR2lM-x92rIDage4gZ2_VZVu036XKc1o=.d554b735-d60a-4723-a5ac-85c431d0db5e@github.com> Message-ID: On Thu, 31 Jul 2025 15:23:51 GMT, Stefan Karlsson wrote: >> Okay, makes sense, it looks like a failure of `os::physical_memory()` is something very unlikely to happen. So I made this function `void`, but, to be consistent with the rest, it provides the value by writing to the input parameter. > > 1) I had hoped for some kind of sanity check in the initialization code that sets up _physical_memory. We can handle that as a separate RFE. > > 2) I really would prefer if os::physical_memory() returned size_t via the normal means and only use output parameters for the functions that require more than one return value. And on that note ... > > 3) Do the other functions (free_memory and available_memory) ever fail or could we fix those as well? 
Windows and AIX are using the same APIs that are used for os::physical_memory. Linux uses sysinfo, which only seems to fail if we pass in an incorrect value. BSD has an assert that would trigger if this fails. 1) As for sanity checks: - In `os::init()` in os_linux.cpp there is a sanity check for sysconf. It is done just before calling `Linux::initialize_system_info()` where `_physical_memory` is set. - In `os::Aix::initialize_system_info()` there is already a sanity check. - In `os::Bsd::initialize_system_info()` there is a check for success of `sysctl`, but a default value of 256 * 1024 * 1024 is assigned to `_physical_memory`. I do not know why it is done this way. Maybe this needs to be addressed separately. - In `os::win32::initialize_system_info()` there is a call to `GlobalMemoryStatusEx`, but above we came to the conclusion that it never fails. So all OS sysinfo init routines have a sort of sanity check, except BSD, but I guess it was done intentionally. 2) Addressed. 3) As for the rest of the methods: - `available_memory()` can fail on Linux only if sysinfo fails, which can happen. Since it can fail on one OS, then the function should return both success/failure and the resulting value. - `used_memory()` is never used in the code. Maybe it should be completely removed? - `free_memory()`: the same argument applies as to `available_memory()`. - `total_swap_space()`: the same argument applies as to `available_memory()`. - `free_swap_space()`: the same argument applies as to `available_memory()`. 
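Editorial illustration, not code from the PR: a minimal sketch of the "success flag plus out-parameter" shape argued for in point 3, where a query that can fail on at least one platform reports success through its return value and the size through a reference. The names `available_memory_sketch` and `memory_or_default`, and the `simulate_failure` parameter, are hypothetical and exist only to make the sketch testable.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a platform query such as one built on
// sysinfo(); `simulate_failure` only exists to exercise the failure path.
static bool available_memory_sketch(size_t& value, bool simulate_failure) {
  if (simulate_failure) {
    return false;  // `value` is left untouched on failure
  }
  value = 8u * 1024 * 1024;
  return true;
}

// Caller-side pattern: use the value only on success, otherwise fall
// back to a default chosen by the caller.
static size_t memory_or_default(bool simulate_failure, size_t fallback) {
  size_t mem = 0;
  if (!available_memory_sketch(mem, simulate_failure)) {
    return fallback;
  }
  return mem;
}
```

A function that cannot fail on any platform does not need this shape and can return `size_t` directly, which is the distinction drawn for `os::physical_memory()` in point 2.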
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2247391028 From duke at openjdk.org Fri Aug 1 09:19:55 2025 From: duke at openjdk.org (Anton Artemov) Date: Fri, 1 Aug 2025 09:19:55 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v23] In-Reply-To: References: Message-ID: <-SbK2XJEVxUmf9O-3-mnb3-ULRDnmxT-U3RUs_LDdVQ=.1e31ff6e-3455-4cbe-b323-cdecb94fe08e@github.com> > Hi, > > in this PR the output value types for functions which return memory are changed, namely: > > > static julong available_memory(); --> static bool available_memory(size_t& value); > static julong used_memory(); --> static bool used_memory(size_t& value); > static julong free_memory(); --> static bool free_memory(size_t& value); > static jlong total_swap_space(); --> static bool total_swap_space(size_t& value); > static jlong free_swap_space(); --> static bool free_swap_space(size_t& value); > static julong physical_memory(); --> static size_t physical_memory(size_t& value); > > > The return boolean value, where available, indicates success, whereas the actual value is assigned to the input argument. The following recommended usage pattern is introduced: where applicable, an unsuccessful call is logged. > > `ATTRIBUTE_NODISCARD` macro is added as a placeholder for `[[nodiscard]]`, which will be available with C++17. > > Tested in GHA and Tiers 1-5. Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: - 8357086: Fixed merge conflict - 8357086: Removed extra line - 8357086: Made physical_memory() return size_t, added dummy ATTRIBUTE_NODISCARD - 8357086: Small fixes - 8357086: Made physical_memory() return void - 8357086: Fixed behavior of total_swap_space on Linux - 8357086: Addressed reviewer's comments. - 8357086: Fixed void conversion. - 8357086: Addressed reviewer's comments - 8357086: Addressed reviewer's comments - ... 
and 14 more: https://git.openjdk.org/jdk/compare/d80b5c87...f9b2c6d8 ------------- Changes: https://git.openjdk.org/jdk/pull/25450/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=22 Stats: 253 lines in 23 files changed: 92 ins; 2 del; 159 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From duke at openjdk.org Fri Aug 1 09:53:55 2025 From: duke at openjdk.org (duke) Date: Fri, 1 Aug 2025 09:53:55 GMT Subject: RFR: 8364296: Set IntelJccErratumMitigation flag ergonomically In-Reply-To: References: Message-ID: On Wed, 30 Jul 2025 16:27:42 GMT, Oli Gillespie wrote: > We should update the flag if we are using a computed value. Nobody else reads IntelJccErratumMitigation specifically, but we want it to be correctly shown in PrintFlagsFinal and anywhere else these flags are inspected. > > > Intel(R) Xeon(R) Platinum 8259CL > java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep IntelJcc > Before: > bool IntelJccErratumMitigation = true {ARCH diagnostic} {default} > After: > bool IntelJccErratumMitigation = true {ARCH diagnostic} {ergonomic} > > > Even worse when it's actually false, but shows as true: > > > AMD EPYC 7R13 > java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep IntelJcc > Before: > bool IntelJccErratumMitigation = true {ARCH diagnostic} {default} > After: > bool IntelJccErratumMitigation = false {ARCH diagnostic} {ergonomic} @olivergillespie Your change (at version 7f4f5debaa24ce4784372d96d37400de65489481) is now ready to be sponsored by a Committer. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26560#issuecomment-3143948732 From duke at openjdk.org Fri Aug 1 10:09:04 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 1 Aug 2025 10:09:04 GMT Subject: RFR: 8364519: Sort share/classfile includes Message-ID: This PR sorts the includes in hotspot/share/classfile using SortIncludes.java. I'm also adding the directory to TestIncludesAreSorted. Passes tier1. ------------- Commit messages: - sort Changes: https://git.openjdk.org/jdk/pull/26590/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26590&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8364519 Stats: 44 lines in 18 files changed: 22 ins; 21 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26590.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26590/head:pull/26590 PR: https://git.openjdk.org/jdk/pull/26590 From mbaesken at openjdk.org Fri Aug 1 10:27:01 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 1 Aug 2025 10:27:01 GMT Subject: RFR: 8364199: Enhance list of environment variables printed in hserr/hsinfo file In-Reply-To: References: Message-ID: On Mon, 28 Jul 2025 15:09:12 GMT, Matthias Baesken wrote: > We have a number of interesting environment variables influencing the JVM/JDK that could be added to the list of environment variables printed in the hserr/hsinfo file. > JAVA_OPTS is used by lots of tools . > WAYLAND_DISPLAY ( https://github.com/openjdk/jdk/blob/965b68107ffe1c1c988d4faf6d6742629407451b/src/java.desktop/unix/classes/sun/awt/UNIXToolkit.java#L492) > and JDK_AOT_VM_OPTIONS (https://github.com/openjdk/jdk/blob/965b68107ffe1c1c988d4faf6d6742629407451b/src/java.base/share/man/java.md?plain=1#L4052) are referenced in the codebase. Thanks for the reviews ! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26509#issuecomment-3144065194 From mbaesken at openjdk.org Fri Aug 1 10:27:01 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 1 Aug 2025 10:27:01 GMT Subject: Integrated: 8364199: Enhance list of environment variables printed in hserr/hsinfo file In-Reply-To: References: Message-ID: <55hHU099PxC1szUKel9i8vJxmU4CBbFSczs_3aDMPfA=.d9a5fd6a-f6ba-4313-9e1d-a1737f16275d@github.com> On Mon, 28 Jul 2025 15:09:12 GMT, Matthias Baesken wrote: > We have a number of interesting environment variables influencing the JVM/JDK that could be added to the list of environment variables printed in the hserr/hsinfo file. > JAVA_OPTS is used by lots of tools . > WAYLAND_DISPLAY ( https://github.com/openjdk/jdk/blob/965b68107ffe1c1c988d4faf6d6742629407451b/src/java.desktop/unix/classes/sun/awt/UNIXToolkit.java#L492) > and JDK_AOT_VM_OPTIONS (https://github.com/openjdk/jdk/blob/965b68107ffe1c1c988d4faf6d6742629407451b/src/java.base/share/man/java.md?plain=1#L4052) are referenced in the codebase. This pull request has now been integrated. Changeset: 812bd8e9 Author: Matthias Baesken URL: https://git.openjdk.org/jdk/commit/812bd8e94d22f9751651e28a2ef8affdf6a33220 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod 8364199: Enhance list of environment variables printed in hserr/hsinfo file Reviewed-by: lucy, clanger ------------- PR: https://git.openjdk.org/jdk/pull/26509 From ogillespie at openjdk.org Fri Aug 1 10:30:00 2025 From: ogillespie at openjdk.org (Oli Gillespie) Date: Fri, 1 Aug 2025 10:30:00 GMT Subject: Integrated: 8364296: Set IntelJccErratumMitigation flag ergonomically In-Reply-To: References: Message-ID: On Wed, 30 Jul 2025 16:27:42 GMT, Oli Gillespie wrote: > We should update the flag if we are using a computed value. Nobody else reads IntelJccErratumMitigation specifically, but we want it to be correctly shown in PrintFlagsFinal and anywhere else these flags are inspected. 
> > > Intel(R) Xeon(R) Platinum 8259CL > java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep IntelJcc > Before: > bool IntelJccErratumMitigation = true {ARCH diagnostic} {default} > After: > bool IntelJccErratumMitigation = true {ARCH diagnostic} {ergonomic} > > > Even worse when it's actually false, but shows as true: > > > AMD EPYC 7R13 > java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep IntelJcc > Before: > bool IntelJccErratumMitigation = true {ARCH diagnostic} {default} > After: > bool IntelJccErratumMitigation = false {ARCH diagnostic} {ergonomic} This pull request has now been integrated. Changeset: 6c580472 Author: Oli Gillespie Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/6c5804722b5b2064e0d6ade2180c3126d8f2dabc Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8364296: Set IntelJccErratumMitigation flag ergonomically Reviewed-by: shade, jbhateja ------------- PR: https://git.openjdk.org/jdk/pull/26560 From shade at openjdk.org Fri Aug 1 11:08:54 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 1 Aug 2025 11:08:54 GMT Subject: RFR: 8364519: Sort share/classfile includes In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 10:03:39 GMT, Francesco Andreuzzi wrote: > This PR sorts the includes in hotspot/share/classfile using SortIncludes.java. I'm also adding the directory to TestIncludesAreSorted. > > Passes tier1. Looks fine, with a nit. src/hotspot/share/classfile/systemDictionary.cpp line 63: > 61: #include "oops/objArrayOop.inline.hpp" > 62: #include "oops/oop.hpp" > 63: #include "oops/oop.inline.hpp" Take this opportunity to remove the include of `oop.hpp`, since we are including `oop.inline.hpp` anyway? ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26590#pullrequestreview-3078791694 PR Review Comment: https://git.openjdk.org/jdk/pull/26590#discussion_r2247700897 From duke at openjdk.org Fri Aug 1 13:03:11 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 1 Aug 2025 13:03:11 GMT Subject: RFR: 8364519: Sort share/classfile includes [v2] In-Reply-To: References: Message-ID: > This PR sorts the includes in hotspot/share/classfile using SortIncludes.java. I'm also adding the directory to TestIncludesAreSorted. > > Passes tier1. Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: remove redundant includes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26590/files - new: https://git.openjdk.org/jdk/pull/26590/files/4629ccf9..3477cbce Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26590&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26590&range=00-01 Stats: 6 lines in 5 files changed: 0 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26590.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26590/head:pull/26590 PR: https://git.openjdk.org/jdk/pull/26590 From duke at openjdk.org Fri Aug 1 13:03:12 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 1 Aug 2025 13:03:12 GMT Subject: RFR: 8364519: Sort share/classfile includes [v2] In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 11:05:35 GMT, Aleksey Shipilev wrote: >> Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: >> >> remove redundant includes > > src/hotspot/share/classfile/systemDictionary.cpp line 63: > >> 61: #include "oops/objArrayOop.inline.hpp" >> 62: #include "oops/oop.hpp" >> 63: #include "oops/oop.inline.hpp" > > Take this opportunity to remove the include of `oop.hpp`, since we are including `oop.inline.hpp` anyway? 
Sure, I got a few here: 3477cbce67776546bbb0cbc0cf18302a6dab2c97 Could this be a possible improvement of `SortIncludes.java`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26590#discussion_r2247927317 From shade at openjdk.org Fri Aug 1 13:07:53 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 1 Aug 2025 13:07:53 GMT Subject: RFR: 8364519: Sort share/classfile includes [v2] In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 13:03:11 GMT, Francesco Andreuzzi wrote: >> This PR sorts the includes in hotspot/share/classfile using SortIncludes.java. I'm also adding the directory to TestIncludesAreSorted. >> >> Passes tier1. > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > remove redundant includes Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26590#pullrequestreview-3079171165 From shade at openjdk.org Fri Aug 1 13:07:54 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 1 Aug 2025 13:07:54 GMT Subject: RFR: 8364519: Sort share/classfile includes [v2] In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 13:00:24 GMT, Francesco Andreuzzi wrote: >> src/hotspot/share/classfile/systemDictionary.cpp line 63: >> >>> 61: #include "oops/objArrayOop.inline.hpp" >>> 62: #include "oops/oop.hpp" >>> 63: #include "oops/oop.inline.hpp" >> >> Take this opportunity to remove the include of `oop.hpp`, since we are including `oop.inline.hpp` anyway? > > Sure, I got a few here: 3477cbce67776546bbb0cbc0cf18302a6dab2c97 > > Could this be a possible improvement of `SortIncludes.java`? Maybe, but it is not exactly the include order issue. We do these clean ups opportunistically, when it is glaringly obvious it is an unnecessary include. 
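The idea raised above, flagging `foo.hpp` when `foo.inline.hpp` is already included, could be sketched roughly as follows. This is a minimal Java illustration, not code from `SortIncludes.java`; the class name `RedundantIncludeFinder` and its method are hypothetical, and it assumes the convention that an `.inline.hpp` header always includes its plain `.hpp` counterpart.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of SortIncludes.java.
final class RedundantIncludeFinder {
    // Matches lines such as: #include "oops/oop.inline.hpp"
    private static final Pattern INCLUDE =
            Pattern.compile("^\\s*#\\s*include\\s+\"([^\"]+)\"");

    // Returns every included "x.hpp" for which "x.inline.hpp" is also included.
    // Assumption: x.inline.hpp itself includes x.hpp, making the latter redundant.
    static List<String> findRedundant(List<String> lines) {
        // First pass: collect all quoted include paths.
        Set<String> included = new HashSet<>();
        for (String line : lines) {
            Matcher m = INCLUDE.matcher(line);
            if (m.find()) {
                included.add(m.group(1));
            }
        }
        // Second pass: report plain headers shadowed by their inline variant.
        List<String> redundant = new ArrayList<>();
        for (String line : lines) {
            Matcher m = INCLUDE.matcher(line);
            if (!m.find()) {
                continue;
            }
            String path = m.group(1);
            if (path.endsWith(".hpp") && !path.endsWith(".inline.hpp")) {
                String stem = path.substring(0, path.length() - ".hpp".length());
                if (included.contains(stem + ".inline.hpp")) {
                    redundant.add(path);
                }
            }
        }
        return redundant;
    }
}
```

Applied to the three includes quoted from `systemDictionary.cpp`, this sketch would report only `oops/oop.hpp`. It deliberately does nothing about include order, which matches the point that this is a separate concern from sorting.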
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26590#discussion_r2247940487 From shade at openjdk.org Fri Aug 1 14:00:03 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 1 Aug 2025 14:00:03 GMT Subject: RFR: 8343218: Disable allocating interface and abstract classes in non-class metaspace In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 18:13:30 GMT, Coleen Phillimore wrote: > This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. > Tested with tier1-4 with the flag off, and 1-4 with the flag on. Generally looks fine. Stylistic/process note, though: the RFE synopsis still reads that we are disabling the optimization. We are not doing it here, we are only guarding it with a flag. Either submit a new RFE or rename this bug? ------------- PR Review: https://git.openjdk.org/jdk/pull/26579#pullrequestreview-3079349456 From coleenp at openjdk.org Fri Aug 1 14:58:53 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 1 Aug 2025 14:58:53 GMT Subject: RFR: 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace In-Reply-To: References: Message-ID: <89cyGwzQ6a3VsmWbL0U2tW-gmDryL6BTqeH8GHUjCQU=.65e2d51f-7250-4e41-a5e3-297a806d5a79@github.com> On Thu, 31 Jul 2025 18:13:30 GMT, Coleen Phillimore wrote: > This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. > Tested with tier1-4 with the flag off, and 1-4 with the flag on. 
Okay, I fixed the titles and will discuss with @vnkozlov and @TobiHartmann next week about getting this into JDK 25 or not. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26579#issuecomment-3144862351 From duke at openjdk.org Fri Aug 1 15:00:03 2025 From: duke at openjdk.org (ExE Boss) Date: Fri, 1 Aug 2025 15:00:03 GMT Subject: RFR: 8364187: Make getClassAccessFlagsRaw non-native [v2] In-Reply-To: <3KAsrKOGuC7XInNaIEmocMu2vX8RTL5dOJhwsnFnSPs=.6c5dc374-6807-416d-9c78-cd664cd04f6f@github.com> References: <84FVULnDO38-jO9EcshCFwsOx9PRZ920y8KwTt5z0xU=.6b4b8adb-b046-481f-b121-dd1ffa0d7a78@github.com> <5rPWGuXUbiRbQ7TlwHJnQDjRYSuWRITbbo0OUtMfhrY=.cfe52d81-3450-477f-b5d8-02b3590971d0@github.com> <3KAsrKOGuC7XInNaIEmocMu2vX8RTL5dOJhwsnFnSPs=.6c5dc374-6807-416d-9c78-cd664cd04f6f@github.com> Message-ID: On Wed, 30 Jul 2025 10:57:33 GMT, Coleen Phillimore wrote: >> I mean the existing private fields of `SharedSecrets`[^1] so that the JIT is able to constant fold calls to `SharedSecrets::getJava*Access()` so that it becomes equally as performant as using a `Holder` class. >> >> Modifiers are transient, as those are sourced from the `InnerClasses` attribute, which can be changed when classes are redefined by a **JVMTI** agent, but access flags can't be changed through redefinition. >> >> [^1]: https://github.com/openjdk/jdk/blob/330ee871315348594171c43aa75b58f6027001af/src/java.base/share/classes/jdk/internal/access/SharedSecrets.java#L62-L92 > > I don't think inner class attributes can be changed via JVMTI RedefineClasses either. I'm pretty sure we check and that's considered a change of schema. File an RFE to make SharedSecrets fields @Stable though for a different change. I don't have a **JBS** account, and I don't like going through , so could someone else file said **RFE**?
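For readers unfamiliar with the `Holder` class comparison made in the quoted text, here is a minimal Java sketch of the two patterns. All names (`GreetingAccess`, `MutableSecrets`, `HolderSecrets`) are hypothetical and none of this is the actual `SharedSecrets` code. The working assumption, as commonly described for HotSpot, is that a `static final` field of an initialized class can be treated as a constant by the JIT, whereas a plain mutable static field generally has to be loaded on each call.

```java
// Hypothetical stand-in for one of the access interfaces; not a JDK type.
interface GreetingAccess {
    String greet(String name);
}

// Pattern 1: a plain mutable static field that is set once at startup.
// The field is not final, so a compiler cannot simply assume it never changes.
final class MutableSecrets {
    private static GreetingAccess greetingAccess;

    static void setGreetingAccess(GreetingAccess access) {
        greetingAccess = access;
    }

    static GreetingAccess getGreetingAccess() {
        return greetingAccess;
    }
}

// Pattern 2: the holder idiom. Holder is initialized lazily on first use,
// and INSTANCE is static final, so its value is fixed after initialization.
final class HolderSecrets {
    private static final class Holder {
        static final GreetingAccess INSTANCE = name -> "Hello, " + name;
    }

    static GreetingAccess getGreetingAccess() {
        return Holder.INSTANCE;
    }
}
```

The `@Stable` annotation mentioned in the thread is internal to the JDK, so it is not used in this sketch; the suggestion in the thread is that marking the mutable fields that way could give the first pattern folding behaviour comparable to the second.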
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26517#discussion_r2248196045 From rriggs at openjdk.org Fri Aug 1 15:12:59 2025 From: rriggs at openjdk.org (Roger Riggs) Date: Fri, 1 Aug 2025 15:12:59 GMT Subject: RFR: 8364187: Make getClassAccessFlagsRaw non-native [v2] In-Reply-To: References: <84FVULnDO38-jO9EcshCFwsOx9PRZ920y8KwTt5z0xU=.6b4b8adb-b046-481f-b121-dd1ffa0d7a78@github.com> <5rPWGuXUbiRbQ7TlwHJnQDjRYSuWRITbbo0OUtMfhrY=.cfe52d81-3450-477f-b5d8-02b3590971d0@github.com> <3KAsrKOGuC7XInNaIEmocMu2vX8RTL5dOJhwsnFnSPs=.6c5dc374-6807-416d-9c78-cd664cd04f6f@github.com> Message-ID: <_0yq0VEEVe1apROVSN5nckimKR_PQOurkq3Rc_qxCwo=.6e7f9c28-6ab6-4561-8336-8133d14badf5@github.com> On Fri, 1 Aug 2025 14:57:03 GMT, ExE Boss wrote: >> I don't think inner class attributes can be changed via JVMTI RedefineClasses either. I'm pretty sure we check and that's considered a change of schema. File an RFE to make SharedSecrets fields @Stable though for a different change. > > I don't have a **JBS** account, and I don't like going through , so could someone else file said **RFE**? Created https://bugs.openjdk.org/browse/JDK-8364540 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26517#discussion_r2248224660 From coleenp at openjdk.org Fri Aug 1 15:25:02 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 1 Aug 2025 15:25:02 GMT Subject: RFR: 8364187: Make getClassAccessFlagsRaw non-native [v5] In-Reply-To: References: <84FVULnDO38-jO9EcshCFwsOx9PRZ920y8KwTt5z0xU=.6b4b8adb-b046-481f-b121-dd1ffa0d7a78@github.com> Message-ID: On Wed, 30 Jul 2025 19:25:34 GMT, Coleen Phillimore wrote: >> This change removes the intrinsic for getClassAccessFlagsRaw for reflection and initializes a rawAccessFlags field in java.lang.Class instead, that Java code can non-natively access. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Restore c2 optimization.
Thank you for reviewing, Tobias, Roger and Chen. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26517#issuecomment-3144938447 From coleenp at openjdk.org Fri Aug 1 15:25:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 1 Aug 2025 15:25:03 GMT Subject: RFR: 8364187: Make getClassAccessFlagsRaw non-native [v2] In-Reply-To: <_0yq0VEEVe1apROVSN5nckimKR_PQOurkq3Rc_qxCwo=.6e7f9c28-6ab6-4561-8336-8133d14badf5@github.com> References: <84FVULnDO38-jO9EcshCFwsOx9PRZ920y8KwTt5z0xU=.6b4b8adb-b046-481f-b121-dd1ffa0d7a78@github.com> <5rPWGuXUbiRbQ7TlwHJnQDjRYSuWRITbbo0OUtMfhrY=.cfe52d81-3450-477f-b5d8-02b3590971d0@github.com> <3KAsrKOGuC7XInNaIEmocMu2vX8RTL5dOJhwsnFnSPs=.6c5dc374-6807-416d-9c78-cd664cd04f6f@github.com> <_0yq0VEEVe1apROVSN5nckimKR_PQOurkq3Rc_qxCwo=.6e7f9c28-6ab6-4561-8336-8133d14badf5@github.com> Message-ID: On Fri, 1 Aug 2025 15:10:02 GMT, Roger Riggs wrote: >> I don't have a **JBS** account, and I don't like going through , so could someone else file said **RFE**? > > Created https://bugs.openjdk.org/browse/JDK-8364540 Thanks, Roger. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26517#discussion_r2248246788 From coleenp at openjdk.org Fri Aug 1 15:25:04 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 1 Aug 2025 15:25:04 GMT Subject: Integrated: 8364187: Make getClassAccessFlagsRaw non-native In-Reply-To: <84FVULnDO38-jO9EcshCFwsOx9PRZ920y8KwTt5z0xU=.6b4b8adb-b046-481f-b121-dd1ffa0d7a78@github.com> References: <84FVULnDO38-jO9EcshCFwsOx9PRZ920y8KwTt5z0xU=.6b4b8adb-b046-481f-b121-dd1ffa0d7a78@github.com> Message-ID: <2HVJZKZ8FwStZduaGH4jA-zQsd8VDH4VF_1lnO1lDlY=.790ad8b0-d8da-43c5-81d5-97387b6ddcad@github.com> On Mon, 28 Jul 2025 20:14:15 GMT, Coleen Phillimore wrote: > This change removes the intrinsic for getClassAccessFlagsRaw for reflection and initializes a rawAccessFlags field in java.lang.Class instead, that Java code can non-natively access. > Tested with tier1-4.
This pull request has now been integrated. Changeset: ee3665bc Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/ee3665bca026fe53409df8391d49477c64ae23a2 Stats: 128 lines in 16 files changed: 61 ins; 36 del; 31 mod 8364187: Make getClassAccessFlagsRaw non-native Reviewed-by: thartmann, rriggs, liach ------------- PR: https://git.openjdk.org/jdk/pull/26517 From coleenp at openjdk.org Fri Aug 1 15:25:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 1 Aug 2025 15:25:03 GMT Subject: RFR: 8364187: Make getClassAccessFlagsRaw non-native [v5] In-Reply-To: References: <84FVULnDO38-jO9EcshCFwsOx9PRZ920y8KwTt5z0xU=.6b4b8adb-b046-481f-b121-dd1ffa0d7a78@github.com> Message-ID: On Tue, 29 Jul 2025 14:35:06 GMT, Chen Liang wrote: >> This is a good suggestion but I made it Raw to match getRawClassAnnotations this name. > > Thanks for the rename. I think `raw annotations` means the uninterpreted byte data in the attribute is carried over raw; this concept is less applicable to the access_flags u2 value. Thank you for the suggested name change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26517#discussion_r2248245044 From shade at openjdk.org Fri Aug 1 15:37:53 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 1 Aug 2025 15:37:53 GMT Subject: RFR: 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 18:13:30 GMT, Coleen Phillimore wrote: > This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. > Tested with tier1-4 with the flag off, and 1-4 with the flag on. Marked as reviewed by shade (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/26579#pullrequestreview-3079681502 From ayang at openjdk.org Fri Aug 1 15:53:04 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 1 Aug 2025 15:53:04 GMT Subject: RFR: 8364519: Sort share/classfile includes [v2] In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 13:03:11 GMT, Francesco Andreuzzi wrote: >> This PR sorts the includes in hotspot/share/classfile using SortIncludes.java. I'm also adding the directory to TestIncludesAreSorted. >> >> Passes tier1. > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > remove redundant includes Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26590#pullrequestreview-3079734419 From duke at openjdk.org Fri Aug 1 16:04:54 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 1 Aug 2025 16:04:54 GMT Subject: RFR: 8364519: Sort share/classfile includes [v2] In-Reply-To: References: Message-ID: <7vOMHTGdPPZzFKxyspJufQsdfHwdFtrO_PUv9ihZatw=.0912751b-c695-4fa4-9b9b-c2544e6bdd58@github.com> On Fri, 1 Aug 2025 13:03:11 GMT, Francesco Andreuzzi wrote: >> This PR sorts the includes in hotspot/share/classfile using SortIncludes.java. I'm also adding the directory to TestIncludesAreSorted. >> >> Passes tier1. 
> > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > remove redundant includes The test failure does not seem to be related to the changes in this PR ------------- PR Comment: https://git.openjdk.org/jdk/pull/26590#issuecomment-3145054636 From duke at openjdk.org Fri Aug 1 16:04:55 2025 From: duke at openjdk.org (duke) Date: Fri, 1 Aug 2025 16:04:55 GMT Subject: RFR: 8364519: Sort share/classfile includes [v2] In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 13:03:11 GMT, Francesco Andreuzzi wrote: >> This PR sorts the includes in hotspot/share/classfile using SortIncludes.java. I'm also adding the directory to TestIncludesAreSorted. >> >> Passes tier1. > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > remove redundant includes @fandreuz Your change (at version 3477cbce67776546bbb0cbc0cf18302a6dab2c97) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26590#issuecomment-3145056176 From mablakatov at openjdk.org Fri Aug 1 16:05:42 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Fri, 1 Aug 2025 16:05:42 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v3] In-Reply-To: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: <_Mianizit7riu3uwfRRkJUUMa7CDFW-HPtbjHSYXbYI=.9f1e4dd5-d3a6-4e06-aedf-152f2224ca04@github.com> > Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. 
Instead, trampoline stubs should be shared based on the actual callee. > > The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. > > This has passed tier1-3 and jcstress testing on AArch64. Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Merge branch 'master' - Address review comments from @theRealAph and @eastig - use relocInfo::relocType instead of TrampolineCallKind - move rtype from keys to values; reduce the footprint of the patch - Lift the vm.opt.TieredCompilation == null requirement from the tests - Combine the two shared trampoline request hash tables - 8359359: AArch64: share trampolines between static calls to the same method Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. 
Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. ------------- Changes: https://git.openjdk.org/jdk/pull/25954/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25954&range=02 Stats: 431 lines in 9 files changed: 311 ins; 110 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/25954.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25954/head:pull/25954 PR: https://git.openjdk.org/jdk/pull/25954 From aph at openjdk.org Fri Aug 1 17:38:59 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 1 Aug 2025 17:38:59 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v3] In-Reply-To: <_Mianizit7riu3uwfRRkJUUMa7CDFW-HPtbjHSYXbYI=.9f1e4dd5-d3a6-4e06-aedf-152f2224ca04@github.com> References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> <_Mianizit7riu3uwfRRkJUUMa7CDFW-HPtbjHSYXbYI=.9f1e4dd5-d3a6-4e06-aedf-152f2224ca04@github.com> Message-ID: On Fri, 1 Aug 2025 16:05:42 GMT, Mikhail Ablakatov wrote: >> Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. >> >> The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. 
Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. >> >> This has passed tier1-3 and jcstress testing on AArch64. > > Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge branch 'master' > - Address review comments from @theRealAph and @eastig > - use relocInfo::relocType instead of TrampolineCallKind > - move rtype from keys to values; reduce the footprint of the patch > - Lift the vm.opt.TieredCompilation == null requirement from the tests > - Combine the two shared trampoline request hash tables > - 8359359: AArch64: share trampolines between static calls to the same method > > Modify the C2 compiler to share trampoline stubs between static calls > that resolve to the same callee method. Since the relocation target for > all static calls is initially set to the static call resolver stub, the > call's target alone cannot be used to distinguish between different > static method calls. Instead, trampoline stubs should be shared based > on the actual callee. > > The `SharedTrampolineTest.java` was designed to verify the sharing of > trampolines among static calls. However, due to imprecise log analysis, > the test currently passes even when trampolines are not shared. > Additionally, comments within the test suggest ambiguity regarding > whether it was intended to assess trampoline sharing for static calls > or runtime calls. To address these issues and eliminate ambiguity, this > patch renames and updates the existing test. 
Furthermore, a new test is > introduced, using the existing one as a foundation, to accurately > evaluate trampoline sharing for both static and runtime calls. src/hotspot/cpu/aarch64/codeBuffer_aarch64.hpp line 46: > 44: void share_sc_trampoline_for(const ciMethod *callee, int caller_offset) { > 45: share_trampoline_for(relocInfo::static_call_type, (address)callee, caller_offset); > 46: } I don't think we need these two methods. They're only called once. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2248517098
From erik.joelsson at oracle.com Fri Aug 1 17:54:56 2025 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Fri, 1 Aug 2025 10:54:56 -0700 Subject: hotspot jtreg tests take WARNING_CFLAGS_JDK_CONLY flags In-Reply-To: References: Message-ID: <6a386790-d220-4e7a-8fdb-f0ddaf421035@oracle.com> Not sure about intended. The 'JDK' variable naming was introduced to basically mean not Hotspot. Which flags should apply to tests isn't always clear. I don't think that flags currently assigned to variables with JVM in the name should apply to hotspot tests. I think we just treat all native tests the same here and give them the JDK flags. This area could probably still need some careful sorting out and cleanup. /Erik On 7/10/25 08:23, Baesken, Matthias wrote: > > When trying the GCC static analyzer (-fanalyzer flag) > > diff --git a/make/autoconf/flags-cflags.m4 b/make/autoconf/flags-cflags.m4 > > index e80d9a98957..9d1ae60047b 100644 > > --- a/make/autoconf/flags-cflags.m4 > > +++ b/make/autoconf/flags-cflags.m4 > > @@ -610,7 +610,9 @@ AC_DEFUN([FLAGS_SETUP_CFLAGS_HELPER], > > # CFLAGS WARNINGS STUFF > > # Set JVM_CFLAGS warning handling > > if test "x$TOOLCHAIN_TYPE" = xgcc; then > > - WARNING_CFLAGS_JDK_CONLY="$WARNINGS_ENABLE_ALL_CFLAGS" > > + # enable -fanalyzer (but better only for gcc12 + , and also only for C) > > + # too many strange / shaky fd leak warnings > > + WARNING_CFLAGS_JDK_CONLY="-fanalyzer -Wno-analyzer-fd-leak $WARNINGS_ENABLE_ALL_CFLAGS" > > WARNING_CFLAGS_JDK_CXXONLY="$WARNINGS_ENABLE_ALL_CXXFLAGS" > > WARNING_CFLAGS_JVM="$WARNINGS_ENABLE_ALL_CXXFLAGS" > > I noticed that the WARNING_CFLAGS_JDK_CONLY go into the hotspot jtreg tests, e.g.: > > /jdk/test/hotspot/jtreg/runtime/ErrorHandling/libTestDwarfHelper.h:46:6: error: dereference of NULL '0' [CWE-476] [-Werror=analyzer-null-dereference] > > 46 | *x = 34; // Crash > > | ~~~^~~~ > > 'dereference_null': event 1 > > | > > | 46 | *x = 34; // Crash > > | | ~~~^~~~ > > | | | > > | | (1) dereference of NULL '0' > > This might be intended but I was surprised that the HS C tests take WARNING_CFLAGS_JDK_CONLY! Is this intended or not? > > Best regards, Matthias
From erikj at openjdk.org Fri Aug 1 18:29:57 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Fri, 1 Aug 2025 18:29:57 GMT Subject: RFR: 8361950: Update to use jtreg 8 In-Reply-To: References: Message-ID: On Fri, 11 Jul 2025 09:05:40 GMT, Christian Stein wrote: > Please review the change to update to using jtreg 8. > > The primary change is to the `jib-profiles.js` file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the requiredVersion has been updated in the various `TEST.ROOT` files. Marked as reviewed by erikj (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/26261#pullrequestreview-3080178709 From mdoerr at openjdk.org Fri Aug 1 18:36:56 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 1 Aug 2025 18:36:56 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v2] In-Reply-To: References: Message-ID: On Thu, 24 Jul 2025 18:51:22 GMT, Dean Long wrote: >> This PR removes the recently added lock around set_guard_value, using instead Atomic::cmpxchg to atomically update bit-fields of the guard value. Further, it takes a fast-path that uses the previous direct store when at a safepoint. Combined, these changes should get us back to almost where we were before in terms of overhead. If necessary, we could go even further and allow make_not_entrant() to perform a direct byte store, leaving 24 bits for the guard value. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > remove NMethodEntryBarrier_lock Thanks for implementing it for PPC64! The instruction sequence needs to be modified (see below). I'd like to have https://github.com/TheRealMDoerr/jdk/commit/522b1ef2e75509d91ac18a1acd27275fc0305e8e, too. Should I file a separate RFE for that? src/hotspot/cpu/ppc/gc/shared/barrierSetAssembler_ppc.cpp line 195: > 193: // This is a compound instruction. Patching support is provided by NativeMovRegMem. > 194: // Actual patching is done in (platform-specific part of) BarrierSetNMethod. > 195: __ align(8); // align for atomic update We can't do this within this fixed size instruction sequence. But, it can be fixed like this: https://github.com/TheRealMDoerr/jdk/commit/a06b34468f5cb063892f92b66d058a8f444f05a1 ------------- Changes requested by mdoerr (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26399#pullrequestreview-3080187466 PR Review Comment: https://git.openjdk.org/jdk/pull/26399#discussion_r2248603769 From mdoerr at openjdk.org Fri Aug 1 18:46:55 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 1 Aug 2025 18:46:55 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v2] In-Reply-To: References: Message-ID: On Thu, 24 Jul 2025 18:51:22 GMT, Dean Long wrote: >> This PR removes the recently added lock around set_guard_value, using instead Atomic::cmpxchg to atomically update bit-fields of the guard value. Further, it takes a fast-path that uses the previous direct store when at a safepoint. Combined, these changes should get us back to almost where we were before in terms of overhead. If necessary, we could go even further and allow make_not_entrant() to perform a direct byte store, leaving 24 bits for the guard value. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > remove NMethodEntryBarrier_lock src/hotspot/share/gc/shared/barrierSetNMethod.cpp line 113: > 111: } > 112: > 113: MACOS_AARCH64_ONLY(ThreadWXEnable wx(WXWrite, Thread::current())); This looks also ok as alternative to my proposal. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26399#discussion_r2248631958 From coleenp at openjdk.org Fri Aug 1 19:35:55 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 1 Aug 2025 19:35:55 GMT Subject: RFR: 8363998: Implement Compressed Class Pointers for 32-bit [v5] In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 07:02:10 GMT, Thomas Stuefe wrote: >> We plan to remove the uncompressed Klass pointer mode (`-UseCompressedClassPointers`) soon. Uncompressed Klass pointers are deprecated for JDK 25, and are planned to be removed for JDK 26. 
For background and motivation, please refer to this discussion [1] and the description and comments in this deprecation CSR [2]. >> >> A significant roadblock for removing `-UseCompressedClassPointers` had been the ongoing existence of 32-bit. 32-bit relies on uncompressed Klass pointers. >> >> Now, some of us think that 32-bit ports are on their way out [3]. We already removed support for x86 with JEP 503 in JDK 25, but we still have arm32 and zero 32-bit. The usefulness of the latter is arguable, seeing that the build is broken more often than not. However, even if we do rid ourselves of 32-bit eventually, that won't happen quickly: further discussions are needed, and then we will need a proper JEP with a deprecation phase for at least one release, probably more. That is too slow - we'd like to get rid of `-UseCompressedClassPointers` a lot sooner than that. >> >> So we need to find a way to support `+UseCompressedClassPointers` on 32-bit. >> >> ------- >> >> This patch adds support for `+UseCompressedClassPointers` on 32-bit in a minimally invasive way that keeps technical debt low. The code is small, clearly marked, and can be removed cleanly once we completely remove 32-bit support in two or three releases. >> >> How to do this? Most is straightforward: pointers are 32-bit, so they are already "compressed". We set up narrow Klass encoding such that `base = NULL` and `shift = 0`; the entire 32-bit address space is therefore our encoding range. We now behave exactly as we would behave on 64-bit in "unscaled" encoding mode, with encoding and decoding being no-ops. >> >> We also don't change the object layout; header stays the same - Klass is stored in the second word of the header as before, regardless of whether you interpret the word as `narrowKlass` or `Klass*`. >> >> We also don't need to make many changes in terms of storage. 
Today, on 32-bit systems, Klass structures live inside the non-Klass metaspace regions, which are scattered arbitrarily throughout the address space. On 64-bit, we need a class space to confine Klass structures to the encoding range; here, where the encoding range encompasses the entire address space, we don't need a class space. Note that should we ever consider supporting some form of compact object headers on 32-bit - god forbid that ever happe... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 20 commits: > > - Fix some jtreg tests on arm > - fix arm gtests > - fix CompressedKlass* gtests on arm > - Merge master > - get rid of NEEDS_CLASS_SPACE define > - Merge branch 'master' into JDK-8363998-Implement-compressed-class-pointers-for-32-bit-platforms > - Merge branch 'master' into JDK-8363998-Implement-compressed-class-pointers-for-32-bit-platforms > - since to because > > Co-authored-by: Andrew Haley > - Remove stray include > - Update src/hotspot/share/oops/compressedKlass.cpp > > Co-authored-by: ExE Boss <3889017+ExE-Boss at users.noreply.github.com> > - ... and 10 more: https://git.openjdk.org/jdk/compare/559795b0...e4a0f42f Yes, this is better without NEEDS_CLASS_SPACE. Thanks. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26491#pullrequestreview-3080360029 From dlong at openjdk.org Fri Aug 1 20:09:11 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 1 Aug 2025 20:09:11 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v3] In-Reply-To: References: Message-ID: <_8Z-PXaFqGay1pHqcJmeWXrOFv4QQVqnJG2RuZ7rzTk=.34cc6ecb-e189-461c-971b-f59f899372f5@github.com> > This PR removes the recently added lock around set_guard_value, using instead Atomic::cmpxchg to atomically update bit-fields of the guard value. 
Further, it takes a fast-path that uses the previous direct store when at a safepoint. Combined, these changes should get us back to almost where we were before in terms of overhead. If necessary, we could go even further and allow make_not_entrant() to perform a direct byte store, leaving 24 bits for the guard value. Dean Long has updated the pull request incrementally with one additional commit since the last revision: Fix PPC64 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26399/files - new: https://git.openjdk.org/jdk/pull/26399/files/e05605eb..a06b3446 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26399&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26399&range=01-02 Stats: 24 lines in 2 files changed: 11 ins; 10 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/26399.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26399/head:pull/26399 PR: https://git.openjdk.org/jdk/pull/26399 From mdoerr at openjdk.org Fri Aug 1 20:09:11 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 1 Aug 2025 20:09:11 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v3] In-Reply-To: <_8Z-PXaFqGay1pHqcJmeWXrOFv4QQVqnJG2RuZ7rzTk=.34cc6ecb-e189-461c-971b-f59f899372f5@github.com> References: <_8Z-PXaFqGay1pHqcJmeWXrOFv4QQVqnJG2RuZ7rzTk=.34cc6ecb-e189-461c-971b-f59f899372f5@github.com> Message-ID: On Fri, 1 Aug 2025 20:05:56 GMT, Dean Long wrote: >> This PR removes the recently added lock around set_guard_value, using instead Atomic::cmpxchg to atomically update bit-fields of the guard value. Further, it takes a fast-path that uses the previous direct store when at a safepoint. Combined, these changes should get us back to almost where we were before in terms of overhead. If necessary, we could go even further and allow make_not_entrant() to perform a direct byte store, leaving 24 bits for the guard value. 
> > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > Fix PPC64 Thanks for integrating my fix! ------------- PR Review: https://git.openjdk.org/jdk/pull/26399#pullrequestreview-3080433000 From mdoerr at openjdk.org Fri Aug 1 21:58:56 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 1 Aug 2025 21:58:56 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v3] In-Reply-To: <_8Z-PXaFqGay1pHqcJmeWXrOFv4QQVqnJG2RuZ7rzTk=.34cc6ecb-e189-461c-971b-f59f899372f5@github.com> References: <_8Z-PXaFqGay1pHqcJmeWXrOFv4QQVqnJG2RuZ7rzTk=.34cc6ecb-e189-461c-971b-f59f899372f5@github.com> Message-ID: On Fri, 1 Aug 2025 20:09:11 GMT, Dean Long wrote: >> This PR removes the recently added lock around set_guard_value, using instead Atomic::cmpxchg to atomically update bit-fields of the guard value. Further, it takes a fast-path that uses the previous direct store when at a safepoint. Combined, these changes should get us back to almost where we were before in terms of overhead. If necessary, we could go even further and allow make_not_entrant() to perform a direct byte store, leaving 24 bits for the guard value. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > Fix PPC64 PPC64 code looks correct, now, but I have minor proposals. src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp line 84: > 82: nativeMovRegMem_at(new_mov_instr.buf)->set_offset(new_value, false /* no icache flush */); > 83: // Swap in the new value > 84: uint64_t v = Atomic::cmpxchg(instr, old_mov_instr.u64, new_mov_instr.u64, memory_order_release); We have `OrderAccess::release()` above, so `memory_order_release` looks redundant. Shouldn't we use `memory_order_relaxed`, here? 
src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp line 88: > 86: old_mov_instr.u64 = v; > 87: } > 88: ICache::ppc64_flush_icache_bytes(addr_at(0), NativeMovRegMem::instruction_size); Maybe only use flushing if `cmpxchg` succeeded? Otherwise, we didn't modify the code. ------------- PR Review: https://git.openjdk.org/jdk/pull/26399#pullrequestreview-3080627303 PR Review Comment: https://git.openjdk.org/jdk/pull/26399#discussion_r2248909656 PR Review Comment: https://git.openjdk.org/jdk/pull/26399#discussion_r2248925985 From dholmes at openjdk.org Sat Aug 2 01:49:59 2025 From: dholmes at openjdk.org (David Holmes) Date: Sat, 2 Aug 2025 01:49:59 GMT Subject: RFR: 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 18:13:30 GMT, Coleen Phillimore wrote: > This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. > Tested with tier1-4 with the flag off, and 1-4 with the flag on. Looks fine for an added safety-guard. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26579#pullrequestreview-3080841140 From stuefe at openjdk.org Sat Aug 2 05:51:56 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 2 Aug 2025 05:51:56 GMT Subject: RFR: 8363998: Implement Compressed Class Pointers for 32-bit [v5] In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 19:33:22 GMT, Coleen Phillimore wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 20 commits: >> >> - Fix some jtreg tests on arm >> - fix arm gtests >> - fix CompressedKlass* gtests on arm >> - Merge master >> - get rid of NEEDS_CLASS_SPACE define >> - Merge branch 'master' into JDK-8363998-Implement-compressed-class-pointers-for-32-bit-platforms >> - Merge branch 'master' into JDK-8363998-Implement-compressed-class-pointers-for-32-bit-platforms >> - since to because >> >> Co-authored-by: Andrew Haley >> - Remove stray include >> - Update src/hotspot/share/oops/compressedKlass.cpp >> >> Co-authored-by: ExE Boss <3889017+ExE-Boss at users.noreply.github.com> >> - ... and 10 more: https://git.openjdk.org/jdk/compare/559795b0...e4a0f42f > > Yes, this is better without NEEDS_CLASS_SPACE. Thanks. Thanks @coleenp ! May I have a second approval, please? ------------- PR Comment: https://git.openjdk.org/jdk/pull/26491#issuecomment-3146247662 From rkennke at openjdk.org Sat Aug 2 06:30:04 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Sat, 2 Aug 2025 06:30:04 GMT Subject: RFR: 8363998: Implement Compressed Class Pointers for 32-bit [v5] In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 07:02:10 GMT, Thomas Stuefe wrote: >> We plan to remove the uncompressed Klass pointer mode (`-UseCompressedClassPointers`) soon. Uncompressed Klass pointers are deprecated for JDK 25, and are planned to be removed for JDK 26. For background and motivation, please refer to this discussion [1] and the description and comments in this deprecation CSR [2]. >> >> A significant roadblock for removing `-UseCompressedClassPointers` had been the ongoing existence of 32-bit. 32-bit relies on uncompressed Klass pointers. >> >> Now, some of us think that 32-bit ports are on their way out [3]. We already removed support for x86 with JEP 503 in JDK 25, but we still have arm32 and zero 32-bit. The usefulness of the latter is arguable, seeing that the build is broken more often than not. 
However, even if we do rid ourselves of 32-bit eventually, that won't happen quickly: further discussions are needed, and then we will need a proper JEP with a deprecation phase for at least one release, probably more. That is too slow - we'd like to get rid of `-UseCompressedClassPointers` a lot sooner than that. >> >> So we need to find a way to support `+UseCompressedClassPointers` on 32-bit. >> >> ------- >> >> This patch adds support for `+UseCompressedClassPointers` on 32-bit in a minimally invasive way that keeps technical debt low. The code is small, clearly marked, and can be removed cleanly once we completely remove 32-bit support in two or three releases. >> >> How to do this? Most is straightforward: pointers are 32-bit, so they are already "compressed". We set up narrow Klass encoding such that `base = NULL` and `shift = 0`; the entire 32-bit address space is therefore our encoding range. We now behave exactly as we would behave on 64-bit in "unscaled" encoding mode, with encoding and decoding being no-ops. >> >> We also don't change the object layout; header stays the same - Klass is stored in the second word of the header as before, regardless of whether you interpret the word as `narrowKlass` or `Klass*`. >> >> We also don't need to make many changes in terms of storage. Today, on 32-bit systems, Klass structures live inside the non-Klass metaspace regions, which are scattered arbitrarily throughout the address space. On 64-bit, we need a class space to confine Klass structures to the encoding range; here, where the encoding range encompasses the entire address space, we don't need a class space. Note that should we ever consider supporting some form of compact object headers on 32-bit - god forbid that ever happe... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 20 commits: > > - Fix some jtreg tests on arm > - fix arm gtests > - fix CompressedKlass* gtests on arm > - Merge master > - get rid of NEEDS_CLASS_SPACE define > - Merge branch 'master' into JDK-8363998-Implement-compressed-class-pointers-for-32-bit-platforms > - Merge branch 'master' into JDK-8363998-Implement-compressed-class-pointers-for-32-bit-platforms > - since to because > > Co-authored-by: Andrew Haley > - Remove stray include > - Update src/hotspot/share/oops/compressedKlass.cpp > > Co-authored-by: ExE Boss <3889017+ExE-Boss at users.noreply.github.com> > - ... and 10 more: https://git.openjdk.org/jdk/compare/559795b0...e4a0f42f Looks good to me! Thank you! ------------- Marked as reviewed by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26491#pullrequestreview-3080936692 From stuefe at openjdk.org Sun Aug 3 06:46:03 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sun, 3 Aug 2025 06:46:03 GMT Subject: RFR: 8363998: Implement Compressed Class Pointers for 32-bit [v5] In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 07:02:10 GMT, Thomas Stuefe wrote: >> We plan to remove the uncompressed Klass pointer mode (`-UseCompressedClassPointers`) soon. Uncompressed Klass pointers are deprecated for JDK 25, and are planned to be removed for JDK 26. For background and motivation, please refer to this discussion [1] and the description and comments in this deprecation CSR [2]. >> >> A significant roadblock for removing `-UseCompressedClassPointers` had been the ongoing existence of 32-bit. 32-bit relies on uncompressed Klass pointers. >> >> Now, some of us think that 32-bit ports are on their way out [3]. We already removed support for x86 with JEP 503 in JDK 25, but we still have arm32 and zero 32-bit. The usefulness of the latter is arguable, seeing that the build is broken more often than not. 
However, even if we do rid ourselves of 32-bit eventually, that won't happen quickly: further discussions are needed, and then we will need a proper JEP with a deprecation phase for at least one release, probably more. That is too slow - we'd like to get rid of `-UseCompressedClassPointers` a lot sooner than that. >> >> So we need to find a way to support `+UseCompressedClassPointers` on 32-bit. >> >> ------- >> >> This patch adds support for `+UseCompressedClassPointers` on 32-bit in a minimally invasive way that keeps technical debt low. The code is small, clearly marked, and can be removed cleanly once we completely remove 32-bit support in two or three releases. >> >> How to do this? Most is straightforward: pointers are 32-bit, so they are already "compressed". We set up narrow Klass encoding such that `base = NULL` and `shift = 0`; the entire 32-bit address space is therefore our encoding range. We now behave exactly as we would behave on 64-bit in "unscaled" encoding mode, with encoding and decoding being no-ops. >> >> We also don't change the object layout; header stays the same - Klass is stored in the second word of the header as before, regardless of whether you interpret the word as `narrowKlass` or `Klass*`. >> >> We also don't need to make many changes in terms of storage. Today, on 32-bit systems, Klass structures live inside the non-Klass metaspace regions, which are scattered arbitrarily throughout the address space. On 64-bit, we need a class space to confine Klass structures to the encoding range; here, where the encoding range encompasses the entire address space, we don't need a class space. Note that should we ever consider supporting some form of compact object headers on 32-bit - god forbid that ever happe... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 20 commits: > > - Fix some jtreg tests on arm > - fix arm gtests > - fix CompressedKlass* gtests on arm > - Merge master > - get rid of NEEDS_CLASS_SPACE define > - Merge branch 'master' into JDK-8363998-Implement-compressed-class-pointers-for-32-bit-platforms > - Merge branch 'master' into JDK-8363998-Implement-compressed-class-pointers-for-32-bit-platforms > - since to because > > Co-authored-by: Andrew Haley > - Remove stray include > - Update src/hotspot/share/oops/compressedKlass.cpp > > Co-authored-by: ExE Boss <3889017+ExE-Boss at users.noreply.github.com> > - ... and 10 more: https://git.openjdk.org/jdk/compare/559795b0...e4a0f42f Thanks Roman! ------------- PR Comment: https://git.openjdk.org/jdk/pull/26491#issuecomment-3147510019 From stuefe at openjdk.org Sun Aug 3 06:46:04 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sun, 3 Aug 2025 06:46:04 GMT Subject: Integrated: 8363998: Implement Compressed Class Pointers for 32-bit In-Reply-To: References: Message-ID: On Sat, 26 Jul 2025 12:17:50 GMT, Thomas Stuefe wrote: > We plan to remove the uncompressed Klass pointer mode (`-UseCompressedClassPointers`) soon. Uncompressed Klass pointers are deprecated for JDK 25, and are planned to be removed for JDK 26. For background and motivation, please refer to this discussion [1] and the description and comments in this deprecation CSR [2]. > > A significant roadblock for removing `-UseCompressedClassPointers` had been the ongoing existence of 32-bit. 32-bit relies on uncompressed Klass pointers. > > Now, some of us think that 32-bit ports are on their way out [3]. We already removed support for x86 with JEP 503 in JDK 25, but we still have arm32 and zero 32-bit. The usefulness of the latter is arguable, seeing that the build is broken more often than not. 
However, even if we do rid ourselves of 32-bit eventually, that won't happen quickly: further discussions are needed, and then we will need a proper JEP with a deprecation phase for at least one release, probably more. That is too slow - we'd like to get rid of `-UseCompressedClassPointers` a lot sooner than that. > > So we need to find a way to support `+UseCompressedClassPointers` on 32-bit. > > ------- > > This patch adds support for `+UseCompressedClassPointers` on 32-bit in a minimally invasive way that keeps technical debt low. The code is small, clearly marked, and can be removed cleanly once we completely remove 32-bit support in two or three releases. > > How to do this? Most is straightforward: pointers are 32-bit, so they are already "compressed". We set up narrow Klass encoding such that `base = NULL` and `shift = 0`; the entire 32-bit address space is therefore our encoding range. We now behave exactly as we would behave on 64-bit in "unscaled" encoding mode, with encoding and decoding being no-ops. > > We also don't change the object layout; header stays the same - Klass is stored in the second word of the header as before, regardless of whether you interpret the word as `narrowKlass` or `Klass*`. > > We also don't need to make many changes in terms of storage. Today, on 32-bit systems, Klass structures live inside the non-Klass metaspace regions, which are scattered arbitrarily throughout the address space. On 64-bit, we need a class space to confine Klass structures to the encoding range; here, where the encoding range encompasses the entire address space, we don't need a class space. Note that should we ever consider supporting some form of compact object headers on 32-bit - god forbid that ever happens - we need to revise this design... This pull request has now been integrated. 
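The "unscaled" encoding described in the pull request text above (`base = NULL`, `shift = 0`) can be sketched as follows. This is an illustration only: `KlassEncoding`, `narrow_klass_t`, `encode`, and `decode` are hypothetical names, not HotSpot's actual types or functions.

```cpp
#include <cstdint>

// Hypothetical stand-in for the narrow Klass encoding parameters.
struct KlassEncoding {
  uintptr_t base;   // start of the encoding range
  unsigned  shift;  // number of low bits dropped when encoding
};

using narrow_klass_t = uint32_t;  // illustrative alias, not HotSpot's narrowKlass

// Encode: subtract the base, then drop the low `shift` bits.
inline narrow_klass_t encode(const KlassEncoding& e, uintptr_t klass_addr) {
  return static_cast<narrow_klass_t>((klass_addr - e.base) >> e.shift);
}

// Decode: restore the low bits, then add the base back.
inline uintptr_t decode(const KlassEncoding& e, narrow_klass_t nk) {
  return e.base + (static_cast<uintptr_t>(nk) << e.shift);
}
```

With `base = 0` and `shift = 0`, both functions return their input unchanged, which is why a 32-bit address can be stored directly in the header word whether it is read as a narrow value or as a pointer.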
Changeset: 819de071 Author: Thomas Stuefe URL: https://git.openjdk.org/jdk/commit/819de071176623448ceba8065ed6f2aac40ae193 Stats: 196 lines in 17 files changed: 123 ins; 45 del; 28 mod 8363998: Implement Compressed Class Pointers for 32-bit Reviewed-by: rkennke, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/26491 From yzheng at openjdk.org Sun Aug 3 20:56:02 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Sun, 3 Aug 2025 20:56:02 GMT Subject: RFR: 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 18:13:30 GMT, Coleen Phillimore wrote: > This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. > Tested with tier1-4 with the flag off, and 1-4 with the flag on. Could you please apply the following patch?

diff --git a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotMetaspaceConstantImpl.java b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotMetaspaceConstantImpl.java
index 55d5492c2f8..fd08361f682 100644
--- a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotMetaspaceConstantImpl.java
+++ b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotMetaspaceConstantImpl.java
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2014, 2024, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2014, 2025, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -95,7 +95,7 @@ public boolean isCompressible() {
     }
 
     private boolean canBeStoredInCompressibleMetaSpace() {
-        if (metaspaceObject instanceof HotSpotResolvedJavaType t && !t.isArray()) {
+        if (!HotSpotVMConfig.config().useClassMetaspaceForAllClasses && metaspaceObject instanceof HotSpotResolvedJavaType t && !t.isArray()) {
             // As of JDK-8338526, interface and abstract types are not stored
             // in compressible metaspace.
             return !t.isInterface() && !t.isAbstract();
diff --git a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotVMConfig.java b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotVMConfig.java
index bc2e121fe90..449b315e467 100644
--- a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotVMConfig.java
+++ b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotVMConfig.java
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2011, 2024, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2011, 2025, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -67,6 +67,8 @@ static String getHostArchitectureName() {
 
     final boolean useCompressedOops = getFlag("UseCompressedOops", Boolean.class);
 
+    final boolean useClassMetaspaceForAllClasses = getFlag("UseClassMetaspaceForAllClasses", Boolean.class);
+
     final int objectAlignment = getFlag("ObjectAlignmentInBytes", Integer.class);
 
     final int klassOffsetInBytes = getFieldValue("CompilerToVM::Data::oopDesc_klass_offset_in_bytes", Integer.class, "int");

------------- PR Comment: https://git.openjdk.org/jdk/pull/26579#issuecomment-3148692596 From liach at openjdk.org Mon Aug 4 00:24:15 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 4 Aug 2025 00:24:15 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs Message-ID: Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value).
------------- Commit messages: - Years - Roll back Objects.rNN for now - Review remarks - Merge branch 'master' of https://github.com/openjdk/jdk into exp/requireNonNull-message-hacks - Bork - Try generify the NPE tracing API - Merge branch 'master' of https://github.com/openjdk/jdk into exp/requireNonNull-message-hacks - Perf concerns - Tests, MPs based prints - Objects rNN better message Changes: https://git.openjdk.org/jdk/pull/26600/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26600&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8364588 Stats: 390 lines in 12 files changed: 335 ins; 13 del; 42 mod Patch: https://git.openjdk.org/jdk/pull/26600.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26600/head:pull/26600 PR: https://git.openjdk.org/jdk/pull/26600 From liach at openjdk.org Mon Aug 4 00:24:15 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 4 Aug 2025 00:24:15 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 15:49:57 GMT, Chen Liang wrote: > Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value). Thanks for the preliminary review. After some thinking, to avoid wasting review cycles, I am pulling the common infrastructure for requireNonNull and Checks::nullCheck into a JDK-internal API, and the changes here are mostly confined to runtime. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26600#issuecomment-3148805289 From dholmes at openjdk.org Mon Aug 4 00:24:15 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 4 Aug 2025 00:24:15 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 15:49:57 GMT, Chen Liang wrote: > Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value). src/hotspot/share/classfile/javaClasses.cpp line 3060: > 3058: } > 3059: > 3060: bool java_lang_Throwable::get_method_and_bci(oop throwable, Method** method, int* bci, int depth, bool hidden_ok) { Suggestion: bool java_lang_Throwable::get_method_and_bci(oop throwable, Method** method, int* bci, int depth, bool allow_hidden) { src/hotspot/share/interpreter/bytecodeUtils.hpp line 40: > 38: static bool get_NPE_message_at(outputStream* ss, Method* method, int bci); > 39: // NPE extended message. Return true if string is printed. > 40: static bool get_NPE_message_at(outputStream* ss, Method* method, int bci, int slot); Can you combine these with a default slot parameter (-1?) ? I'm suspecting not even though it might appear that once you have the slot you could call the new method, but the new method seems to expect only particular "kinds" of slot i.e. non-bytecode. But this is not documented at all. Please add comments to make it clear what the `slot` parameter is expected to be and how these two methods differ. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2248868942 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2248881502 From liach at openjdk.org Mon Aug 4 00:24:15 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 4 Aug 2025 00:24:15 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 21:05:53 GMT, David Holmes wrote: >> Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value). > > src/hotspot/share/classfile/javaClasses.cpp line 3060: > >> 3058: } >> 3059: >> 3060: bool java_lang_Throwable::get_method_and_bci(oop throwable, Method** method, int* bci, int depth, bool hidden_ok) { > > Suggestion: > > bool java_lang_Throwable::get_method_and_bci(oop throwable, Method** method, int* bci, int depth, bool allow_hidden) { Done. > src/hotspot/share/interpreter/bytecodeUtils.hpp line 40: > >> 38: static bool get_NPE_message_at(outputStream* ss, Method* method, int bci); >> 39: // NPE extended message. Return true if string is printed. >> 40: static bool get_NPE_message_at(outputStream* ss, Method* method, int bci, int slot); > > Can you combine these with a default slot parameter (-1?) ? I'm suspecting not even though it might appear that once you have the slot you could call the new method, but the new method seems to expect only particular "kinds" of slot i.e. non-bytecode. But this is not documented at all. Please add comments to make it clear what the `slot` parameter is expected to be and how these two methods differ. Done, and documented. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2250200947 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2250201005 From david.holmes at oracle.com Mon Aug 4 02:20:38 2025 From: david.holmes at oracle.com (David Holmes) Date: Mon, 4 Aug 2025 12:20:38 +1000 Subject: RFC: 8347707: Standardise the use of os::snprintf and os::snprintf_checked In-Reply-To: <281f2b83-30ad-46c2-9a5b-c445d57d3784@oracle.com> References: <281f2b83-30ad-46c2-9a5b-c445d57d3784@oracle.com> Message-ID: <4384da3a-26af-441e-a1b1-ba33dbe2b90f@oracle.com> Seems there is no real interest in a pre-discussion so I will make the PR non-draft. David On 25/07/2025 11:54 am, David Holmes wrote: > This is a proposal to standardize on the use of os::snprintf and > os::snprintf_checked across the hotspot code base, and to disallow use > of the C library variants. (It does not touch use of jio_printf at all.) > > From: https://bugs.openjdk.org/browse/JDK-8347707 > > The platform `snprintf/vsnprintf` returns -1 on error, else if the > buffer is large enough returns the number of bytes written (excluding > the null byte), else (buffer is too small) the number of characters > (excluding the terminating null byte) which would have been written to > the final string if enough space had been available. Thus, a return > value of size or more means that the output was truncated. > > To provide a consistent approach to error handling and truncation > management, we provide `os::xxx` wrapper functions as described below > and forbid the use of the library `::vsnprintf` and `::snprintf`. 
>
> The potential errors are, generally speaking, not something we should encounter in our own well-written code:
>
> - encoding error: not applicable as we are not using extended character sets
> - invalid parameters (null buffers, specifying a limit greater than the size of the buffer [Windows], things of this nature)
> - malformed formatting directives
> - overflow error (POSIX) if the required buffer size exceeds INT_MAX (as we return `int`).
>
> As these should simply never occur, we handle the checks for -1 at the lowest level (`os::vsnprintf`) with an assertion, and accompanying precondition assertions.
>
> The potential clients of this API then fall into a number of camps:
>
> 1. Those who have sized their buffer correctly, don't need the return value for subsequent use, and for whom truncation (if it were possible) would be a programming error.
>
> For these clients we have `void os::snprintf_checked` - which returns nothing and asserts on truncation.
>
> 2. Those who have sized their buffer correctly, but do need the return value for subsequent operations (e.g. chains of `snprintf` where you advance the buffer pointer based on previous writes), but again for whom truncation should never happen.
>
> For these clients we have `os::snprintf`, but they have to add their own assertion for no truncation.
>
> 3. Those who present a buffer but know that truncation is a possibility, but don't need to do anything about it themselves, and for whom the return value is of no use.
>
> These clients also use `os::snprintf_checked`. The truncation assertion can be useful for guiding buffer sizing decisions, but in product mode truncation is not an error.
>
> 4. Those who present a buffer but know that truncation is a possibility, and either need to handle it themselves, or else need to use the return value in subsequent operations.
>
> These clients are also directed to use `os::snprintf`.
>
> In summary we provide the following API:
> - `[[nodiscard]] int os::vsnprintf` is the building block for the other methods, it:
>   - asserts on precondition failures
>   - asserts on error
>   - guarantees null-termination in the case of unexpected errors (as the standards are still unclear on that point)
>   - is declared `[[nodiscard]]` so that callers cannot ignore the return value (they can explicitly cast to `void` to indicate they don't need it)
> - `void os::snprintf_checked`
>   - calls `os::vsnprintf` so asserts on errors
>   - asserts on truncation
> - `[[nodiscard]] int os::snprintf`
>   - calls `os::vsnprintf` so asserts on errors
>
> In terms of the effects on the existing code we:
> - Change callers of `::snprintf`/`os::snprintf` that ignore the return value and ensure the buffer is large enough to use `os::snprintf_checked`
>   - those that allow truncation to happen must use `os::snprintf`.
> - Change all callers of `::snprintf`/`os::snprintf` that use the return value to use `os::snprintf`, plus any additional assertions needed
> - Change the 9 callers of `os::snprintf_checked` that do use the return value, to use `os::snprintf` with their own assertions added
> - Callers of `os::vsnprintf` are adjusted as needed
>
> There is a draft PR comprising multiple dependent commits so that you can view things in stages. There has been a suggestion to integrate this in a staged way under different JBS issues to make reviewing easier. There are 46 modified files. The bulk of changes replace calls to snprintf/os::snprintf with calls to os::snprintf_checked.
>
> https://github.com/openjdk/jdk/pull/26470
>
> Comments welcomed.
> > Thanks, > David > From dholmes at openjdk.org Mon Aug 4 02:25:37 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 4 Aug 2025 02:25:37 GMT Subject: RFR: 8347707: Standardise the use of os::snprintf and os::snprintf_checked Message-ID: TBD ------------- Commit messages: - Make os::snprintf "nodiscard" and adjust final callsites - Convert os::snprintf() to os::snprintf_checked() where appropriate. - Forbid C library snprintf - Change return-value using snprintf to explicit os::snprintf - Change raw snprintf to os::snprintf_checked, whereever truncation would not - Change os::snprintf_checked to be void function. - Mark os::vsnprintf as nodiscard and update the callsites. Changes: https://git.openjdk.org/jdk/pull/26470/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26470&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347707 Stats: 197 lines in 46 files changed: 15 ins; 5 del; 177 mod Patch: https://git.openjdk.org/jdk/pull/26470.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26470/head:pull/26470 PR: https://git.openjdk.org/jdk/pull/26470 From jjiang at openjdk.org Mon Aug 4 07:43:35 2025 From: jjiang at openjdk.org (John Jiang) Date: Mon, 4 Aug 2025 07:43:35 GMT Subject: RFR: 8364597: Replace THL A29 Limited with Tencent Message-ID: `THL A29 Limited` was a `Tencent` company but was dissolved. So, the copyright notes just use `Tencent` directly. 
------------- Commit messages: - 8364597: Replace THL A29 Limited with Tencent Changes: https://git.openjdk.org/jdk/pull/26614/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26614&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8364597 Stats: 67 lines in 58 files changed: 0 ins; 9 del; 58 mod Patch: https://git.openjdk.org/jdk/pull/26614.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26614/head:pull/26614 PR: https://git.openjdk.org/jdk/pull/26614 From dholmes at openjdk.org Mon Aug 4 07:44:06 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 4 Aug 2025 07:44:06 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v23] In-Reply-To: <-SbK2XJEVxUmf9O-3-mnb3-ULRDnmxT-U3RUs_LDdVQ=.1e31ff6e-3455-4cbe-b323-cdecb94fe08e@github.com> References: <-SbK2XJEVxUmf9O-3-mnb3-ULRDnmxT-U3RUs_LDdVQ=.1e31ff6e-3455-4cbe-b323-cdecb94fe08e@github.com> Message-ID: On Fri, 1 Aug 2025 09:19:55 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value types for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static bool available_memory(size_t& value); >> static julong used_memory(); --> static bool used_memory(size_t& value); >> static julong free_memory(); --> static bool free_memory(size_t& value); >> static jlong total_swap_space(); --> static bool total_swap_space(size_t& value); >> static jlong free_swap_space(); --> static bool free_swap_space(size_t& value); >> static julong physical_memory(); --> static size_t physical_memory(size_t& value); >> >> >> The return boolean value, where available, indicates success, whereas the actual value is assigned to the input argument. The following recommended usage pattern is introduced: where applicable, an unsuccessful call is logged. >> >> `ATTRIBUTE_NODISCARD` macro is added as a placeholder for `[[nodiscard]]`, which will be available with C++17. >> >> Tested in GHA and Tiers 1-5.
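The calling pattern described in the quoted PR text (a boolean result for success, the value delivered through a reference parameter, and a log message on failure) might look roughly like this. Everything here is a sketch under assumptions: `query_available_memory`, its `simulate_failure` parameter, the made-up memory figure, and the fallback behaviour are all hypothetical and not part of the PR.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical query in the style of `static bool available_memory(size_t& value)`.
// `simulate_failure` exists only so that the sketch can exercise both paths.
[[nodiscard]] inline bool query_available_memory(size_t& value, bool simulate_failure) {
  if (simulate_failure) {
    return false;  // value is left untouched on failure
  }
  value = static_cast<size_t>(512) * 1024 * 1024;  // made-up figure for illustration
  return true;
}

// Caller pattern: check the result, log an unsuccessful call, and fall back
// to a caller-supplied default (the fallback is an assumption of this sketch).
inline size_t available_memory_or(size_t fallback, bool simulate_failure) {
  size_t value = 0;
  if (!query_available_memory(value, simulate_failure)) {
    std::fprintf(stderr, "available memory query failed; using fallback\n");
    return fallback;
  }
  return value;
}
```

Marking the query `[[nodiscard]]` means a caller that ignores the success flag gets a compiler warning, which is the role the `ATTRIBUTE_NODISCARD` placeholder is meant to fill.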
> > Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: > > - 8357086: Fixed merge conflict > - 8357086: Removed extra line > - 8357086: Made physical_memory() return size_t, added dummy ATTRIBUTE_NODISCARD > - 8357086: Small fixes > - 8357086: Made physical_memory() return void > - 8357086: Fixed behavior of total_swap_space on Linux > - 8357086: Addressed reviewer's comments. > - 8357086: Fixed void conversion. > - 8357086: Addressed reviewer's comments > - 8357086: Addressed reviewer's comments > - ... and 14 more: https://git.openjdk.org/jdk/compare/d80b5c87...f9b2c6d8 src/hotspot/share/runtime/arguments.cpp line 1507: > 1505: void Arguments::set_heap_size() { > 1506: julong phys_mem; > 1507: size_t physical_mem_val = 0; I'm not seeing why you needed to introduce this as a temporary. ?? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2250670601 From duke at openjdk.org Mon Aug 4 08:23:00 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Mon, 4 Aug 2025 08:23:00 GMT Subject: Integrated: 8364519: Sort share/classfile includes In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 10:03:39 GMT, Francesco Andreuzzi wrote: > This PR sorts the includes in hotspot/share/classfile using SortIncludes.java. I'm also adding the directory to TestIncludesAreSorted. > > Passes tier1. This pull request has now been integrated. 
Changeset: 3387b319 Author: Francesco Andreuzzi Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/3387b3195c8f2a9faa3c93322f6e11ce2aad3e2b Stats: 48 lines in 20 files changed: 21 ins; 26 del; 1 mod 8364519: Sort share/classfile includes Reviewed-by: shade, ayang ------------- PR: https://git.openjdk.org/jdk/pull/26590 From duke at openjdk.org Mon Aug 4 08:43:48 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 4 Aug 2025 08:43:48 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v24] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value types for functions which return memory are changed, namely: > > > static julong available_memory(); --> static bool available_memory(size_t& value); > static julong used_memory(); --> static bool used_memory(size_t& value); > static julong free_memory(); --> static bool free_memory(size_t& value); > static jlong total_swap_space(); --> static bool total_swap_space(size_t& value); > static jlong free_swap_space(); --> static bool free_swap_space(size_t& value); > static julong physical_memory(); --> static size_t physical_memory(size_t& value); > > > The return boolean value, where available, indicates success, whereas the actual value is assigned to the input argument. The following recommended usage pattern is introduced: where applicable, an unsuccessful call is logged. > > `ATTRIBUTE_NODISCARD` macro is added as a placeholder for `[[nodiscard]]`, which will be available with C++17. > > Tested in GHA and Tiers 1-5. Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 26 commits: - 8357086: Addressed reviewer's comments - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs - 8357086: Fixed merge conflict - 8357086: Removed extra line - 8357086: Made physical_memory() return size_t, added dummy ATTRIBUTE_NODISCARD - 8357086: Small fixes - 8357086: Made physical_memory() return void - 8357086: Fixed behavior of total_swap_space on Linux - 8357086: Addressed reviewer's comments. - 8357086: Fixed void conversion. - ... and 16 more: https://git.openjdk.org/jdk/compare/57553ca1...a3898f43 ------------- Changes: https://git.openjdk.org/jdk/pull/25450/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=23 Stats: 250 lines in 23 files changed: 89 ins; 2 del; 159 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From mgronlun at openjdk.org Mon Aug 4 09:58:36 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 4 Aug 2025 09:58:36 GMT Subject: [jdk25] RFR: 8364258: ThreadGroup constant pool serialization is not normalized Message-ID: 8364258: ThreadGroup constant pool serialization is not normalized ------------- Commit messages: - Backport 3bc449797eb59f9770d2a06d260b23b6efd5ff0f Changes: https://git.openjdk.org/jdk/pull/26618/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26618&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8364258 Stats: 938 lines in 12 files changed: 431 ins; 482 del; 25 mod Patch: https://git.openjdk.org/jdk/pull/26618.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26618/head:pull/26618 PR: https://git.openjdk.org/jdk/pull/26618 From ayang at openjdk.org Mon Aug 4 13:49:14 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 4 Aug 2025 13:49:14 GMT Subject: RFR: 8364254: Serial: Remove soft ref policy update in WhiteBox FullGC [v2] In-Reply-To: References:
Message-ID: > Simple removing Serial specific logic in `WB_FullGC`, because Serial full-gc handles white-box full-gc already. > > Test: tier1-3 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge branch 'master' into sgc-whitebox - sgc-whitebox ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26528/files - new: https://git.openjdk.org/jdk/pull/26528/files/d95ce7ed..7be7d3de Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26528&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26528&range=00-01 Stats: 13614 lines in 375 files changed: 8956 ins; 3887 del; 771 mod Patch: https://git.openjdk.org/jdk/pull/26528.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26528/head:pull/26528 PR: https://git.openjdk.org/jdk/pull/26528 From dnsimon at openjdk.org Mon Aug 4 15:04:04 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 4 Aug 2025 15:04:04 GMT Subject: RFR: 8364187: Make getClassAccessFlagsRaw non-native [v5] In-Reply-To: References: <84FVULnDO38-jO9EcshCFwsOx9PRZ920y8KwTt5z0xU=.6b4b8adb-b046-481f-b121-dd1ffa0d7a78@github.com> Message-ID: On Wed, 30 Jul 2025 19:25:34 GMT, Coleen Phillimore wrote: >> This change removes the intrinsic for getClassAccessFlagsRaw for reflection and initializes an rawAccessFlags field in java.lang.Class instead, that Java code can non-natively access. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Restore c2 optimization. src/hotspot/share/prims/jvm.cpp line 1744: > 1742: JVM_END > 1743: > 1744: JVM_ENTRY(jint, JVM_GetClassAccessFlags(JNIEnv *env, jclass cls)) What about the declaration in `jvm.h`? 
https://github.com/openjdk/jdk/blob/6c52b73465b0d0daeafc54c3c6cec3062bf490c5/src/hotspot/share/include/jvm.h#L610 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26517#discussion_r2251770425 From sangheki at openjdk.org Mon Aug 4 16:26:55 2025 From: sangheki at openjdk.org (Sangheon Kim) Date: Mon, 4 Aug 2025 16:26:55 GMT Subject: RFR: 8364254: Serial: Remove soft ref policy update in WhiteBox FullGC [v2] In-Reply-To: References: Message-ID: On Mon, 4 Aug 2025 13:49:14 GMT, Albert Mingkun Yang wrote: >> Simple removing Serial specific logic in `WB_FullGC`, because Serial full-gc handles white-box full-gc already. >> >> Test: tier1-3 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into sgc-whitebox > - sgc-whitebox LGTM ------------- Marked as reviewed by sangheki (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26528#pullrequestreview-3084933418 From kvn at openjdk.org Mon Aug 4 16:46:53 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 4 Aug 2025 16:46:53 GMT Subject: RFR: 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 18:13:30 GMT, Coleen Phillimore wrote: > This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. > Tested with tier1-4 with the flag off, and 1-4 with the flag on. Good. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26579#pullrequestreview-3084984596 From eastigeevich at openjdk.org Mon Aug 4 17:14:58 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 4 Aug 2025 17:14:58 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v3] In-Reply-To: <_Mianizit7riu3uwfRRkJUUMa7CDFW-HPtbjHSYXbYI=.9f1e4dd5-d3a6-4e06-aedf-152f2224ca04@github.com> References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> <_Mianizit7riu3uwfRRkJUUMa7CDFW-HPtbjHSYXbYI=.9f1e4dd5-d3a6-4e06-aedf-152f2224ca04@github.com> Message-ID: On Fri, 1 Aug 2025 16:05:42 GMT, Mikhail Ablakatov wrote: >> Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. >> >> The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. >> >> This has passed tier1-3 and jcstress testing on AArch64. > > Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: > > - Merge branch 'master' > - Address review comments from @theRealAph and @eastig > - use relocInfo::relocType instead of TrampolineCallKind > - move rtype from keys to values; reduce the footprint of the patch > - Lift the vm.opt.TieredCompilation == null requirement from the tests > - Combine the two shared trampoline request hash tables > - 8359359: AArch64: share trampolines between static calls to the same method > > Modify the C2 compiler to share trampoline stubs between static calls > that resolve to the same callee method. Since the relocation target for > all static calls is initially set to the static call resolver stub, the > call's target alone cannot be used to distinguish between different > static method calls. Instead, trampoline stubs should be shared based > on the actual callee. > > The `SharedTrampolineTest.java` was designed to verify the sharing of > trampolines among static calls. However, due to imprecise log analysis, > the test currently passes even when trampolines are not shared. > Additionally, comments within the test suggest ambiguity regarding > whether it was intended to assess trampoline sharing for static calls > or runtime calls. To address these issues and eliminate ambiguity, this > patch renames and updates the existing test. Furthermore, a new test is > introduced, using the existing one as a foundation, to accurately > evaluate trampoline sharing for both static and runtime calls. src/hotspot/share/asm/codeBuffer.hpp line 545: > 543: > 544: typedef Pair> Offsets; > 545: typedef ResizeableResourceHashtable SharedTrampolineRequests; I think code can be simplified. For example, let's a Java method `A` call a static Java method `B` twice and a static Java method `C` three times. 
You will have the following in `SharedTrampolineRequests`:
    key -> value
    {B} -> {relocInfo::static_call_type, {call_B_offset1, call_B_offset2}}
    {C} -> {relocInfo::static_call_type, {call_C_offset1, call_C_offset2, call_C_offset3}}
When you iterate trampoline requests, you don't use the key for a destination if it is for a static call. Instead you use `SharedRuntime::get_resolve_static_call_stub()` as a destination. This means your requests don't differ from requests for `runtime_call_type` trampolines. `runtime_call_type` trampolines have final destinations but `static_call_type` trampolines have a resolver as a destination. So you actually have:
    key -> value
    {trampoline_id1} -> {get_resolve_static_call_stub, {call_B_offset1, call_B_offset2}}
    {trampoline_id2} -> {get_resolve_static_call_stub, {call_C_offset1, call_C_offset2, call_C_offset3}}
    {trampoline_id3} -> {some_runtime_call, {call_RT_offset1, call_RT_offset2, call_RT_offset3}}
IMO based on `SharedRuntime::resolve_helper`, we can have shared trampolines for `relocInfo::opt_virtual_call_type` and `relocInfo::virtual_call_type`. Their trampolines have the corresponding resolver set. I don't see that we write anything specific to a call site into a trampoline. We can have the following:
    class SharedTrampolineRequest {
      LinkedListImpl _offsets;
      address _dest;
    #ifndef PRODUCT
      relocInfo::relocType _reloc_type; // This can be used in asserts to check a shared trampoline is for the same relocType
    #endif
    };
    // A set of requests for shared trampolines.
    // We use a hash table to implement the set.
    // A trampoline id is a key in the table.
    // We don't store an id in SharedTrampolineRequest because it is only used for the table.
    typedef uintptr_t SharedTrampolineId;
    typedef ResizeableResourceHashtable SharedTrampolineRequests;
You will need to modify `MacroAssembler::trampoline_call`:
    SharedTrampolineId tramp_id = (entry.rspec().type() == relocInfo::runtime_call_type) ?
        target : callee;
    code()->share_trampoline_for(tramp_id, entry.target(), offset());
New code for `share_trampoline_for(NOT_PRODUCT_ARG(relocInfo::relocType rel_type) tramp_id, target, offset )`:
    bool created;
    SharedTrampolineRequest* tramp_req = _shared_trampoline_requests->put_if_absent(tramp_id, &created);
    if (created) {
      _shared_trampoline_requests->maybe_grow();
      tramp_req->set_target(target);
      NOT_PRODUCT(tramp_req->set_reloc_type(rel_type));
    }
    assert(tramp_req->reloc_type() == rel_type);
    tramp_req->offsets()->add(offset);
    _finalize_stubs = true;
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2252117186 From eastigeevich at openjdk.org Mon Aug 4 17:42:57 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 4 Aug 2025 17:42:57 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v3] In-Reply-To: <_Mianizit7riu3uwfRRkJUUMa7CDFW-HPtbjHSYXbYI=.9f1e4dd5-d3a6-4e06-aedf-152f2224ca04@github.com> References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> <_Mianizit7riu3uwfRRkJUUMa7CDFW-HPtbjHSYXbYI=.9f1e4dd5-d3a6-4e06-aedf-152f2224ca04@github.com> Message-ID: <4j3wCRqhS-5nHML4TYbpQnfsuxcC3Rj9T49riuifGYw=.5fd293d8-db70-4461-ab0c-5aea817171f2@github.com> On Fri, 1 Aug 2025 16:05:42 GMT, Mikhail Ablakatov wrote: >> Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. >> >> The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared.
Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. >> >> This has passed tier1-3 and jcstress testing on AArch64. > > Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge branch 'master' > - Address review comments from @theRealAph and @eastig > - use relocInfo::relocType instead of TrampolineCallKind > - move rtype from keys to values; reduce the footprint of the patch > - Lift the vm.opt.TieredCompilation == null requirement from the tests > - Combine the two shared trampoline request hash tables > - 8359359: AArch64: share trampolines between static calls to the same method > > Modify the C2 compiler to share trampoline stubs between static calls > that resolve to the same callee method. Since the relocation target for > all static calls is initially set to the static call resolver stub, the > call's target alone cannot be used to distinguish between different > static method calls. Instead, trampoline stubs should be shared based > on the actual callee. > > The `SharedTrampolineTest.java` was designed to verify the sharing of > trampolines among static calls. However, due to imprecise log analysis, > the test currently passes even when trampolines are not shared. > Additionally, comments within the test suggest ambiguity regarding > whether it was intended to assess trampoline sharing for static calls > or runtime calls. To address these issues and eliminate ambiguity, this > patch renames and updates the existing test. 
Furthermore, a new test is > introduced, using the existing one as a foundation, to accurately > evaluate trampoline sharing for both static and runtime calls. This PR needs additional research and changes. ------------- Changes requested by eastigeevich (Committer). PR Review: https://git.openjdk.org/jdk/pull/25954#pullrequestreview-3085207313 From duke at openjdk.org Mon Aug 4 17:45:12 2025 From: duke at openjdk.org (Jonas Norlinder) Date: Mon, 4 Aug 2025 17:45:12 GMT Subject: RFR: 8364638: Refactor and make accumulated GC CPU time code generic Message-ID: Hi all, This PR refactors the newly added GC CPU time code from [JDK-8359110](https://bugs.openjdk.org/browse/JDK-8359110). As a stepping-stone to enable consolidation of CPU time tracking in e.g. hsperf counters and GCTraceCPUTime and to have a unified interface for tracking CPU time of various components in Hotspot this code can be refactored. This PR introduces a new interface to retrieve CPU time for various Hotspot components and it currently supports: CPUTimeUsage::GC::total() // the sum of gc_threads(), vm_thread(), stringdedup() CPUTimeUsage::GC::gc_threads() CPUTimeUsage::GC::vm_thread() CPUTimeUsage::GC::stringdedup() CPUTimeUsage::Runtime::vm_thread() I moved `CPUTimeUsage` to `src/hotspot/share/services` since it seemed fitting as it housed similar performance tracking code like `RuntimeService`, as this is no longer a class that is only specific to GC. I also made a minor improvement in the CPU time logging during exit. 
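As a rough illustration of the aggregation described above, where the GC total is the sum of the GC threads, VM thread and string deduplication components, the arithmetic can be sketched as below. The `GcCpuTimes` struct and `percent_of` helper are hypothetical and are not the `CPUTimeUsage` implementation from the PR; they only model the relationship between the component values and a process-wide total.

```cpp
#include <cstdint>

// Hypothetical per-component CPU time samples in nanoseconds; not HotSpot code.
struct GcCpuTimes {
  std::int64_t gc_threads_ns;
  std::int64_t vm_thread_ns;
  std::int64_t stringdedup_ns;

  // Mirrors the description above: the GC total is the sum of the three parts.
  std::int64_t total_ns() const {
    return gc_threads_ns + vm_thread_ns + stringdedup_ns;
  }
};

// Share of one component in a process-wide total, in percent.
// Returns 0 when the total is not positive, so callers never divide by zero.
double percent_of(std::int64_t part_ns, std::int64_t process_total_ns) {
  if (process_total_ns <= 0) {
    return 0.0;
  }
  return 100.0 * static_cast<double>(part_ns) /
         static_cast<double>(process_total_ns);
}
```

Keeping the components separate and deriving the total on demand means a per-component report and the aggregate cannot drift apart, which seems to be the intent of exposing `total()` alongside the individual getters.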
Since `CPUTimeUsage` supports more components than just GC, I changed the logging flag from `gc,cpu` to `cpu` and created a detailed table:
[12.517s][info][cpu] === CPU time Statistics =============================================================
[12.517s][info][cpu]                                              CPUs
[12.517s][info][cpu]                                  s       %   utilized
[12.517s][info][cpu] Process
[12.517s][info][cpu]   Total                   175.7628  100.00   14.0
[12.517s][info][cpu]   VM Thread                 7.0000    3.98    0.6
[12.517s][info][cpu]   Garbage Collection       72.0000   40.96    5.8
[12.517s][info][cpu]     GC Threads             70.0000   39.83    5.6
[12.517s][info][cpu]     VM Thread               1.0000    0.57    0.1
[12.518s][info][cpu]     String Deduplication    0.0000    0.00    0.0
[12.518s][info][cpu] =====================================================================================
Additionally, if CPU time retrieval fails it should not be the caller's responsibility to log warnings as this would bloat the code unnecessarily. I've noticed that `os` does log a warning for some methods if they fail so I continued on this path. ------------- Commit messages: - Make GC CPU time framework generic Changes: https://git.openjdk.org/jdk/pull/26621/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26621&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8364638 Stats: 229 lines in 11 files changed: 170 ins; 44 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/26621.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26621/head:pull/26621 PR: https://git.openjdk.org/jdk/pull/26621 From duke at openjdk.org Mon Aug 4 17:53:59 2025 From: duke at openjdk.org (duke) Date: Mon, 4 Aug 2025 17:53:59 GMT Subject: RFR: 8360559: Optimize Math.sinh for x86 64 bit platforms [v3] In-Reply-To: <_c3ITcxrb66Z6Bq4dJc5uHhPFawXXfhyOhscoGicO3k=.2214b54a-c875-41be-936c-550efd649b91@github.com> References: <_c3ITcxrb66Z6Bq4dJc5uHhPFawXXfhyOhscoGicO3k=.2214b54a-c875-41be-936c-550efd649b91@github.com> Message-ID: On Thu, 31 Jul 2025 21:32:16 GMT, Mohamed Issa wrote: >> The goal of this PR is to implement an x86_64 intrinsic
for java.lang.Math.sinh() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future. >> >> The command to run all range-specific micro-benchmarks is posted below. >> >> `make test TEST="micro:SinhPerf.SinhPerfRanges"` >> >> The results of all tests posted below were captured with an [Intel® Xeon 8488C](https://advisor.cloudzero.com/aws/ec2/r7i.metal-24xl) using [OpenJDK v26-b4](https://github.com/openjdk/jdk/releases/tag/jdk-26%2B4) as the baseline version. >> >> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides an average uplift of 64% when input values fall into the middle three ranges where heavy computation is required. However, very small inputs and very large inputs show drops of 74% and 66% respectively. >>
>> | Input range(s) | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :------------------------------------: | :-------------------------------: | :--------------------------------: | :--------: |
>> | [-2^(-28), 2^(-28)] | 844160 | 216029 | 0.26x |
>> | [-22, -2^(-28)], [2^(-28), 22] | 81662 | 157351 | 1.93x |
>> | [-709.78, -22], [22, 709.78] | 119075 | 167635 | 1.41x |
>> | [-710.48, -709.78], [709.78, 710.48] | 111636 | 177125 | 1.59x |
>> | (-INF, -710.48], [710.48, INF) | 959296 | 313839 | 0.33x |
>> >> Finally, the `jtreg:test/jdk/java/lang/Math/HyperbolicTests.java` test passed with the changes. > > Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision: > > Make sure xmm1 is initialized before potential use in L_2TAG_PACKET_3_0_2 @missa-prime Your change (at version 3ab7ab3e752991e2eb7f43af68380f1e66be9bec) is now ready to be sponsored by a Committer.
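The speedup column in the throughput table quoted above is the ratio of intrinsic to baseline throughput, and the quoted average uplift is a mean over the middle three rows. A minimal sketch of that arithmetic follows; `RangeResult`, `speedup_hundredths` and `mean_uplift_percent` are hypothetical helper names used only for illustration and are not part of the PR or of the JMH benchmark.

```cpp
#include <cmath>
#include <vector>

// One table row: throughput in ops/ms before and after the intrinsic.
struct RangeResult {
  double baseline_ops_ms;
  double intrinsic_ops_ms;
};

// Speedup as an integer number of hundredths (193 means 1.93x),
// which avoids comparing floating-point values for equality.
long speedup_hundredths(const RangeResult& r) {
  return std::lround(100.0 * r.intrinsic_ops_ms / r.baseline_ops_ms);
}

// Mean uplift in percent over a set of rows; 0 for an empty set.
double mean_uplift_percent(const std::vector<RangeResult>& rows) {
  if (rows.empty()) {
    return 0.0;
  }
  double sum = 0.0;
  for (const RangeResult& r : rows) {
    sum += 100.0 * (r.intrinsic_ops_ms / r.baseline_ops_ms - 1.0);
  }
  return sum / static_cast<double>(rows.size());
}
```

Feeding in the second table row (81662 and 157351 ops/ms) gives 193 hundredths, matching the 1.93x entry, and the mean over the three middle rows rounds to the 64% uplift stated in the description.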
------------- PR Comment: https://git.openjdk.org/jdk/pull/26152#issuecomment-3151813299 From asmehra at openjdk.org Mon Aug 4 18:18:58 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 4 Aug 2025 18:18:58 GMT Subject: RFR: 8364128: Improve gathering of cpu feature names using FormatBuffer [v4] In-Reply-To: References: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com> Message-ID: On Tue, 29 Jul 2025 07:19:31 GMT, Johan Sjölen wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> More compile failure fix >> >> Signed-off-by: Ashutosh Mehra > > It seems like we can just use a `stringStream` instead and not have to deal with any overflow of some buffer, since we strdup at the end. @jdksjolen I updated the code as per your suggestion. Please review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26515#issuecomment-3137144691 From duke at openjdk.org Mon Aug 4 18:24:09 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 4 Aug 2025 18:24:09 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v23] In-Reply-To: References: <-SbK2XJEVxUmf9O-3-mnb3-ULRDnmxT-U3RUs_LDdVQ=.1e31ff6e-3455-4cbe-b323-cdecb94fe08e@github.com> Message-ID: On Mon, 4 Aug 2025 07:41:03 GMT, David Holmes wrote: >> Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: >> >> - 8357086: Fixed merge conflict >> - 8357086: Removed extra line >> - 8357086: Made physical_memory() return size_t, added dummy ATTRIBUTE_NODISCARD >> - 8357086: Small fixes >> - 8357086: Made physical_memory() return void >> - 8357086: Fixed behavior of total_swap_space on Linux >> - 8357086: Addressed reviewer's comments. >> - 8357086: Fixed void conversion. >> - 8357086: Addressed reviewer's comments >> - 8357086: Addressed reviewer's comments >> - ...
and 14 more: https://git.openjdk.org/jdk/compare/d80b5c87...f9b2c6d8 > > src/hotspot/share/runtime/arguments.cpp line 1507: > >> 1505: void Arguments::set_heap_size() { >> 1506: julong phys_mem; >> 1507: size_t physical_mem_val = 0; > > I'm not seeing why you needed to introduce this as a temporary. ?? This is a leftover from the latest change `os::physical_memory(size_t& val) -> size_t os::physical_memory()`. Addressed in the latest commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2250811229 From coleenp at openjdk.org Mon Aug 4 18:41:20 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 4 Aug 2025 18:41:20 GMT Subject: RFR: 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace [v2] In-Reply-To: References: Message-ID: > This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. > Tested with tier1-4 with the flag off, and 1-4 with the flag on. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Apply Yudi patch for graal. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/26579/files - new: https://git.openjdk.org/jdk/pull/26579/files/79399d5d..b9e2e166 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26579&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26579&range=00-01 Stats: 5 lines in 2 files changed: 2 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/26579.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26579/head:pull/26579 PR: https://git.openjdk.org/jdk/pull/26579 From yzheng at openjdk.org Mon Aug 4 18:41:20 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Mon, 4 Aug 2025 18:41:20 GMT Subject: RFR: 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace [v2] In-Reply-To: References: Message-ID: <_xyH5Qpo3w84dlxoEDONYOSiBRIFAkkibBxrQTwd6QE=.7be2eecc-f390-4cdc-9bd0-de35d84f0242@github.com> On Mon, 4 Aug 2025 18:34:52 GMT, Coleen Phillimore wrote: >> This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. >> Tested with tier1-4 with the flag off, and 1-4 with the flag on. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Apply Yudi patch for graal. LGTM ------------- Marked as reviewed by yzheng (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/26579#pullrequestreview-3085349076 From kvn at openjdk.org Mon Aug 4 18:41:21 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 4 Aug 2025 18:41:21 GMT Subject: RFR: 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace [v2] In-Reply-To: References: Message-ID: On Mon, 4 Aug 2025 18:34:52 GMT, Coleen Phillimore wrote: >> This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. >> Tested with tier1-4 with the flag off, and 1-4 with the flag on. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Apply Yudi patch for graal. Re-approved. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26579#pullrequestreview-3085351452 From shade at openjdk.org Mon Aug 4 18:44:31 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 4 Aug 2025 18:44:31 GMT Subject: RFR: 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace [v2] In-Reply-To: References: Message-ID: On Mon, 4 Aug 2025 18:41:20 GMT, Coleen Phillimore wrote: >> This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. >> Tested with tier1-4 with the flag off, and 1-4 with the flag on. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Apply Yudi patch for graal. Let's go! 
------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26579#pullrequestreview-3085368923 From missa at openjdk.org Mon Aug 4 18:51:28 2025 From: missa at openjdk.org (Mohamed Issa) Date: Mon, 4 Aug 2025 18:51:28 GMT Subject: Integrated: 8360559: Optimize Math.sinh for x86 64 bit platforms In-Reply-To: References: Message-ID: On Mon, 7 Jul 2025 03:05:15 GMT, Mohamed Issa wrote: > The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.sinh() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future. > > The command to run all range-specific micro-benchmarks is posted below. > > `make test TEST="micro:SinhPerf.SinhPerfRanges"` > > The results of all tests posted below were captured with an [Intel® Xeon 8488C](https://advisor.cloudzero.com/aws/ec2/r7i.metal-24xl) using [OpenJDK v26-b4](https://github.com/openjdk/jdk/releases/tag/jdk-26%2B4) as the baseline version. > > For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides an average uplift of 64% when input values fall into the middle three ranges where heavy computation is required. However, very small inputs and very large inputs show drops of 74% and 66% respectively.
>
> | Input range(s) | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
> | :------------------------------------: | :-------------------------------: | :--------------------------------: | :--------: |
> | [-2^(-28), 2^(-28)] | 844160 | 216029 | 0.26x |
> | [-22, -2^(-28)], [2^(-28), 22] | 81662 | 157351 | 1.93x |
> | [-709.78, -22], [22, 709.78] | 119075 | 167635 | 1.41x |
> | [-710.48, -709.78], [709.78, 710.48] | 111636 | 177125 | 1.59x |
> | (-INF, -710.48], [710.48, INF) | 959296 | 313839 | 0.33x |
>
> Finally, the `jtreg:test/jdk/java/lang/Math/HyperbolicTests.java` test passed with the changes.
This pull request has now been integrated. Changeset: 05f8a6fc Author: Mohamed Issa Committer: Sandhya Viswanathan URL: https://git.openjdk.org/jdk/commit/05f8a6fca87d472a80e5952ddc90d8fa6589c75c Stats: 742 lines in 26 files changed: 739 ins; 0 del; 3 mod 8360559: Optimize Math.sinh for x86 64 bit platforms Reviewed-by: sviswanathan, sparasa ------------- PR: https://git.openjdk.org/jdk/pull/26152 From rriggs at openjdk.org Mon Aug 4 19:11:16 2025 From: rriggs at openjdk.org (Roger Riggs) Date: Mon, 4 Aug 2025 19:11:16 GMT Subject: RFR: 8360559: Optimize Math.sinh for x86 64 bit platforms [v3] In-Reply-To: <_c3ITcxrb66Z6Bq4dJc5uHhPFawXXfhyOhscoGicO3k=.2214b54a-c875-41be-936c-550efd649b91@github.com> References: <_c3ITcxrb66Z6Bq4dJc5uHhPFawXXfhyOhscoGicO3k=.2214b54a-c875-41be-936c-550efd649b91@github.com> Message-ID: On Thu, 31 Jul 2025 21:32:16 GMT, Mohamed Issa wrote: >> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.sinh() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future. >> >> The command to run all range-specific micro-benchmarks is posted below. >> >> `make test TEST="micro:SinhPerf.SinhPerfRanges"` >> >> The results of all tests posted below were captured with an [Intel®
Xeon 8488C](https://advisor.cloudzero.com/aws/ec2/r7i.metal-24xl) using [OpenJDK v26-b4](https://github.com/openjdk/jdk/releases/tag/jdk-26%2B4) as the baseline version. >> >> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides an average uplift of 64% when input values fall into the middle three ranges where heavy computation is required. However, very small inputs and very large inputs show drops of 74% and 66% respectively.
>>
>> | Input range(s) | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :------------------------------------: | :-------------------------------: | :--------------------------------: | :--------: |
>> | [-2^(-28), 2^(-28)] | 844160 | 216029 | 0.26x |
>> | [-22, -2^(-28)], [2^(-28), 22] | 81662 | 157351 | 1.93x |
>> | [-709.78, -22], [22, 709.78] | 119075 | 167635 | 1.41x |
>> | [-710.48, -709.78], [709.78, 710.48] | 111636 | 177125 | 1.59x |
>> | (-INF, -710.48], [710.48, INF) | 959296 | 313839 | 0.33x |
>>
>> Finally, the `jtreg:test/jdk/java/lang/Math/HyperbolicTests.java` test passed with the changes.
> > Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision: > > Make sure xmm1 is initialized before potential use in L_2TAG_PACKET_3_0_2 This seems to be failing in the CI on the build of: linux-x64-debug-nopch
[2025-08-04T19:02:29,373Z] /......../workspace/open/src/hotspot/cpu/x86/stubGenerator_x86_64_sinh.cpp:293:27: error: 'stub_id' was not declared in this scope; did you mean 'StubId'?
[2025-08-04T19:02:29,373Z] 293 | StubCodeMark mark(this, stub_id); [2025-08-04T19:02:29,373Z] | ^~~~~~~ [2025-08-04T19:02:29,373Z] | StubId [2025-08-04T19:02:29,373Z] ------------- PR Comment: https://git.openjdk.org/jdk/pull/26152#issuecomment-3152016875 From duke at openjdk.org Mon Aug 4 19:34:01 2025 From: duke at openjdk.org (Jonas Norlinder) Date: Mon, 4 Aug 2025 19:34:01 GMT Subject: RFR: 8364638: Refactor and make accumulated GC CPU time code generic [v2] In-Reply-To: References: Message-ID: > Hi all, > > This PR refactors the newly added GC CPU time code from [JDK-8359110](https://bugs.openjdk.org/browse/JDK-8359110). > > As a stepping-stone to enable consolidation of CPU time tracking in e.g. hsperf counters and GCTraceCPUTime and to have a unified interface for tracking CPU time of various components in Hotspot this code can be refactored. This PR introduces a new interface to retrieve CPU time for various Hotspot components and it currently supports: > > CPUTimeUsage::GC::total() // the sum of gc_threads(), vm_thread(), stringdedup() > > CPUTimeUsage::GC::gc_threads() > CPUTimeUsage::GC::vm_thread() > CPUTimeUsage::GC::stringdedup() > > CPUTimeUsage::Runtime::vm_thread() > > > I moved `CPUTimeUsage` to `src/hotspot/share/services` since it seemed fitting as it housed similar performance tracking code like `RuntimeService`, as this is no longer a class that is only specific to GC. > > I also made a minor improvement in the CPU time logging during exit. 
Since `CPUTimeUsage` supports more components than just GC I changed the logging flag from `gc,cpu` to `cpu` and created a detailed table:
>
>
> [12.517s][info][cpu] === CPU time Statistics =============================================================
> [12.517s][info][cpu] CPUs
> [12.517s][info][cpu] s % utilized
> [12.517s][info][cpu] Process
> [12.517s][info][cpu] Total 175.7628 100.00 14.0
> [12.517s][info][cpu] VM Thread 7.0000 3.98 0.6
> [12.517s][info][cpu] Garbage Collection 72.0000 40.96 5.8
> [12.517s][info][cpu] GC Threads 70.0000 39.83 5.6
> [12.517s][info][cpu] VM Thread 1.0000 0.57 0.1
> [12.518s][info][cpu] String Deduplication 0.0000 0.00 0.0
> [12.518s][info][cpu] =====================================================================================
>
>
> Additionally, if CPU time retrieval fails it should not be the caller's responsibility to log warnings as this would bloat the code unnecessarily. I've noticed that `os` does log a warning for some methods if they fail so I continued on this path.
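The `%` and `utilized` columns in the table above appear to be derived from the `s` column. A minimal sketch of that arithmetic follows, assuming the percentage is relative to the process total and that "CPUs utilized" is CPU seconds divided by elapsed wall-clock time (taken here as the 12.517s log timestamp). The helper names `percent_of` and `cpus_utilized` are hypothetical and are not part of `CPUTimeUsage`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: share of the process total in percent.
// Returns 0 rather than inf/nan when the total is not positive.
inline double percent_of(double part_seconds, double total_seconds) {
  return total_seconds > 0.0 ? 100.0 * part_seconds / total_seconds : 0.0;
}

// Hypothetical helper: average number of CPUs kept busy, assuming this is
// CPU seconds divided by elapsed wall-clock seconds.
inline double cpus_utilized(double cpu_seconds, double elapsed_seconds) {
  return elapsed_seconds > 0.0 ? cpu_seconds / elapsed_seconds : 0.0;
}
```

With the logged values, `percent_of(72.0, 175.7628)` is about 40.96 and `cpus_utilized(175.7628, 12.517)` is about 14.0, which matches the Garbage Collection and Total rows.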
Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision: Remove assert which may not hold in tests that exists very fast It takes some time before the OS updates the underlying counters so process CPU time may be out of sync with regards to magnitude order compared to thread CPU time ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26621/files - new: https://git.openjdk.org/jdk/pull/26621/files/4f74c2bd..6a6e3996 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26621&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26621&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26621.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26621/head:pull/26621 PR: https://git.openjdk.org/jdk/pull/26621 From missa at openjdk.org Mon Aug 4 19:50:11 2025 From: missa at openjdk.org (Mohamed Issa) Date: Mon, 4 Aug 2025 19:50:11 GMT Subject: RFR: 8360559: Optimize Math.sinh for x86 64 bit platforms [v3] In-Reply-To: References: <_c3ITcxrb66Z6Bq4dJc5uHhPFawXXfhyOhscoGicO3k=.2214b54a-c875-41be-936c-550efd649b91@github.com> Message-ID: On Mon, 4 Aug 2025 19:06:10 GMT, Roger Riggs wrote: > This seems to be failing in the CI on the build of: linux-x64-debug-nopch > > ``` > [2025-08-04T19:02:29,373Z] /......../workspace/open/src/hotspot/cpu/x86/stubGenerator_x86_64_sinh.cpp:293:27: error: 'stub_id' was not declared in this scope; did you mean 'StubId'? > [2025-08-04T19:02:29,373Z] 293 | StubCodeMark mark(this, stub_id); > [2025-08-04T19:02:29,373Z] | ^~~~~~~ > [2025-08-04T19:02:29,373Z] | StubId > [2025-08-04T19:02:29,373Z] > ``` Thanks for notification. It looks like an incorrect type declaration. I'll submit a fix. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26152#issuecomment-3152125201 From duke at openjdk.org Mon Aug 4 19:53:24 2025 From: duke at openjdk.org (Jonas Norlinder) Date: Mon, 4 Aug 2025 19:53:24 GMT Subject: RFR: 8364638: Refactor and make accumulated GC CPU time code generic [v3] In-Reply-To: References: Message-ID: > Hi all, > > This PR refactors the newly added GC CPU time code from [JDK-8359110](https://bugs.openjdk.org/browse/JDK-8359110). > > As a stepping-stone to enable consolidation of CPU time tracking in e.g. hsperf counters and GCTraceCPUTime and to have a unified interface for tracking CPU time of various components in Hotspot this code can be refactored. This PR introduces a new interface to retrieve CPU time for various Hotspot components and it currently supports: > > CPUTimeUsage::GC::total() // the sum of gc_threads(), vm_thread(), stringdedup() > > CPUTimeUsage::GC::gc_threads() > CPUTimeUsage::GC::vm_thread() > CPUTimeUsage::GC::stringdedup() > > CPUTimeUsage::Runtime::vm_thread() > > > I moved `CPUTimeUsage` to `src/hotspot/share/services` since it seemed fitting as it housed similar performance tracking code like `RuntimeService`, as this is no longer a class that is only specific to GC. > > I also made a minor improvement in the CPU time logging during exit. 
Since `CPUTimeUsage` supports more components than just GC I changed the logging flag from `gc,cpu` to `cpu` and created a detailed table:
>
>
> [12.517s][info][cpu] === CPU time Statistics =============================================================
> [12.517s][info][cpu] CPUs
> [12.517s][info][cpu] s % utilized
> [12.517s][info][cpu] Process
> [12.517s][info][cpu] Total 175.7628 100.00 14.0
> [12.517s][info][cpu] VM Thread 7.0000 3.98 0.6
> [12.517s][info][cpu] Garbage Collection 72.0000 40.96 5.8
> [12.517s][info][cpu] GC Threads 70.0000 39.83 5.6
> [12.517s][info][cpu] VM Thread 1.0000 0.57 0.1
> [12.518s][info][cpu] String Deduplication 0.0000 0.00 0.0
> [12.518s][info][cpu] =====================================================================================
>
>
> Additionally, if CPU time retrieval fails it should not be the caller's responsibility to log warnings as this would bloat the code unnecessarily. I've noticed that `os` does log a warning for some methods if they fail so I continued on this path.
Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision: Prefer returning 0 over +/-inf,nan ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26621/files - new: https://git.openjdk.org/jdk/pull/26621/files/6a6e3996..216ba811 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26621&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26621&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26621.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26621/head:pull/26621 PR: https://git.openjdk.org/jdk/pull/26621 From coleenp at openjdk.org Mon Aug 4 20:15:19 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 4 Aug 2025 20:15:19 GMT Subject: RFR: 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace [v2] In-Reply-To: References: Message-ID: On Mon, 4 Aug 2025 18:41:20 GMT, Coleen Phillimore wrote: >> This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. >> Tested with tier1-4 with the flag off, and 1-4 with the flag on. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Apply Yudi patch for graal. 
GHA looks clean ------------- PR Comment: https://git.openjdk.org/jdk/pull/26579#issuecomment-3152203209 From coleenp at openjdk.org Mon Aug 4 20:15:20 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 4 Aug 2025 20:15:20 GMT Subject: Integrated: 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 18:13:30 GMT, Coleen Phillimore wrote: > This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. > Tested with tier1-4 with the flag off, and 1-4 with the flag on. This pull request has now been integrated. Changeset: da3a5da8 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/da3a5da81bc1d6fe1e47e3a4e65bf390ee1d39a0 Stats: 12 lines in 5 files changed: 6 ins; 0 del; 6 mod 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace Reviewed-by: shade, kvn, yzheng, stuefe, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/26579 From coleenp at openjdk.org Mon Aug 4 21:20:11 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 4 Aug 2025 21:20:11 GMT Subject: RFR: 8343218: Add option to disable allocating interface and abstract classes in non-class metaspace [v2] In-Reply-To: References: Message-ID: On Mon, 4 Aug 2025 18:41:20 GMT, Coleen Phillimore wrote: >> This change introduces a diagnostic flag to disable the change to allocate interface and abstract Klass metadata in the non-class metaspace. Testing for JDK 25 has completed with this feature in place, but there may be cases where we should disable this. A future change will be to remove this code. >> Tested with tier1-4 with the flag off, and 1-4 with the flag on. 
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Apply Yudi patch for graal. Thank you Aleksey, Thomas, David, Vladimir and Yudi. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26579#issuecomment-3152434736 From dlong at openjdk.org Mon Aug 4 21:22:57 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 4 Aug 2025 21:22:57 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v4] In-Reply-To: References: Message-ID: > This PR removes the recently added lock around set_guard_value, using instead Atomic::cmpxchg to atomically update bit-fields of the guard value. Further, it takes a fast-path that uses the previous direct store when at a safepoint. Combined, these changes should get us back to almost where we were before in terms of overhead. If necessary, we could go even further and allow make_not_entrant() to perform a direct byte store, leaving 24 bits for the guard value.
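The lock-free bit-field update described above can be sketched roughly as follows. This is only an illustration using `std::atomic` from standard C++ rather than HotSpot's `Atomic::cmpxchg`; the mask constants, the field layout and the name `update_bits` are assumptions made for the example and do not reflect the actual layout of the guard value.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed layout for illustration only: low 24 bits hold the guard value,
// the top bit marks "not entrant".
constexpr uint32_t kGuardMask     = 0x00FFFFFFu;
constexpr uint32_t kNotEntrantBit = 0x80000000u;

// Atomically replace only the bits selected by `mask`, leaving all other
// bits of `word` as they are. Returns the value stored in `word` afterwards.
inline uint32_t update_bits(std::atomic<uint32_t>& word, uint32_t mask, uint32_t bits) {
  uint32_t old_value = word.load(std::memory_order_relaxed);
  uint32_t new_value;
  do {
    new_value = (old_value & ~mask) | (bits & mask);
    if (new_value == old_value) {
      return old_value;  // nothing to change, skip the store entirely
    }
    // On failure, compare_exchange_weak reloads old_value and we retry.
  } while (!word.compare_exchange_weak(old_value, new_value,
                                       std::memory_order_release,
                                       std::memory_order_relaxed));
  return new_value;
}
```

Because each update recomputes the new word from the freshly observed old word, two threads that change different fields concurrently cannot overwrite each other's bits, which is the property the removed lock was providing.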
Dean Long has updated the pull request incrementally with one additional commit since the last revision: skip icache flush if nothing changed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26399/files - new: https://git.openjdk.org/jdk/pull/26399/files/a06b3446..840750ab Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26399&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26399&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26399.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26399/head:pull/26399 PR: https://git.openjdk.org/jdk/pull/26399 From dlong at openjdk.org Mon Aug 4 21:26:22 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 4 Aug 2025 21:26:22 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v5] In-Reply-To: References: Message-ID: > This PR removes the recently added lock around set_guard_value, using instead Atomic::cmpxchg to atomically update bit-fields of the guard value. Further, it takes a fast-path that uses the previous direct store when at a safepoint. Combined, these changes should get us back to almost where we were before in terms of overhead. If necessary, we could go even further and allow make_not_entrant() to perform a direct byte store, leaving 24 bits for the guard value. 
Dean Long has updated the pull request incrementally with one additional commit since the last revision: one unconditional release should be enough ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26399/files - new: https://git.openjdk.org/jdk/pull/26399/files/840750ab..d9e93db3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26399&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26399&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26399.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26399/head:pull/26399 PR: https://git.openjdk.org/jdk/pull/26399 From dlong at openjdk.org Mon Aug 4 21:26:23 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 4 Aug 2025 21:26:23 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v3] In-Reply-To: References: <_8Z-PXaFqGay1pHqcJmeWXrOFv4QQVqnJG2RuZ7rzTk=.34cc6ecb-e189-461c-971b-f59f899372f5@github.com> Message-ID: <31qxrkcScTxkk9gjEGGuEZEyCnpLe9VapvbgIpOTpow=.5a8777ab-35bf-48e2-8900-3d6ed5c19cf7@github.com> On Fri, 1 Aug 2025 21:38:11 GMT, Martin Doerr wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix PPC64 > > src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp line 84: > >> 82: nativeMovRegMem_at(new_mov_instr.buf)->set_offset(new_value, false /* no icache flush */); >> 83: // Swap in the new value >> 84: uint64_t v = Atomic::cmpxchg(instr, old_mov_instr.u64, new_mov_instr.u64, memory_order_release); > > We have `OrderAccess::release()` above, so `memory_order_release` looks redundant. Shouldn't we use `memory_order_relaxed`, here? I think you are right. But your question about release is making me wonder if we need acquire as well. For example if two threads are racing to disarm, is there a memory visibility problem if we do not use acquire for the CAS, or if we did the release only on a successful CAS? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26399#discussion_r2252604613 From mdoerr at openjdk.org Mon Aug 4 21:50:09 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 4 Aug 2025 21:50:09 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v3] In-Reply-To: <31qxrkcScTxkk9gjEGGuEZEyCnpLe9VapvbgIpOTpow=.5a8777ab-35bf-48e2-8900-3d6ed5c19cf7@github.com> References: <_8Z-PXaFqGay1pHqcJmeWXrOFv4QQVqnJG2RuZ7rzTk=.34cc6ecb-e189-461c-971b-f59f899372f5@github.com> <31qxrkcScTxkk9gjEGGuEZEyCnpLe9VapvbgIpOTpow=.5a8777ab-35bf-48e2-8900-3d6ed5c19cf7@github.com> Message-ID: On Mon, 4 Aug 2025 21:22:12 GMT, Dean Long wrote: >> src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp line 84: >> >>> 82: nativeMovRegMem_at(new_mov_instr.buf)->set_offset(new_value, false /* no icache flush */); >>> 83: // Swap in the new value >>> 84: uint64_t v = Atomic::cmpxchg(instr, old_mov_instr.u64, new_mov_instr.u64, memory_order_release); >> >> We have `OrderAccess::release()` above, so `memory_order_release` looks redundant. Shouldn't we use `memory_order_relaxed`, here? > > I think you are right. But your question about release is making me wonder if we need acquire as well. For example if two threads are racing to disarm, is there a memory visibility problem if we do not use acquire for the CAS, or if we do the release only on a successful CAS on the other platforms. Correct. The acquire barrier is at the end of the nmethod entry barrier: https://github.com/openjdk/jdk/blob/f96b6bcd4ddbb1d0e0a76d9f4e3b43bec20dcb7a/src/hotspot/cpu/ppc/gc/shared/barrierSetAssembler_ppc.cpp#L203 It's not needed if we use a GC with `stw_instruction_and_data_patch`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26399#discussion_r2252641681 From dcubed at openjdk.org Mon Aug 4 23:31:09 2025 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Mon, 4 Aug 2025 23:31:09 GMT Subject: RFR: 8364638: Refactor and make accumulated GC CPU time code generic [v3] In-Reply-To: References: Message-ID: On Mon, 4 Aug 2025 19:53:24 GMT, Jonas Norlinder wrote: >> Hi all, >> >> This PR refactors the newly added GC CPU time code from [JDK-8359110](https://bugs.openjdk.org/browse/JDK-8359110). >> >> As a stepping-stone to enable consolidation of CPU time tracking in e.g. hsperf counters and GCTraceCPUTime and to have a unified interface for tracking CPU time of various components in Hotspot this code can be refactored. This PR introduces a new interface to retrieve CPU time for various Hotspot components and it currently supports: >> >> CPUTimeUsage::GC::total() // the sum of gc_threads(), vm_thread(), stringdedup() >> >> CPUTimeUsage::GC::gc_threads() >> CPUTimeUsage::GC::vm_thread() >> CPUTimeUsage::GC::stringdedup() >> >> CPUTimeUsage::Runtime::vm_thread() >> >> >> I moved `CPUTimeUsage` to `src/hotspot/share/services` since it seemed fitting as it housed similar performance tracking code like `RuntimeService`, as this is no longer a class that is only specific to GC. >> >> I also made a minor improvement in the CPU time logging during exit. 
Since `CPUTimeUsage` supports more components than just GC I changed the logging flag from `gc,cpu` to `cpu` and created a detailed table:
>>
>>
>> [12.517s][info][cpu] === CPU time Statistics =============================================================
>> [12.517s][info][cpu] CPUs
>> [12.517s][info][cpu] s % utilized
>> [12.517s][info][cpu] Process
>> [12.517s][info][cpu] Total 175.7628 100.00 14.0
>> [12.517s][info][cpu] VM Thread 7.0000 3.98 0.6
>> [12.517s][info][cpu] Garbage Collection 72.0000 40.96 5.8
>> [12.517s][info][cpu] GC Threads 70.0000 39.83 5.6
>> [12.517s][info][cpu] VM Thread 1.0000 0.57 0.1
>> [12.518s][info][cpu] String Deduplication 0.0000 0.00 0.0
>> [12.518s][info][cpu] =====================================================================================
>>
>>
>> Additionally, if CPU time retrieval fails it should not be the caller's responsibility to log warnings as this would bloat the code unnecessarily. I've noticed that `os` does log a warning for some methods if they fail so I continued on this path.
> > Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision: > > Prefer returning 0 over +/-inf,nan Seems like JVM/TI folks will be interested also.
------------- PR Comment: https://git.openjdk.org/jdk/pull/26621#issuecomment-3152782164 From dholmes at openjdk.org Tue Aug 5 03:03:04 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 5 Aug 2025 03:03:04 GMT Subject: RFR: 8360515: PROPERFMTARGS should always use size_t template specialization for unit [v6] In-Reply-To: <4P-hI1f0bLknqyybmbVFKn4FbpP91ClS_nlpmgrvnEs=.0b2a81f2-0fdc-4b6e-9dda-d7ad0e38ac1f@github.com> References: <4P-hI1f0bLknqyybmbVFKn4FbpP91ClS_nlpmgrvnEs=.0b2a81f2-0fdc-4b6e-9dda-d7ad0e38ac1f@github.com> Message-ID: <6JSFYVujOVVppyA4tR_1Bw8C5K0J-50FBpqUE2dWahc=.a5aae512-3581-4dac-887c-09bfb125154e@github.com> On Wed, 30 Jul 2025 09:44:41 GMT, Joel Sikström wrote: >> Hello, >> >> PROPERFMT is defined as the format string "%zu%s", which expects a size_t as input argument. When used in combination with PROPERFMTARGS, which uses the templated byte_size_in_proper_units, the byte size may not be size_t if the input is some other type. >> >> To minimize confusion, PROPERFMTARGS should always use the size_t template specialization of byte_size_in_proper_units, to match PROPERFMT. Places that use byte_size_in_proper_units with other types can still use it, but should use their own format strings instead of PROPERFMT. >> >> ProcSmapsSummary::print_on in memMapPrinter_macosx is the only place that uses PROPERFMTARGS with a type that is not size_t. I have changed those places to use the expanded version of the macro, which uses the templated version of byte_size_in_proper_unit instead. >> >> Testing: >> * Oracle's tier1-2 > > Joel Sikström has updated the pull request incrementally with one additional commit since the last revision: > > Revert extra whitespace Sorry for the delay. This seems fine to me. Thanks ------------- Marked as reviewed by dholmes (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25975#pullrequestreview-3086352078 From dlong at openjdk.org Tue Aug 5 04:00:09 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 5 Aug 2025 04:00:09 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v2] In-Reply-To: <_Ye19u_7PlqlsoRSuR0dNeAGbeuHyN_oqD1ZS4q9Nvk=.b94fd29d-d43e-4561-9926-7f5a46434d8e@github.com> References: <_Ye19u_7PlqlsoRSuR0dNeAGbeuHyN_oqD1ZS4q9Nvk=.b94fd29d-d43e-4561-9926-7f5a46434d8e@github.com> Message-ID: On Tue, 1 Jul 2025 09:11:32 GMT, Manuel Hässig wrote: >> This PR adds `-XX:CompileTaskTimeout` on Linux to limit the amount of time a compilation task can run. The goal of this is initially to be able to find and investigate long-running compilations. >> >> The timeout is implemented using a POSIX timer that sends a `SIGALRM` to the compiler thread the compile task is running on. Each compiler thread registers a signal handler that triggers an assert upon receiving `SIGALRM`. This is currently only implemented for Linux, because it relies on `SIGEV_THREAD_ID` to get the signal delivered to the same thread that timed out. >> >> Since `SIGALRM` is now used, the test `runtime/signal/TestSigalrm.java` now requires `vm.flagless` so it will not interfere with the compiler thread signal handlers. >> >> Testing: >> - [x] Github Actions >> - [x] tier1, tier2 on all platforms >> - [x] tier3, tier4 and Oracle internal testing on Linux fastdebug >> - [x] tier1 through tier4 with `-XX:CompileTaskTimeout=60000` (one minute timeout) to see what fails (`compiler/codegen/TestAntiDependenciesHighMemUsage2.java`, `compiler/loopopts/TestMaxLoopOptsCountReached.java`, and `compiler/c2/TestScalarReplacementMaxLiveNodes.java` fail) > > Manuel Hässig has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8308094-timeout > - Fix SIGALRM test > - Add timeout functionality to compiler threads src/hotspot/share/compiler/compileBroker.cpp line 236: > 234: { > 235: MutexLocker notifier(thread, CompileTaskWait_lock); > 236: thread->timeout_disarm(); Is holding the lock above important for disarming? If not, can we move the disarms from the if/else branches and do it unconditionally before the if? src/hotspot/share/compiler/compilerThread.cpp line 97: > 95: switch (signo) { > 96: case TIMEOUT_SIGNAL: { > 97: assert(!Atomic::load_acquire(&_timeout_armed), "compile task timed out"); Why do we need acquire? Only the current thread is ever going to be looking at this value, right? src/hotspot/share/compiler/compilerThread.cpp line 157: > 155: // Start the timer. > 156: timer_settime(_timeout_timer, 0, &its, nullptr); > 157: Atomic::release_store(&_timeout_armed, (bool) true); Same questions about release. Are other threads reading/writing this value? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26023#discussion_r2253044899 PR Review Comment: https://git.openjdk.org/jdk/pull/26023#discussion_r2253045931 PR Review Comment: https://git.openjdk.org/jdk/pull/26023#discussion_r2253046604 From dlong at openjdk.org Tue Aug 5 04:07:04 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 5 Aug 2025 04:07:04 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v2] In-Reply-To: <_Ye19u_7PlqlsoRSuR0dNeAGbeuHyN_oqD1ZS4q9Nvk=.b94fd29d-d43e-4561-9926-7f5a46434d8e@github.com> References: <_Ye19u_7PlqlsoRSuR0dNeAGbeuHyN_oqD1ZS4q9Nvk=.b94fd29d-d43e-4561-9926-7f5a46434d8e@github.com> Message-ID: On Tue, 1 Jul 2025 09:11:32 GMT, Manuel Hässig wrote: >> This PR adds `-XX:CompileTaskTimeout` on Linux to limit the amount of time a compilation task can run.
The goal of this is initially to be able to find and investigate long-running compilations. >> >> The timeout is implemented using a POSIX timer that sends a `SIGALRM` to the compiler thread the compile task is running on. Each compiler thread registers a signal handler that triggers an assert upon receiving `SIGALRM`. This is currently only implemented for Linux, because it relies on `SIGEV_THREAD_ID` to get the signal delivered to the same thread that timed out. >> >> Since `SIGALRM` is now used, the test `runtime/signal/TestSigalrm.java` now requires `vm.flagless` so it will not interfere with the compiler thread signal handlers. >> >> Testing: >> - [x] Github Actions >> - [x] tier1, tier2 on all platforms >> - [x] tier3, tier4 and Oracle internal testing on Linux fastdebug >> - [x] tier1 through tier4 with `-XX:CompileTaskTimeout=60000` (one minute timeout) to see what fails (`compiler/codegen/TestAntiDependenciesHighMemUsage2.java`, `compiler/loopopts/TestMaxLoopOptsCountReached.java`, and `compiler/c2/TestScalarReplacementMaxLiveNodes.java` fail) > > Manuel Hässig has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8308094-timeout > - Fix SIGALRM test > - Add timeout functionality to compiler threads This looks correct, but would it be possible to move the Linux-specific code out of src/hotspot/share?
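For readers unfamiliar with the mechanism the PR description mentions, here is a minimal, Linux-only sketch of a POSIX timer that delivers `SIGALRM` to the calling thread via `SIGEV_THREAD_ID`. It is not the code from the PR: the names `arm_thread_timeout`, `on_timeout` and `g_timed_out` are hypothetical, and the handler only sets a flag where the PR's handler triggers an assert. On glibc older than 2.34 the timer functions may require linking with `-lrt`.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

// Some libc versions do not expose this field under its documented name.
#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid
#endif

// Hypothetical flag set by the handler; a real VM would report the timeout.
static volatile sig_atomic_t g_timed_out = 0;

static void on_timeout(int /*signo*/) { g_timed_out = 1; }

// Arms a one-shot timer that sends SIGALRM to the calling thread after
// timeout_ms milliseconds. Returns false if any system call fails.
bool arm_thread_timeout(timer_t* timer, long timeout_ms) {
  struct sigaction sa{};
  sa.sa_handler = on_timeout;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGALRM, &sa, nullptr) != 0) return false;

  struct sigevent sev{};
  sev.sigev_notify = SIGEV_THREAD_ID;  // deliver to one specific thread
  sev.sigev_signo = SIGALRM;
  sev.sigev_notify_thread_id = static_cast<pid_t>(syscall(SYS_gettid));
  if (timer_create(CLOCK_MONOTONIC, &sev, timer) != 0) return false;

  struct itimerspec its{};  // it_interval stays zero: one-shot
  its.it_value.tv_sec = timeout_ms / 1000;
  its.it_value.tv_nsec = (timeout_ms % 1000) * 1000000L;
  return timer_settime(*timer, 0, &its, nullptr) == 0;
}
```

A caller would arm the timer before starting the work, disarm it by calling `timer_settime` with a zeroed `itimerspec` when the work finishes, and release it with `timer_delete`.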
------------- PR Comment: https://git.openjdk.org/jdk/pull/26023#issuecomment-3153200663 From dholmes at openjdk.org Tue Aug 5 05:19:17 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 5 Aug 2025 05:19:17 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v24] In-Reply-To: References: Message-ID: On Mon, 4 Aug 2025 08:43:48 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value types for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static bool available_memory(size_t& value); >> static julong used_memory(); --> static bool used_memory(size_t& value); >> static julong free_memory(); --> static bool free_memory(size_t& value); >> static jlong total_swap_space(); --> static bool total_swap_space(size_t& value); >> static jlong free_swap_space(); --> static bool free_swap_space(size_t& value); >> static julong physical_memory(); --> static size_t physical_memory(size_t& value); >> >> >> The return boolean value, where available, indicates success, whereas the actual value is assigned to the input argument. The following recommended usage pattern is introduced: where applicable, an unsuccessful call is logged. >> >> `ATTRIBUTE_NODISCARD` macro is added as a placeholder for `[[nodiscard]]`, which will be available with C++17. >> >> Tested in GHA and Tiers 1-5. > > Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 26 commits: > > - 8357086: Addressed reviewer's comments > - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs > - 8357086: Fixed merge conflict > - 8357086: Removed extra line > - 8357086: Made physical_memory() return size_t, added dummy ATTRIBUTE_NODISCARD > - 8357086: Small fixes > - 8357086: Made physical_memory() return void > - 8357086: Fixed behavior of total_swap_space on Linux > - 8357086: Addressed reviewer's comments. 
> - 8357086: Fixed void conversion. > - ... and 16 more: https://git.openjdk.org/jdk/compare/57553ca1...a3898f43 Overall I'm okay with the way the API has turned out here. Naturally it would be better if no errors were possible, but I think containers prevent that from being the case. I have a lot of stylistic comments, many pre-existing, and some notes for future cleanups. But happy to hit approve as a vote of confidence. Thanks src/hotspot/os/bsd/os_bsd.cpp line 137: > 135: > 136: bool os::available_memory(size_t& value) { > 137: return Bsd::available_memory(value); Just an aside for a future cleanup, but there is really no reason to have the indirection through e.g. `Bsd::available_memory`. src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 675: > 673: julong phys_mem = static_cast<julong>(os::Linux::physical_memory()); > 674: log_trace(os, container)("total physical memory: " JULONG_FORMAT, phys_mem); > 675: jlong mem_limit = contrl->controller()->read_memory_limit_in_bytes(phys_mem); Another aside for a future cleanup: the container methods should be taking size_t not julong. src/hotspot/os/linux/os_linux.cpp line 297: > 295: if (OSContainer::is_containerized() && OSContainer::memory_and_swap_limit_in_bytes() > 0) { > 296: if (OSContainer::memory_limit_in_bytes() > 0) { > 297: value = static_cast<size_t>(OSContainer::memory_and_swap_limit_in_bytes() - OSContainer::memory_limit_in_bytes()); Pre-existing but we should really only call these functions once and store the result in a local. src/hotspot/os/linux/os_linux.cpp line 328: > 326: bool host_free_swap_ok = host_free_swap_f(host_free_swap); > 327: if (!total_swap_space_ok || !host_free_swap_ok) { > 328: assert(false, "sysinfo failed ? "); Pre-existing: the assert should be pushed down to where the `sysinfo` calls are so you could see which one failed - not that they can actually fail if called correctly. src/hotspot/os/linux/os_linux.cpp line 330: > 328: assert(false, "sysinfo failed ? 
"); > 329: return false; > 330: } Suggestion: size_t total_swap_space = 0; size_t host_free_swap = 0; if (!os::total_swap_space(total_swap_space) || !host_free_swap_f(host_free_swap)) { assert(false, "sysinfo failed ? "); return false; } src/hotspot/os/windows/os_windows.cpp line 870: > 868: } else { > 869: return false; > 870: } Suggestion: BOOL res = GlobalMemoryStatusEx(&ms); if (res == TRUE) { value = static_cast(ms.ullAvailPhys); } return res == TRUE; } Unfortunately `BOOL` is an implicit boolean. This applies to all the Windows functions. You could also assert that `res == TRUE` and report `GetlastError` if not. src/hotspot/share/jfr/jni/jfrJniMethod.cpp line 418: > 416: #else > 417: size_t phys_mem = os::physical_memory(); > 418: return static_cast(phys_mem); Suggestion: return static_cast(os::physical_memory()); src/hotspot/share/jfr/periodic/jfrPeriodic.cpp line 532: > 530: TRACE_REQUEST_FUNC(PhysicalMemory) { > 531: size_t phys_mem = os::physical_memory(); > 532: u8 totalPhysicalMemory = static_cast(phys_mem); Suggestion: u8 totalPhysicalMemory = static_cast(os::physical_memory()); src/hotspot/share/jfr/periodic/jfrPeriodic.cpp line 536: > 534: event.set_totalSize(totalPhysicalMemory); > 535: size_t avail_mem = 0; > 536: (void)os::available_memory(avail_mem); Suggestion: size_t avail_mem = 0; // Return value ignored - defaulting to 0 on failure. (void)os::available_memory(avail_mem); src/hotspot/share/jfr/periodic/jfrPeriodic.cpp line 547: > 545: event.set_totalSize(static_cast(total_swap_space)); > 546: size_t free_swap_space = 0; > 547: (void)os::free_swap_space(free_swap_space); Suggestion: size_t total_swap_space = 0; // Return value ignored - defaulting to 0 on failure. (void)os::total_swap_space(total_swap_space); event.set_totalSize(static_cast(total_swap_space)); size_t free_swap_space = 0; // Return value ignored - defaulting to 0 on failure. 
(void)os::free_swap_space(free_swap_space); src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 1362: > 1360: InstanceKlass* the_class = get_ik(_class_defs[i].klass); > 1361: size_t avail_mem = 0; > 1362: (void)os::available_memory(avail_mem); Suggestion: size_t avail_mem = 0; // Return value ignored - defaulting to 0 on failure. (void)os::available_memory(avail_mem); src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 1529: > 1527: } > 1528: } > 1529: (void)os::available_memory(avail_mem); Suggestion: avail_mem = 0; // Return value ignored - defaulting to 0 on failure. (void)os::available_memory(avail_mem); src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 4440: > 4438: // direct and indirect subclasses of the_class > 4439: size_t avail_mem = 0; > 4440: (void)os::available_memory(avail_mem); Suggestion: size_t avail_mem = 0; // Return value ignored - defaulting to 0 on failure. (void)os::available_memory(avail_mem); src/hotspot/share/prims/whitebox.cpp line 2516: > 2514: LINUX_ONLY(return static_cast(os::Linux::physical_memory());) > 2515: size_t phys_mem = os::physical_memory(); > 2516: return static_cast(phys_mem); Suggestion: return static_cast(LINUX_ONLY(os::Linux::physical_memory()) NOT_LINUX(os::physical_memory())); Though this seems like a bad code "smell" to me - there is something wrong with our notion of "physical memory" if we have to make this distinction. src/hotspot/share/runtime/os.hpp line 343: > 341: > 342: static size_t physical_memory(); > 343: static bool has_allocatable_memory_limit(size_t* limit); Future cleanup: for consistency should this now also take a reference? src/hotspot/share/services/heapDumper.cpp line 2616: > 2614: // Lets ensure we have at least 20MB per thread. > 2615: size_t free_memory = 0; > 2616: (void)os::free_memory(free_memory); Suggestion: size_t free_memory = 0; // Return value ignored - defaulting to 0 on failure. 
(void)os::free_memory(free_memory); src/hotspot/share/utilities/globalDefinitions.hpp line 1366: > 1364: > 1365: // Dummy placeholder for use of [[nodiscard]] > 1366: #define ATTRIBUTE_NODISCARD Stay-tuned we may actually be able to use `[[nodiscard]]` for real. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25450#pullrequestreview-3086467651 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253073149 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253077906 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253087855 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253092957 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253089924 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253099723 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253104891 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253106908 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253108762 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253109375 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253109727 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253112611 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253113166 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253115953 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253125260 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253126156 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253127032 From dholmes at openjdk.org Tue Aug 5 05:45:06 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 5 Aug 2025 05:45:06 GMT Subject: RFR: 8361103: java_lang_Thread::async_get_stack_trace does not properly protect 
JavaThread [v6] In-Reply-To: References: Message-ID: On Wed, 30 Jul 2025 01:43:37 GMT, Alex Menkov wrote: >> The fix updates `java_lang_Thread::async_get_stack_trace()` (used by `java.lang.Thread.getStackTrace()` to get stack trace for platform and mounted virtual threads) to correctly use `ThreadListHandle` for thread protection. >> >> Testing: tier1..5 > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > fix broken build src/hotspot/share/classfile/javaClasses.cpp line 1885: > 1883: } > 1884: > 1885: oop java_lang_Thread::async_get_stack_trace(jobject jthread, TRAPS) { Please add a comment describing how the method works for virtual threads, and what returning null implies. src/hotspot/share/classfile/javaClasses.cpp line 1897: > 1895: class GetStackTraceHandshakeClosure : public HandshakeClosure { > 1896: public: > 1897: const Handle _thread_oop; Thanks for making this change! Very confusing otherwise. src/hotspot/share/classfile/javaClasses.cpp line 1939: > 1937: // Target thread has been unmounted. > 1938: return; > 1939: } Shouldn't we set `carrier = true` if we don't return here? Or am I misunderstanding what "carrier" means here? src/hotspot/share/classfile/javaClasses.cpp line 1941: > 1939: bool carrier = false; > 1940: if (java_lang_VirtualThread::is_instance(_java_thread())) { > 1941: // if (thread->vthread() != _java_thread()) // We might be inside a System.executeOnCarrierThread Need to check with @AlanBateman whether this check is also needed (he does both in his draft PR). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26119#discussion_r2253138892 PR Review Comment: https://git.openjdk.org/jdk/pull/26119#discussion_r2253140711 PR Review Comment: https://git.openjdk.org/jdk/pull/26119#discussion_r2253161591 PR Review Comment: https://git.openjdk.org/jdk/pull/26119#discussion_r2253160696 From alanb at openjdk.org Tue Aug 5 06:32:07 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 5 Aug 2025 06:32:07 GMT Subject: RFR: 8361103: java_lang_Thread::async_get_stack_trace does not properly protect JavaThread [v6] In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 05:39:03 GMT, David Holmes wrote: >> Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: >> >> fix broken build > > src/hotspot/share/classfile/javaClasses.cpp line 1941: > >> 1939: bool carrier = false; >> 1940: if (java_lang_VirtualThread::is_instance(_java_thread())) { >> 1941: // if (thread->vthread() != _java_thread()) // We might be inside a System.executeOnCarrierThread > > Need to check with @AlanBateman whether this check is also needed (he does both in his draft PR). ThreadSnapshotFactory::get_thread_snapshot needs to fail (and the caller retry) if called while the target virtual thread is in transition. This is because it captures the thread state in addition to the stack trace, and several other properties. So this is why (in the draft changes) it checks that thread identity is set and that the continuation is mounted. java_lang_Thread::async_get_stack_trace just needs to walk the continuation stack so checking that the continuation is mounted should be enough here. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26119#discussion_r2253284124 From jsikstro at openjdk.org Tue Aug 5 07:47:13 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Tue, 5 Aug 2025 07:47:13 GMT Subject: RFR: 8360515: PROPERFMTARGS should always use size_t template specialization for unit [v7] In-Reply-To: References: Message-ID: <_FLCi2eIdu0hzKrNBb08oFhVXcHZ5EFFo-mC53i12h8=.1b6c0c6a-22ed-4f6a-a8fb-eb0aadd83c6a@github.com> > Hello, > > PROPERFMT is defined as the format string "%zu%s", which expects a size_t as input argument. When used in combination with PROPERFMTARGS, which uses the templated byte_size_in_proper_unit, the byte size may not be size_t if the input is some other type. > > To minimize confusion, PROPERFMTARGS should always use the size_t template specialization of byte_size_in_proper_unit, to match PROPERFMT. Places that use byte_size_in_proper_unit with other types can still use it, but should use their own format strings instead of PROPERFMT. > > ProcSmapsSummary::print_on in memMapPrinter_macosx and PSAdaptiveSizePolicy::print_stats in psAdaptiveSizePolicy.cpp are the only places that use PROPERFMTARGS with a type that is not size_t. I have changed those places to use the expanded version of the macro, which uses the templated version of byte_size_in_proper_unit instead. > > Testing: > * Oracle's tier1-2 Joel Sikström has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains eight additional commits since the last revision: - Merge branch 'master' into JDK-8360515_properfmtargs - Revert extra whitespace - Replace PROPERFMTARGs with expanded version in psAdaptiveSizePolicy.cpp - Other alternative - More consistency - Merge branch 'master' into JDK-8360515_properfmtargs - Merge branch 'master' into JDK-8360515_properfmtargs - 8360515: PROPERFMTARGS should always use size_t template specialization for unit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25975/files - new: https://git.openjdk.org/jdk/pull/25975/files/b4f7c18e..b5bb4b4e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25975&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25975&range=05-06 Stats: 16883 lines in 442 files changed: 10489 ins; 5424 del; 970 mod Patch: https://git.openjdk.org/jdk/pull/25975.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25975/head:pull/25975 PR: https://git.openjdk.org/jdk/pull/25975 From jsikstro at openjdk.org Tue Aug 5 07:47:14 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Tue, 5 Aug 2025 07:47:14 GMT Subject: RFR: 8360515: PROPERFMTARGS should always use size_t template specialization for unit [v5] In-Reply-To: References: Message-ID: On Mon, 28 Jul 2025 14:15:11 GMT, Thomas Stuefe wrote: >> Joel Sikström has updated the pull request incrementally with two additional commits since the last revision: >> >> - Replace PROPERFMTARGs with expanded version in psAdaptiveSizePolicy.cpp >> - Other alternative > > Why is the cast needed, now? Thank you for the reviews!
@tstuefe @dholmes-ora ------------- PR Comment: https://git.openjdk.org/jdk/pull/25975#issuecomment-3153883668 From jsikstro at openjdk.org Tue Aug 5 07:47:15 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Tue, 5 Aug 2025 07:47:15 GMT Subject: Integrated: 8360515: PROPERFMTARGS should always use size_t template specialization for unit In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 11:10:41 GMT, Joel Sikström wrote: > Hello, > > PROPERFMT is defined as the format string "%zu%s", which expects a size_t as input argument. When used in combination with PROPERFMTARGS, which uses the templated byte_size_in_proper_unit, the byte size may not be size_t if the input is some other type. > > To minimize confusion, PROPERFMTARGS should always use the size_t template specialization of byte_size_in_proper_unit, to match PROPERFMT. Places that use byte_size_in_proper_unit with other types can still use it, but should use their own format strings instead of PROPERFMT. > > ProcSmapsSummary::print_on in memMapPrinter_macosx and PSAdaptiveSizePolicy::print_stats in psAdaptiveSizePolicy.cpp are the only places that use PROPERFMTARGS with a type that is not size_t. I have changed those places to use the expanded version of the macro, which uses the templated version of byte_size_in_proper_unit instead. > > Testing: > * Oracle's tier1-2 This pull request has now been integrated.
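The mismatch that 8360515 addresses can be illustrated with a small standalone sketch. Only the `"%zu%s"` format string is taken from the description; `proper_unit_for`, `format_proper`, `format_proper_u64`, the thresholds, and this simplified `byte_size_in_proper_unit` are stand-ins written for the example and are not the HotSpot definitions.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Simplified stand-ins (assumptions for illustration, not the HotSpot code).
static const size_t K = 1024;
static const size_t M = K * K;
static const size_t G = M * K;

// Pick a unit suffix for a byte count of any unsigned integer type.
template <typename T>
const char* proper_unit_for(T s) {
  if (s >= static_cast<T>(G)) return "G";
  if (s >= static_cast<T>(M)) return "M";
  if (s >= static_cast<T>(K)) return "K";
  return "B";
}

// Scale the byte count to that unit; the result has the caller's type T.
template <typename T>
T byte_size_in_proper_unit(T s) {
  if (s >= static_cast<T>(G)) return s / static_cast<T>(G);
  if (s >= static_cast<T>(M)) return s / static_cast<T>(M);
  if (s >= static_cast<T>(K)) return s / static_cast<T>(K);
  return s;
}

// "%zu%s" requires a size_t argument, so the size_t specialization is
// named explicitly rather than deduced from whatever the caller passes.
std::string format_proper(size_t bytes) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%zu%s",
                byte_size_in_proper_unit<size_t>(bytes),
                proper_unit_for<size_t>(bytes));
  return buf;
}

// A caller holding another type keeps the template but supplies a
// format string that matches that type instead of "%zu".
std::string format_proper_u64(uint64_t bytes) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%" PRIu64 "%s",
                byte_size_in_proper_unit<uint64_t>(bytes),
                proper_unit_for<uint64_t>(bytes));
  return buf;
}
```

Passing a 64-bit value to `"%zu"` happens to work where `size_t` is 64 bits wide but is undefined behavior in general, which is presumably why the description asks other types to bring their own format string.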
Changeset: febd4b26 Author: Joel Sikström URL: https://git.openjdk.org/jdk/commit/febd4b26b2c87030affd9f93524e0d951cbe74e7 Stats: 8 lines in 3 files changed: 1 ins; 0 del; 7 mod 8360515: PROPERFMTARGS should always use size_t template specialization for unit Reviewed-by: dholmes, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/25975 From duke at openjdk.org Tue Aug 5 08:27:08 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 5 Aug 2025 08:27:08 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v25] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value types for functions which return memory sizes are changed, namely: > > > static julong available_memory(); --> static bool available_memory(size_t& value); > static julong used_memory(); --> static bool used_memory(size_t& value); > static julong free_memory(); --> static bool free_memory(size_t& value); > static jlong total_swap_space(); --> static bool total_swap_space(size_t& value); > static jlong free_swap_space(); --> static bool free_swap_space(size_t& value); > static julong physical_memory(); --> static size_t physical_memory(); > > > The boolean return value, where available, indicates success, whereas the actual value is assigned to the reference argument. The following recommended usage pattern is introduced: where applicable, an unsuccessful call is logged. > > `ATTRIBUTE_NODISCARD` macro is added as a placeholder for `[[nodiscard]]`, which will be available with C++17. > > Tested in GHA and Tiers 1-5.
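As a rough illustration of the API shape described for 8357086, the sketch below shows a bool-returning query with a `size_t&` out-parameter, one call site that checks and logs, and one that deliberately ignores the result. `query_available_memory`, its `simulate_failure` parameter, and the 512 MB figure are invented for this example; only the empty `ATTRIBUTE_NODISCARD` placeholder and the `(void)` idiom come from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Placeholder as in the PR description: expands to nothing until
// [[nodiscard]] can be used for real.
#define ATTRIBUTE_NODISCARD

// Hypothetical stand-in for a platform query. `simulate_failure` exists
// only so this sketch can exercise both outcomes.
ATTRIBUTE_NODISCARD static bool query_available_memory(size_t& value,
                                                       bool simulate_failure) {
  if (simulate_failure) {
    return false;  // `value` is left untouched on failure.
  }
  value = static_cast<size_t>(512) * 1024 * 1024;  // Made-up figure.
  return true;
}

// Checked call site: an unsuccessful call is logged.
size_t available_or_zero(bool simulate_failure) {
  size_t avail = 0;
  if (!query_available_memory(avail, simulate_failure)) {
    std::fprintf(stderr, "available memory query failed\n");
  }
  return avail;
}

// Call site that does not care about failure and says so explicitly.
size_t available_ignoring_errors(bool simulate_failure) {
  size_t avail = 0;
  // Return value ignored - defaulting to 0 on failure.
  (void)query_available_memory(avail, simulate_failure);
  return avail;
}
```

Initializing the local before the call is what makes "defaulting to 0 on failure" true; the callee in this sketch never writes to the argument when it fails.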
Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8357086: Addressed reviewer's comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25450/files - new: https://git.openjdk.org/jdk/pull/25450/files/a3898f43..fbceea99 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=24 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=23-24 Stats: 28 lines in 7 files changed: 15 ins; 3 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From duke at openjdk.org Tue Aug 5 08:27:16 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 5 Aug 2025 08:27:16 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v24] In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 04:51:10 GMT, David Holmes wrote: >> Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 26 commits: >> >> - 8357086: Addressed reviewer's comments >> - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs >> - 8357086: Fixed merge conflict >> - 8357086: Removed extra line >> - 8357086: Made physical_memory() return size_t, added dummy ATTRIBUTE_NODISCARD >> - 8357086: Small fixes >> - 8357086: Made physical_memory() return void >> - 8357086: Fixed behavior of total_swap_space on Linux >> - 8357086: Addressed reviewer's comments. >> - 8357086: Fixed void conversion. >> - ... and 16 more: https://git.openjdk.org/jdk/compare/57553ca1...a3898f43 > > src/hotspot/os/windows/os_windows.cpp line 870: > >> 868: } else { >> 869: return false; >> 870: } > > Suggestion: > > BOOL res = GlobalMemoryStatusEx(&ms); > if (res == TRUE) { > value = static_cast(ms.ullAvailPhys); > } > return res == TRUE; > } > > Unfortunately `BOOL` is an implicit boolean. 
> > This applies to all the Windows functions. > > You could also assert that `res == TRUE` and report `GetlastError` if not. Addressed, though it looks like GlobalMemoryStatusEx never fails. > src/hotspot/share/jfr/jni/jfrJniMethod.cpp line 418: > >> 416: #else >> 417: size_t phys_mem = os::physical_memory(); >> 418: return static_cast(phys_mem); > > Suggestion: > > return static_cast(os::physical_memory()); Addressed. > src/hotspot/share/jfr/periodic/jfrPeriodic.cpp line 532: > >> 530: TRACE_REQUEST_FUNC(PhysicalMemory) { >> 531: size_t phys_mem = os::physical_memory(); >> 532: u8 totalPhysicalMemory = static_cast(phys_mem); > > Suggestion: > > u8 totalPhysicalMemory = static_cast(os::physical_memory()); Addressed. > src/hotspot/share/jfr/periodic/jfrPeriodic.cpp line 536: > >> 534: event.set_totalSize(totalPhysicalMemory); >> 535: size_t avail_mem = 0; >> 536: (void)os::available_memory(avail_mem); > > Suggestion: > > size_t avail_mem = 0; > // Return value ignored - defaulting to 0 on failure. > (void)os::available_memory(avail_mem); Addressed. > src/hotspot/share/jfr/periodic/jfrPeriodic.cpp line 547: > >> 545: event.set_totalSize(static_cast(total_swap_space)); >> 546: size_t free_swap_space = 0; >> 547: (void)os::free_swap_space(free_swap_space); > > Suggestion: > > size_t total_swap_space = 0; > // Return value ignored - defaulting to 0 on failure. > (void)os::total_swap_space(total_swap_space); > event.set_totalSize(static_cast(total_swap_space)); > size_t free_swap_space = 0; > // Return value ignored - defaulting to 0 on failure. > (void)os::free_swap_space(free_swap_space); Addressed. > src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 1362: > >> 1360: InstanceKlass* the_class = get_ik(_class_defs[i].klass); >> 1361: size_t avail_mem = 0; >> 1362: (void)os::available_memory(avail_mem); > > Suggestion: > > size_t avail_mem = 0; > // Return value ignored - defaulting to 0 on failure. > (void)os::available_memory(avail_mem); Addressed. 
> src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 1529: > >> 1527: } >> 1528: } >> 1529: (void)os::available_memory(avail_mem); > > Suggestion: > > avail_mem = 0; > // Return value ignored - defaulting to 0 on failure. > (void)os::available_memory(avail_mem); Addressed. > src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 4440: > >> 4438: // direct and indirect subclasses of the_class >> 4439: size_t avail_mem = 0; >> 4440: (void)os::available_memory(avail_mem); > > Suggestion: > > size_t avail_mem = 0; > // Return value ignored - defaulting to 0 on failure. > (void)os::available_memory(avail_mem); Addressed. > src/hotspot/share/prims/whitebox.cpp line 2516: > >> 2514: LINUX_ONLY(return static_cast(os::Linux::physical_memory());) >> 2515: size_t phys_mem = os::physical_memory(); >> 2516: return static_cast(phys_mem); > > Suggestion: > > return static_cast(LINUX_ONLY(os::Linux::physical_memory()) NOT_LINUX(os::physical_memory())); > > Though this seems like a bad code "smell" to me - there is something wrong with our notion of "physical memory" if we have to make this distinction. Addressed. > src/hotspot/share/services/heapDumper.cpp line 2616: > >> 2614: // Lets ensure we have at least 20MB per thread. >> 2615: size_t free_memory = 0; >> 2616: (void)os::free_memory(free_memory); > > Suggestion: > > size_t free_memory = 0; > // Return value ignored - defaulting to 0 on failure. > (void)os::free_memory(free_memory); Addressed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253531025 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253530680 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253530386 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253529920 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253529670 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253529181 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253528912 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253528737 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253527398 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253527011 From duke at openjdk.org Tue Aug 5 08:29:15 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 5 Aug 2025 08:29:15 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v24] In-Reply-To: References: Message-ID: <7685_Xh8MjRqxoEx2BWfpTcdPXq5-aDuo0XZfTcco5w=.00416a73-5237-4381-a79d-8973f47b19ed@github.com> On Tue, 5 Aug 2025 04:39:49 GMT, David Holmes wrote: >> Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 26 commits: >> >> - 8357086: Addressed reviewer's comments >> - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs >> - 8357086: Fixed merge conflict >> - 8357086: Removed extra line >> - 8357086: Made physical_memory() return size_t, added dummy ATTRIBUTE_NODISCARD >> - 8357086: Small fixes >> - 8357086: Made physical_memory() return void >> - 8357086: Fixed behavior of total_swap_space on Linux >> - 8357086: Addressed reviewer's comments. >> - 8357086: Fixed void conversion. >> - ... 
and 16 more: https://git.openjdk.org/jdk/compare/57553ca1...a3898f43 > > src/hotspot/os/linux/os_linux.cpp line 297: > >> 295: if (OSContainer::is_containerized() && OSContainer::memory_and_swap_limit_in_bytes() > 0) { >> 296: if (OSContainer::memory_limit_in_bytes() > 0) { >> 297: value = static_cast(OSContainer::memory_and_swap_limit_in_bytes() - OSContainer::memory_limit_in_bytes()); > > Pre-existing but we should really only calls these functions once and store the result in a local. Addressed. > src/hotspot/os/linux/os_linux.cpp line 328: > >> 326: bool host_free_swap_ok = host_free_swap_f(host_free_swap); >> 327: if (!total_swap_space_ok || !host_free_swap_ok) { >> 328: assert(false, "sysinfo failed ? "); > > Pre-existing: the assert should be pushed down to where the `sysinfo` calls are so you could see which one failed - not that they can actually fail if called correctly. I think this would change the usage pattern again. Do we want a failing assert in every memory function in case of failure? I thought the consensus was that a false value returned by the function indicates that. In this particular case it will be easy to trace which call failed, as the results are stored in local variables. > src/hotspot/os/linux/os_linux.cpp line 330: > >> 328: assert(false, "sysinfo failed ? "); >> 329: return false; >> 330: } > > Suggestion: > > size_t total_swap_space = 0; > size_t host_free_swap = 0; > if (!os::total_swap_space(total_swap_space) || !host_free_swap_f(host_free_swap)) { > assert(false, "sysinfo failed ? "); > return false; > } I suggest keeping the results as locals, see my previous comment.
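The "call these functions once and store the result in a local" point from the os_linux.cpp comments above might look roughly like the following. `memory_and_swap_limit` and `memory_limit` are hypothetical stand-ins for the `OSContainer` queries, with made-up return values and a call counter that exists only so the effect is observable; the `mem_swap >= mem` guard is an extra assumption of this sketch and is not taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts how often the (hypothetical) container queries are invoked.
static int g_calls = 0;

// Stand-ins for the container limit queries; the values are invented.
static int64_t memory_and_swap_limit() {
  ++g_calls;
  return static_cast<int64_t>(3) << 29;  // 1.5 GB, memory plus swap
}

static int64_t memory_limit() {
  ++g_calls;
  return static_cast<int64_t>(1) << 29;  // 0.5 GB, memory only
}

// Each limit is read exactly once and kept in a local, so the check and
// the subtraction are guaranteed to see the same values.
bool container_swap_limit(size_t& value) {
  const int64_t mem_swap = memory_and_swap_limit();
  const int64_t mem = memory_limit();
  // The ordering guard avoids a negative difference (assumption, see above).
  if (mem_swap > 0 && mem > 0 && mem_swap >= mem) {
    value = static_cast<size_t>(mem_swap - mem);
    return true;
  }
  return false;
}
```

Besides saving repeated reads, the locals also make it straightforward to see in a debugger which of the two queries produced an unexpected value, which is the argument made above for keeping the results as locals.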
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253544122 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253543879 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2253543435 From dholmes at openjdk.org Tue Aug 5 08:35:11 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 5 Aug 2025 08:35:11 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 15:49:57 GMT, Chen Liang wrote: > Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value). VM side seems reasonable. Java side looks a bit clunky but that is up to core-libs folk to evaluate. General note: if comments start with a capital they should end with a period, and vice-versa. Extended comments should consist of full sentences. Thanks. src/hotspot/share/classfile/javaClasses.cpp line 3078: > 3076: do { > 3077: // No backtrace available. > 3078: if (!iter.repeat()) return false; Suggestion: if (!iter.repeat()) { return false; } src/hotspot/share/interpreter/bytecodeUtils.cpp line 346: > 344: > 345: if (found && is_parameter) { > 346: // Check MethodParameters for a name, if it carries a name Suggestion: // check MethodParameters for a name, if it carries a name src/hotspot/share/interpreter/bytecodeUtils.cpp line 354: > 352: char *var = cp->symbol_at(elem.name_cp_index)->as_C_string(); > 353: os->print("%s", var); > 354: No need for blank line src/hotspot/share/interpreter/bytecodeUtils.cpp line 1483: > 1481: // Is an explicit slot given? > 1482: bool explicit_search = slot >= 0; > 1483: if (explicit_search) { Suggestion: if (slot >= 0) { No need for the temporary local. 
src/hotspot/share/interpreter/bytecodeUtils.hpp line 40: > 38: // Slot can be nonnegative to indicate an explicit search for the source of null > 39: // If slot is negative (default), also search for the action that caused the NPE before > 40: // deriving the actual slot and source of null by code parsing Suggestion: // Slot can be nonnegative to indicate an explicit search for the source of null. // If slot is negative (default), also search for the action that caused the NPE before // deriving the actual slot and source of null by code parsing. Periods at the end of sentences in comments. src/java.base/share/classes/java/lang/NullPointerException.java line 78: > 76: NullPointerException(int stackOffset, int searchSlot) { > 77: extendedMessageState = setupCustomBackTrace(stackOffset, searchSlot); > 78: this(); I thought `this()` had to come first in a constructor? Is this new in 25 or 26? src/java.base/share/classes/java/lang/NullPointerException.java line 101: > 99: SEARCH_SLOT_MASK = SEARCH_SLOT_MAX << SEARCH_SLOT_SHIFT; > 100: > 101: // Access these fields in object monitor only Suggestion: // Access these fields only while holding this object's monitor lock. src/java.base/share/classes/java/lang/NullPointerException.java line 162: > 160: /// and the search slot will be derived by bytecode tracing. The message > 161: /// will also include the action that caused the NPE besides the source of > 162: /// the `null`. Suggestion: /// will also include the action that caused the NPE along with the source /// of the `null`. 
------------- PR Review: https://git.openjdk.org/jdk/pull/26600#pullrequestreview-3087045581 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2253509165 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2253517408 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2253519550 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2253472224 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2253481631 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2253558481 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2253541806 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2253487877 From mhaessig at openjdk.org Tue Aug 5 08:55:06 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Tue, 5 Aug 2025 08:55:06 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v2] In-Reply-To: References: <_Ye19u_7PlqlsoRSuR0dNeAGbeuHyN_oqD1ZS4q9Nvk=.b94fd29d-d43e-4561-9926-7f5a46434d8e@github.com> Message-ID: On Tue, 5 Aug 2025 03:56:41 GMT, Dean Long wrote: >> Manuel Hässig has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8308094-timeout >> - Fix SIGALRM test >> - Add timeout functionality to compiler threads > > src/hotspot/share/compiler/compilerThread.cpp line 97: > >> 95: switch (signo) { >> 96: case TIMEOUT_SIGNAL: { >> 97: assert(!Atomic::load_acquire(&_timeout_armed), "compile task timed out"); > > Why do we need acquire? Only the current thread is ever going to be looking at this value, right?
The compiler thread, which sets and unsets the flag, and the signal handler, which reads it, race with each other as soon as the timer is set, since signals are preemptive. The acquire prevents a few false-positive timeouts on architectures with weak memory models, but has no effect on x86, for example. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26023#discussion_r2253611427 From jsjolen at openjdk.org Tue Aug 5 09:59:03 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 5 Aug 2025 09:59:03 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 08:32:33 GMT, David Holmes wrote: >> Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value). > > src/java.base/share/classes/java/lang/NullPointerException.java line 78: > >> 76: NullPointerException(int stackOffset, int searchSlot) { >> 77: extendedMessageState = setupCustomBackTrace(stackOffset, searchSlot); >> 78: this(); > > I thought `this()` had to come first in a constructor? Is this new in 25 or 26? https://openjdk.org/jeps/492 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2253816429 From mdoerr at openjdk.org Tue Aug 5 10:16:06 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 5 Aug 2025 10:16:06 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v5] In-Reply-To: References: Message-ID: On Mon, 4 Aug 2025 21:26:22 GMT, Dean Long wrote: >> This PR removes the recently added lock around set_guard_value, using instead Atomic::cmpxchg to atomically update bit-fields of the guard value. Further, it takes a fast-path that uses the previous direct store when at a safepoint.
Combined, these changes should get us back to almost where we were before in terms of overhead. If necessary, we could go even further and allow make_not_entrant() to perform a direct byte store, leaving 24 bits for the guard value. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > one unconditional release should be enough Thanks for implementing nice code for PPC64! I appreciate it! The shared code and the other platforms look fine, too. Maybe atomic bitwise operations could be used, but I'm happy with your current solution. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26399#pullrequestreview-3087610323 From jsjolen at openjdk.org Tue Aug 5 10:23:07 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 5 Aug 2025 10:23:07 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs In-Reply-To: References: Message-ID: On Fri, 1 Aug 2025 15:49:57 GMT, Chen Liang wrote: > Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value). Still reading this. A few comments, mostly intended to make this easier to grok for future readers. The actual logic seems fine from a first read. src/java.base/share/classes/java/lang/NullPointerException.java line 159: > 157: /// configurations to trace how a particular argument, which turns out to > 158: /// be `null`, was evaluated. > 159: /// 2. `searchSlot < 0`, stack offset is 0 (a call to the nullary constructor) Can `searchSlot == -1` here?
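For readers following the 8361376 review quoted earlier in this message, the idea of updating one bit-field of a shared word with compare-and-swap can be sketched using `std::atomic`. The layout below (`kFlagBit`, `kValueMask`), the global `g_guard`, and `set_flag` are assumptions made for the example and do not reflect the actual nmethod guard encoding or HotSpot's `Atomic` API.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed layout for illustration: the top bit is a flag, the low 31
// bits hold the guard value.
constexpr uint32_t kFlagBit = 1u << 31;
constexpr uint32_t kValueMask = ~kFlagBit;

std::atomic<uint32_t> g_guard{0};

// Replace only the value bits. If another thread changes the word in the
// meantime, compare_exchange_weak reloads old_word and the loop retries.
void set_guard_value(uint32_t new_value) {
  uint32_t old_word = g_guard.load(std::memory_order_relaxed);
  uint32_t new_word;
  do {
    new_word = (old_word & ~kValueMask) | (new_value & kValueMask);
  } while (!g_guard.compare_exchange_weak(old_word, new_word,
                                          std::memory_order_release,
                                          std::memory_order_relaxed));
}

// Setting a single bit needs no loop: an atomic bitwise OR is enough.
void set_flag() {
  g_guard.fetch_or(kFlagBit, std::memory_order_release);
}
```

The `fetch_or` variant is in the spirit of the "atomic bitwise operations" remark above; a multi-bit field generally still needs the retry loop because clearing and setting bits in one step is not a single OR or AND.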
------------- PR Review: https://git.openjdk.org/jdk/pull/26600#pullrequestreview-3087588630 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2253882841 From jsjolen at openjdk.org Tue Aug 5 10:23:08 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 5 Aug 2025 10:23:08 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 07:55:41 GMT, David Holmes wrote: >> Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value). > > src/hotspot/share/interpreter/bytecodeUtils.cpp line 1483: > >> 1481: // Is an explicit slot given? >> 1482: bool explicit_search = slot >= 0; >> 1483: if (explicit_search) { > > Suggestion: > > if (slot >= 0) { > > No need for the temporary local. If we don't adopt the enum I suggested, then I do prefer the naming of the condition through a variable. It gives you a hint of what the check is looking for, and not only what the check does. > src/hotspot/share/interpreter/bytecodeUtils.hpp line 40: > >> 38: // Slot can be nonnegative to indicate an explicit search for the source of null >> 39: // If slot is negative (default), also search for the action that caused the NPE before >> 40: // deriving the actual slot and source of null by code parsing > > Suggestion: > > // Slot can be nonnegative to indicate an explicit search for the source of null. > // If slot is negative (default), also search for the action that caused the NPE before > // deriving the actual slot and source of null by code parsing. > > Periods at the end of sentences in comments. I'd like to see the description for `slot` pushed into an enum, and make any negative number except `-1` explicitly forbidden. 
```c++
enum class NPESlot : int {
  // docs here
  None = -1,
  // docs here
  Explicit // Anything >= 0
};
bool get_NPE_message_at(outputStream* ss, Method* method, int bci, NPESlot slot) {
  if (slot != NPESlot::None) {
    // Explicit search
  }
}
```

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2253888922
PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2253865251

From mhaessig at openjdk.org Tue Aug 5 10:32:07 2025
From: mhaessig at openjdk.org (Manuel Hässig)
Date: Tue, 5 Aug 2025 10:32:07 GMT
Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v2]
In-Reply-To: 
References: <_Ye19u_7PlqlsoRSuR0dNeAGbeuHyN_oqD1ZS4q9Nvk=.b94fd29d-d43e-4561-9926-7f5a46434d8e@github.com>
Message-ID: 

On Tue, 5 Aug 2025 03:55:35 GMT, Dean Long wrote:

>> Manuel Hässig has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>>
>> - Merge branch 'master' into JDK-8308094-timeout
>> - Fix SIGALRM test
>> - Add timeout functionality to compiler threads
>
> src/hotspot/share/compiler/compileBroker.cpp line 236:
>
>> 234: {
>> 235: MutexLocker notifier(thread, CompileTaskWait_lock);
>> 236: thread->timeout_disarm();
>
> Is holding the lock above important for disarming? If not, can we move the disarms from the if/else branches and do it unconditionally before the if?

It is not. I'll move it above the `if`.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26023#discussion_r2253919668

From fgao at openjdk.org Tue Aug 5 10:36:17 2025
From: fgao at openjdk.org (Fei Gao)
Date: Tue, 5 Aug 2025 10:36:17 GMT
Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub()
Message-ID: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com>

In the existing implementation, the static call stub typically emits a sequence like:
`isb; movk; movz; movz; movk; movz; movz; br`.

This patch reimplements it using a more compact and patch-friendly sequence:

    ldr x12, Label_data
    ldr x8, Label_entry
    br x8
    Label_data:
      0x00000000
      0x00000000
    Label_entry:
      0x00000000
      0x00000000

The new approach places the target addresses adjacent to the code and loads them dynamically. This allows us to update the call target by modifying only the data in memory, without changing any instructions. This avoids the need for I-cache flushes or issuing an `isb`[1], which are both relatively expensive operations.

While emitting direct branches in static stubs for small code caches can save 2 instructions compared to the new implementation, modifying those branches still requires I-cache flushes or an `isb`. This patch unifies the code generation by emitting the same static stubs for both small and large code caches.

A microbenchmark (StaticCallStub.java) demonstrates a performance uplift of approximately 43%.

    Benchmark                        (length)  Mode  Cnt    Master     Patch  Units
    StaticCallStubFar.callCompiled       1000  avgt    5    39.346    22.474  us/op
    StaticCallStubFar.callCompiled      10000  avgt    5    390.05   218.478  us/op
    StaticCallStubFar.callCompiled     100000  avgt    5  3869.264  2174.001  us/op
    StaticCallStubNear.callCompiled      1000  avgt    5    39.093    22.582  us/op
    StaticCallStubNear.callCompiled     10000  avgt    5   387.319   217.398  us/op
    StaticCallStubNear.callCompiled    100000  avgt    5  3855.825  2206.923  us/op

All tests in Tier1 to Tier3, under both release and debug builds, have passed.
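
The idea described above (keep the instructions fixed and retarget the call by rewriting only adjacent data words) can be illustrated in portable C++. This is only a rough analogy under stated assumptions, not the HotSpot code: `StubData`, `call_through_stub`, `patch_stub` and the two entry functions are hypothetical names, and the memory orderings are chosen for the sketch rather than taken from the patch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one static call stub. The "code" below never
// changes; only these two data words are rewritten when the target changes.
struct StubData {
    std::atomic<std::intptr_t> metadata{0};              // role of Label_data
    std::atomic<int (*)(std::intptr_t)> entry{nullptr};  // role of Label_entry
};

// Models `ldr x12, Label_data; ldr x8, Label_entry; br x8`:
// load both words, then branch indirectly through the loaded entry.
inline int call_through_stub(const StubData& stub) {
    std::intptr_t md = stub.metadata.load(std::memory_order_acquire);
    int (*target)(std::intptr_t) = stub.entry.load(std::memory_order_acquire);
    return target(md);
}

// Retargeting is a pair of data stores; no instruction bytes are touched.
// Note: the two stores are not atomic as a pair. This sketch ignores that.
inline void patch_stub(StubData& stub, std::intptr_t md,
                       int (*target)(std::intptr_t)) {
    stub.metadata.store(md, std::memory_order_release);
    stub.entry.store(target, std::memory_order_release);
}

// Two hypothetical call targets, distinguishable by their results.
inline int interpreter_entry(std::intptr_t md) { return static_cast<int>(md) + 1; }
inline int compiled_entry(std::intptr_t md) { return static_cast<int>(md) * 2; }
```

Calling `patch_stub(stub, 20, interpreter_entry)` and later `patch_stub(stub, 20, compiled_entry)` changes what `call_through_stub(stub)` does without modifying `call_through_stub` itself, which is the property that lets the real stub skip I-cache maintenance.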
[1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads ------------- Commit messages: - 8363620: AArch64: reimplement emit_static_call_stub() Changes: https://git.openjdk.org/jdk/pull/26638/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26638&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8363620 Stats: 322 lines in 8 files changed: 135 ins; 167 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/26638.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26638/head:pull/26638 PR: https://git.openjdk.org/jdk/pull/26638 From ayang at openjdk.org Tue Aug 5 10:42:08 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 5 Aug 2025 10:42:08 GMT Subject: RFR: 8364254: Serial: Remove soft ref policy update in WhiteBox FullGC [v2] In-Reply-To: References: Message-ID: On Mon, 4 Aug 2025 13:49:14 GMT, Albert Mingkun Yang wrote: >> Simple removing Serial specific logic in `WB_FullGC`, because Serial full-gc handles white-box full-gc already. >> >> Test: tier1-3 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into sgc-whitebox > - sgc-whitebox Thanks for review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26528#issuecomment-3154564288 From ayang at openjdk.org Tue Aug 5 10:46:08 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 5 Aug 2025 10:46:08 GMT Subject: Integrated: 8364254: Serial: Remove soft ref policy update in WhiteBox FullGC In-Reply-To: References: Message-ID: <0TwYYkT3Exg_w78Ku8Dv5ugVgDl22eGDsx1Y2zVoccA=.273195b8-cb8c-434a-87ac-d1d20623a8c0@github.com> On Tue, 29 Jul 2025 09:37:26 GMT, Albert Mingkun Yang wrote: > Simple removing Serial specific logic in `WB_FullGC`, because Serial full-gc handles white-box full-gc already. > > Test: tier1-3 This pull request has now been integrated. Changeset: ba0ae4cb Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/ba0ae4cb28aa520d5244077349e35ef1bb475b61 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod 8364254: Serial: Remove soft ref policy update in WhiteBox FullGC Reviewed-by: tschatzl, sangheki ------------- PR: https://git.openjdk.org/jdk/pull/26528 From fgao at openjdk.org Tue Aug 5 11:48:08 2025 From: fgao at openjdk.org (Fei Gao) Date: Tue, 5 Aug 2025 11:48:08 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() In-Reply-To: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Tue, 5 Aug 2025 10:30:13 GMT, Fei Gao wrote: > In the existing implementation, the static call stub typically emits a sequence like: > `isb; movk; movz; movz; movk; movz; movz; br`. > > This patch reimplements it using a more compact and patch-friendly sequence: > > ldr x12, Label_data > ldr x8, Label_entry > br x8 > Label_data: > 0x00000000 > 0x00000000 > Label_entry: > 0x00000000 > 0x00000000 > > The new approach places the target addresses adjacent to the code and loads them dynamically. 
This allows us to update the call target by modifying only the data in memory, without changing any instructions. This avoids the need for I-cache flushes or issuing an `isb`[1], which are both relatively expensive operations.
>
> While emitting direct branches in static stubs for small code caches can save 2 instructions compared to the new implementation, modifying those branches still requires I-cache flushes or an `isb`. This patch unifies the code generation by emitting the same static stubs for both small and large code caches.
>
> A microbenchmark (StaticCallStub.java) demonstrates a performance uplift of approximately 43%.
>
>
> Benchmark                        (length)  Mode  Cnt    Master     Patch  Units
> StaticCallStubFar.callCompiled       1000  avgt    5    39.346    22.474  us/op
> StaticCallStubFar.callCompiled      10000  avgt    5    390.05   218.478  us/op
> StaticCallStubFar.callCompiled     100000  avgt    5  3869.264  2174.001  us/op
> StaticCallStubNear.callCompiled      1000  avgt    5    39.093    22.582  us/op
> StaticCallStubNear.callCompiled     10000  avgt    5   387.319   217.398  us/op
> StaticCallStubNear.callCompiled    100000  avgt    5  3855.825  2206.923  us/op
>
>
> All tests in Tier1 to Tier3, under both release and debug builds, have passed.
>
> [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 989:

> 987: ldr(rscratch1, far_jump_entry);
> 988: br(rscratch1);
> 989: bind(far_jump_metadata);

I'm considering whether the data here should be 8-byte aligned, similar to what we did for the trampoline stubs
https://github.com/openjdk/jdk/blob/743c821289a6562972364b5dcce8dd29a786264a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L950

I tried with a small case:

    @CompilerControl(CompilerControl.Mode.EXCLUDE)
    public static void callInterpreted0(int i) {
        val0 = i;
    }

    @CompilerControl(CompilerControl.Mode.EXCLUDE)
    public static void callInterpreted1(int i) {
        val1 = i;
    }

    @CompilerControl(CompilerControl.Mode.EXCLUDE)
    public static void callInterpreted2(int i) {
        val2 = i;
    }

    @Benchmark
    public void callCompiled() {
        for (int i = 0; i < length; i++) {
            callInterpreted0(i); // Make sure this is excluded from compilation
            callInterpreted1(i);
            callInterpreted2(i);
        }
    }

where the static stubs are laid out as follows:

    0x0000eee528201dd0: ldr x8, 0x0000eee528201dd8 ; {trampoline_stub}
    0x0000eee528201dd4: br x8
    0x0000eee528201dd8: .inst 0x2f69fe40 ; undefined
    0x0000eee528201ddc: .inst 0x0000eee5 ; undefined
    0x0000eee528201de0: ldr x8, 0x0000eee528201de8 ; {trampoline_stub}
    0x0000eee528201de4: br x8
    0x0000eee528201de8: .inst 0x2f69fe40 ; undefined
    0x0000eee528201dec: .inst 0x0000eee5 ; undefined
    0x0000eee528201df0: ldr x8, 0x0000eee528201df8 ; {trampoline_stub}
    0x0000eee528201df4: br x8
    0x0000eee528201df8: .inst 0x2f69fe40 ; undefined
    0x0000eee528201dfc: .inst 0x0000eee5 ; undefined
    0x0000eee528201e00: ldr x12, 0x0000eee528201e0c ; {static_stub}
    0x0000eee528201e04: ldr x8, 0x0000eee528201e14
    0x0000eee528201e08: br x8
    0x0000eee528201e0c: .inst 0x00000000 ; undefined
    0x0000eee528201e10: .inst 0x00000000 ; undefined
    0x0000eee528201e14: .inst 0x00000000 ; undefined
    0x0000eee528201e18: .inst 0x00000000 ; undefined
    0x0000eee528201e1c: ldr x12, 0x0000eee528201e28 ; {static_stub}
    0x0000eee528201e20: ldr x8, 0x0000eee528201e30
    0x0000eee528201e24: br x8
    0x0000eee528201e28: .inst 0x00000000 ; undefined
    0x0000eee528201e2c: .inst 0x00000000 ; undefined
    0x0000eee528201e30: .inst 0x00000000 ; undefined
    0x0000eee528201e34: .inst 0x00000000 ; undefined
    0x0000eee528201e38: ldr x12, 0x0000eee528201e44 ; {static_stub}
    0x0000eee528201e3c: ldr x8, 0x0000eee528201e4c
    0x0000eee528201e40: br x8
    0x0000eee528201e44: .inst 0x00000000 ; undefined
    0x0000eee528201e48: .inst 0x00000000 ; undefined
    0x0000eee528201e4c: .inst 0x00000000 ; undefined
    0x0000eee528201e50: .inst 0x00000000 ; undefined

Here are the performance results:

    Benchmark                                       (length)  Mode  Cnt     Master   Aligned  Unaligned  Units
    StaticCallStub.StaticCallStubFar.callCompiled       1000  avgt    5    114.794    63.117     64.346  us/op
    StaticCallStub.StaticCallStubFar.callCompiled      10000  avgt    5   1136.016   618.576    619.629  us/op
    StaticCallStub.StaticCallStubFar.callCompiled     100000  avgt    5  11323.945  6191.452   6277.813  us/op
    StaticCallStub.StaticCallStubNear.callCompiled      1000  avgt    5    114.335    63.142     64.091  us/op
    StaticCallStub.StaticCallStubNear.callCompiled     10000  avgt    5   1140.667   618.653    619.861  us/op
    StaticCallStub.StaticCallStubNear.callCompiled    100000  avgt    5  11351.394  6194.946   6195.255  us/op

We have several aspects to consider:
- 8-byte alignment brings a minor performance gain, but it's not significant compared to the overall improvement achieved by reimplementing the static stubs.
- Unaligned memory access may be non-atomic, although in this case, other threads aren't modifying the data.
- The alignment requirement for trampoline stubs doesn't always introduce extra NOPs; padding is only added when trampoline and static stubs are interleaved. In contrast, enforcing 8-byte alignment for static stubs almost always introduces padding, since the first three instructions are not naturally aligned.
- We should carefully balance the trade-off between code size increase and code hotness (i.e., how frequently the stub code is executed).

Any feedback or suggestions would be greatly appreciated.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26638#discussion_r2254100081

From adinn at openjdk.org Tue Aug 5 13:28:18 2025
From: adinn at openjdk.org (Andrew Dinn)
Date: Tue, 5 Aug 2025 13:28:18 GMT
Subject: RFR: 8364558: Test compiler/startup/StartupOutput.java failed with native OOM: CodeCache: no room for StubRoutines
Message-ID: 

This PR avoids bringing down the VM when a code cache allocation for the stubgen compiler blob is requested by the compiler thread and fails because the code cache is full. Instead the VM is allowed to carry on with the compiler disabled.

-------------

Commit messages:
 - Handle failed compiler blob allocate gracefully

Changes: https://git.openjdk.org/jdk/pull/26642/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26642&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8364558
  Stats: 21 lines in 2 files changed: 17 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/26642.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26642/head:pull/26642

PR: https://git.openjdk.org/jdk/pull/26642

From asmehra at openjdk.org Tue Aug 5 13:34:03 2025
From: asmehra at openjdk.org (Ashutosh Mehra)
Date: Tue, 5 Aug 2025 13:34:03 GMT
Subject: RFR: 8364128: Improve gathering of cpu feature names using stringStream [v4]
In-Reply-To: 
References: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com>
Message-ID: 

On Tue, 29 Jul 2025 07:19:31 GMT, Johan Sjölen wrote:

>> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision:
>>
>> More compile failure fix
>>
>> Signed-off-by: Ashutosh Mehra
>
> It seems like we can just use a `stringStream` instead and not have to deal with any overflow of some buffer, since we strdup at the end.
@jdksjolen ping

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26515#issuecomment-3155229520

From duke at openjdk.org Tue Aug 5 13:46:06 2025
From: duke at openjdk.org (Francesco Andreuzzi)
Date: Tue, 5 Aug 2025 13:46:06 GMT
Subject: RFR: 8364558: Test compiler/startup/StartupOutput.java failed with native OOM: CodeCache: no room for StubRoutines
In-Reply-To: 
References: 
Message-ID: 

On Tue, 5 Aug 2025 13:21:01 GMT, Andrew Dinn wrote:

> This PR avoids bringing down the VM when a code cache allocation for the stubgen compiler blob is requested by the compiler thread and fails because the code cache is full. Instead the VM is allowed to carry on with the compiler disabled.

src/hotspot/share/opto/c2compiler.cpp line 95:

> 93: // If there was an error generating the blob then UseCompiler will
> 94: // have been unset and we need to skip the remaining initialization
> 95: if (UseCompiler) {

Inverting the if condition here would result in slightly more readable code: you could early return `false`, avoid the additional indentation, and the comment at L93-94 would refer to the true branch of the `if` statement.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26642#discussion_r2254396057

From jsjolen at openjdk.org Tue Aug 5 14:11:10 2025
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Tue, 5 Aug 2025 14:11:10 GMT
Subject: RFR: 8364128: Improve gathering of cpu feature names using stringStream [v4]
In-Reply-To: 
References: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com>
Message-ID: 

On Wed, 30 Jul 2025 15:17:57 GMT, Ashutosh Mehra wrote:

>> This PR implements the code for cpu features names string using ~FormatBuffer~ stringStream. ~It also improves and extends FormatBuffer with an additional method that appends comma separated strings to the buffer.~ It also adds a method to stringStream that appends strings separated by a delimiter to the buffer.
This is useful in creating the cpu features names string. >> This code will also be useful in Leyden to implement cpu feature check for AOTCodeCache [0]. >> Platforms affected: x86-64 and aarch64 >> Other platforms can be done if and when Leyden changes are ported to them. >> >> [0] https://github.com/openjdk/leyden/pull/84 > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > More compile failure fix > > Signed-off-by: Ashutosh Mehra This looks good to me! Added a few comments on the `join` method. src/hotspot/share/utilities/ostream.hpp line 170: > 168: > 169: // Append strings returned by gen, separating each with separator. > 170: // Stops when gen returns null or when buffer is out of space. We never really stop when the buffer runs out of space. src/hotspot/share/utilities/ostream.hpp line 172: > 170: // Stops when gen returns null or when buffer is out of space. > 171: template > 172: void join(Generator gen, const char* separator) { We can move this to the `outputStream` class instead! src/hotspot/share/utilities/ostream.hpp line 181: > 179: str = gen(); > 180: } > 181: return; Let's delete this return ------------- Marked as reviewed by jsjolen (Reviewer). 
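
A minimal sketch of what a generator-driven `join` could look like with the review comments above applied: no trailing `return`, a doc comment that does not mention running out of buffer space, and the method written against a general stream type. It is an illustration only. It uses `std::ostream` in place of HotSpot's `outputStream`, and `feature_string` is a hypothetical helper that is not part of the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Appends the strings returned by gen to os, separating each with separator.
// Stops when gen returns nullptr.
template <typename Generator>
void join(std::ostream& os, Generator gen, const char* separator) {
    bool first = true;
    for (const char* str = gen(); str != nullptr; str = gen()) {
        if (!first) {
            os << separator;
        }
        os << str;
        first = false;
    }
}

// Hypothetical usage: build a comma separated feature list from a name table.
inline std::string feature_string(const std::vector<const char*>& names) {
    std::ostringstream ss;
    std::size_t i = 0;
    // The generator hands out one name per call and nullptr when exhausted.
    join(ss, [&]() -> const char* {
        return i < names.size() ? names[i++] : nullptr;
    }, ", ");
    return ss.str();
}
```

Because the stream grows as needed, the caller does not have to size a buffer up front; that is the property the earlier `stringStream` suggestion relies on.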
PR Review: https://git.openjdk.org/jdk/pull/26515#pullrequestreview-3088470522
PR Review Comment: https://git.openjdk.org/jdk/pull/26515#discussion_r2254477968
PR Review Comment: https://git.openjdk.org/jdk/pull/26515#discussion_r2254478897
PR Review Comment: https://git.openjdk.org/jdk/pull/26515#discussion_r2254473839

From adinn at openjdk.org Tue Aug 5 14:13:10 2025
From: adinn at openjdk.org (Andrew Dinn)
Date: Tue, 5 Aug 2025 14:13:10 GMT
Subject: RFR: 8364558: Test compiler/startup/StartupOutput.java failed with native OOM: CodeCache: no room for StubRoutines
In-Reply-To: 
References: 
Message-ID: 

On Tue, 5 Aug 2025 13:21:01 GMT, Andrew Dinn wrote:

> This PR avoids bringing down the VM when a code cache allocation for the stubgen compiler blob is requested by the compiler thread and fails because the code cache is full. Instead the VM is allowed to carry on with the compiler disabled.

@vnkozlov This is still intermittent as it depends on a race. However, I ran this repeatedly with the initial and reserved cache sizes that caused the problem and with stubs logging enabled and observed execution to continue in cases where the race would previously have caused a crash.

Here is the output from such a run. Look for the new stubs log warning message in the following output:

    [adinn at clarapandy JDK-8364558]$ build/linux-aarch64-server-release/images/jdk/bin/java -XX:InitialCodeCacheSize=873K -XX:ReservedCodeCacheSize=995k -Xlog:stubs -version
    [0.001s][info][stubs] StubRoutines (preuniverse stubs) not generated
    [0.003s][info][stubs] StubRoutines (initial stubs) [0x0000ffff5bfb4080, 0x0000ffff5bfb6a10] used: 5388, free: 5252
    [0.003s][info][stubs] StubRoutines (continuation stubs) [0x0000ffff5bfb7000, 0x0000ffff5bfb7a50] used: 600, free: 2040
    [0.004s][info][stubs] StubRoutines (final stubs) [0x0000ffff5bff58c0, 0x0000ffff5c00f568] used: 11568, free: 94072
    [0.007s][warning][codecache] CodeCache is full. Compiler has been disabled.
    [0.007s][warning][codecache] Try increasing the code cache size using -XX:ReservedCodeCacheSize=
    OpenJDK 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled.
    OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=
    OpenJDK 64-Bit Server VM warning: C1 initialization failed. Shutting down all compilers
    CodeCache: size=1008Kb used=444Kb max_used=1008Kb free=563Kb
     bounds [0x0000ffff5bfb4000, 0x0000ffff5c0b0000, 0x0000ffff5c0b0000]
     total_blobs=215, nmethods=0, adapters=177, full_count=2
    Compilation: disabled (not enough contiguous free space left), stopped_count=1, restarted_count=0
    [0.007s][warning][stubs ] Ignoring failed allocation of blob Stub Generator compiler_blob under compiler thread
    OpenJDK 64-Bit Server VM warning: C2 initialization failed. Shutting down all compilers
    openjdk version "26-internal" 2026-03-17
    OpenJDK Runtime Environment (build 26-internal-adhoc.adinn.JDK-8364558)
    OpenJDK 64-Bit Server VM (build 26-internal-adhoc.adinn.JDK-8364558, mixed mode, sharing)

That ought to have fixed the problem with test StartupOutput. However, after running more times I spotted this (very much less frequent) error exit:

    [adinn at clarapandy JDK-8364558]$ build/linux-aarch64-server-release/images/jdk/bin/java -XX:InitialCodeCacheSize=873K -XX:ReservedCodeCacheSize=995k -Xlog:stubs -version
    [0.001s][info][stubs] StubRoutines (preuniverse stubs) not generated
    [0.003s][info][stubs] StubRoutines (initial stubs) [0x0000ffff6a514080, 0x0000ffff6a516a10] used: 5388, free: 5252
    [0.003s][info][stubs] StubRoutines (continuation stubs) [0x0000ffff6a517000, 0x0000ffff6a517a50] used: 600, free: 2040
    [0.004s][info][stubs] StubRoutines (final stubs) [0x0000ffff6a5558c0, 0x0000ffff6a56f568] used: 11568, free: 94072
    [0.006s][warning][codecache] CodeCache is full. Compiler has been disabled.
    [0.006s][warning][codecache] Try increasing the code cache size using -XX:ReservedCodeCacheSize=
    OpenJDK 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled.
    OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=
    CodeCache: size=1008Kb used=1008Kb max_used=1008Kb free=0Kb
     bounds [0x0000ffff6a514000, 0x0000ffff6a610000, 0x0000ffff6a610000]
     total_blobs=216, nmethods=0, adapters=177, full_count=3
    Compilation: disabled (not enough contiguous free space left), stopped_count=1, restarted_count=0
    OpenJDK 64-Bit Server VM warning: C1 initialization failed. Shutting down all compilers
    [0.006s][warning][stubs ] Ignoring failed allocation of blob Stub Generator compiler_blob under compiler thread
    OpenJDK 64-Bit Server VM warning: C2 initialization failed. Shutting down all compilers
    Error occurred during initialization of boot layer
    java.lang.OutOfMemoryError: Out of space in CodeCache for adapters

So, in this case the warning shows that the CodeCache out-of-space problem has been finessed under `C2Compiler::initialize` but we still suffer a VM crash because of a failure to allocate an adapter.

I am not sure how we can handle this cleanly. This will be happening when `Method::make_adapters` receives nullptr as the return value from `AdapterHandlerLibrary::get_adapter()`. `make_adapters` is currently coded to call `vm_exit_during_initialization` if `is_init_completed()` returns `false` and throw an `OutOfMemoryError` if `is_init_completed()` returns `true`.

I think the problem here will be that in the rare occurrence where a C1 compilation is in flight at the point where the C2 failure disables compilation `is_init_completed()` will return `true` and we will be in the middle of linking the method whose adapter is being created. So, even if we wanted to carry on here with compilation disabled it is hard to know how to back out of this gracefully.

Any ideas how to proceed?
Shall I still continue with the current patch? Also, do we need a better title for the JIRA and PR?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26642#issuecomment-3155395403

From rriggs at openjdk.org Tue Aug 5 14:22:05 2025
From: rriggs at openjdk.org (Roger Riggs)
Date: Tue, 5 Aug 2025 14:22:05 GMT
Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs
In-Reply-To: 
References: 
Message-ID: <7f8HeMWpFC1Y1JWDmdi5lMU0kgT9z-yQFLDOXuSUucU=.40edcd18-9311-41a1-b007-81bf4a11bd7d@github.com>

On Fri, 1 Aug 2025 15:49:57 GMT, Chen Liang wrote:

> Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value).

src/hotspot/share/interpreter/bytecodeUtils.cpp line 1514:

> 1512: return true;
> 1513: }
> 1514:

Extra new line

src/java.base/share/classes/java/lang/NullPointerException.java line 75:

> 73:
> 74: /// Creates an NPE with a custom backtrace configuration.
> 75: /// The exception has no message if detailed NPE is not enabled.

Don't mix markdown comments with regular javadoc comments; it just looks inconsistent without adding value.
Use regular // comments.

src/java.base/share/classes/java/lang/NullPointerException.java line 142:

> 140: }
> 141:
> 142: // Methods below must be called in object monitor

This kind of warning should be on every method declaration; if separated from the method doc, it can be overlooked by future maintainers.
src/java.base/share/classes/java/lang/NullPointerException.java line 146: > 144: private void ensureMessageComputed() { > 145: if ((extendedMessageState & (MESSAGE_COMPUTED | CONSTRUCTOR_FINISHED)) == CONSTRUCTOR_FINISHED) { > 146: int stackOffset = (extendedMessageState & STACK_OFFSET_MASK) >> STACK_OFFSET_SHIFT; This would be more readable if private utility methods were added that encapsulated the shift and masking, both for extraction and construction. src/java.base/share/classes/jdk/internal/access/JavaLangAccess.java line 637: > 635: /// Stack offset is the number of non-hidden frames to skip, pointing to the null-checking API. > 636: /// Search slot is the slot where the null-checked value is passed in. > 637: NullPointerException extendedNullPointerException(int stackOffset, int searchSlot); Method should **not** be added to SharedSecrets solely for access by tests. Tests can use @modules to gain access. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2251763870 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2254483182 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2254486286 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2254494309 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2254508852 From liach at openjdk.org Tue Aug 5 14:34:50 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 5 Aug 2025 14:34:50 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v2] In-Reply-To: References: Message-ID: > Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value). 
Chen Liang has updated the pull request incrementally with one additional commit since the last revision: Web review Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26600/files - new: https://git.openjdk.org/jdk/pull/26600/files/5735b802..09233e9a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26600&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26600&range=00-01 Stats: 15 lines in 4 files changed: 2 ins; 3 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/26600.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26600/head:pull/26600 PR: https://git.openjdk.org/jdk/pull/26600 From liach at openjdk.org Tue Aug 5 14:34:50 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 5 Aug 2025 14:34:50 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v2] In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 08:16:21 GMT, David Holmes wrote: >> Chen Liang has updated the pull request incrementally with one additional commit since the last revision: >> >> Web review >> >> Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > > src/hotspot/share/interpreter/bytecodeUtils.cpp line 354: > >> 352: char *var = cp->symbol_at(elem.name_cp_index)->as_C_string(); >> 353: os->print("%s", var); >> 354: > > No need for blank line Suggestion: ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2254528382 From liach at openjdk.org Tue Aug 5 14:34:51 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 5 Aug 2025 14:34:51 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v2] In-Reply-To: <7f8HeMWpFC1Y1JWDmdi5lMU0kgT9z-yQFLDOXuSUucU=.40edcd18-9311-41a1-b007-81bf4a11bd7d@github.com> References: <7f8HeMWpFC1Y1JWDmdi5lMU0kgT9z-yQFLDOXuSUucU=.40edcd18-9311-41a1-b007-81bf4a11bd7d@github.com> Message-ID: On 
Mon, 4 Aug 2025 14:58:07 GMT, Roger Riggs wrote: >> Chen Liang has updated the pull request incrementally with one additional commit since the last revision: >> >> Web review >> >> Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > > src/hotspot/share/interpreter/bytecodeUtils.cpp line 1514: > >> 1512: return true; >> 1513: } >> 1514: > > Extra new line Suggestion: > src/java.base/share/classes/java/lang/NullPointerException.java line 75: > >> 73: >> 74: /// Creates an NPE with a custom backtrace configuration. >> 75: /// The exception has no message if detailed NPE is not enabled. > > Don't mix markdown comments with regular javadoc comments, it just looks inconsistent without adding value. > Use regular // comments. Suggestion: // Creates an NPE with a custom backtrace configuration. // The exception has no message if detailed NPE is not enabled. > src/java.base/share/classes/jdk/internal/access/JavaLangAccess.java line 637: > >> 635: /// Stack offset is the number of non-hidden frames to skip, pointing to the null-checking API. >> 636: /// Search slot is the slot where the null-checked value is passed in. >> 637: NullPointerException extendedNullPointerException(int stackOffset, int searchSlot); > > Method should **not** be added to SharedSecrets solely for access by tests. > Tests can use @modules to gain access. This is intended to be an API directly called by future null-check APIs, like `Objects.requireNonNull` or `Checks.nullCheck`. I initially had code for rNN but decided to withhold that for a future patch since it involves CSR and other evaluations not related to this patch's efforts. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2254529734 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2254532646 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2254537044 From liach at openjdk.org Tue Aug 5 14:34:51 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 5 Aug 2025 14:34:51 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v2] In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 10:07:38 GMT, Johan Sj?len wrote: >> src/hotspot/share/interpreter/bytecodeUtils.hpp line 40: >> >>> 38: // Slot can be nonnegative to indicate an explicit search for the source of null >>> 39: // If slot is negative (default), also search for the action that caused the NPE before >>> 40: // deriving the actual slot and source of null by code parsing >> >> Suggestion: >> >> // Slot can be nonnegative to indicate an explicit search for the source of null. >> // If slot is negative (default), also search for the action that caused the NPE before >> // deriving the actual slot and source of null by code parsing. >> >> Periods at the end of sentences in comments. > > I'd like to see the description for `slot` pushed into an enum, and make any negative number except `-1` explicitly forbidden. > > ```c++ > enum class NPESlot : int { > // docs here > None = -1, > // docs here > Explicit // Anything >= 0 > }; > bool get_NPE_message_at(outputStream* ss, Method* method, int bci, NPESlot slot) { > if (slot != NPESlot::None) { > // Explicit search > } > } I don't think I know how to use C++ enums. I tried to define this enum in bytecodeUtils.hpp and use it from jvm.cpp, and don't know how in any way I can ever express `BytecodeUtils::NullSlot slot = search_slot >= 0 ? search_slot : ???;` because `NullSlot::Search` or `BytecodeUtils::NullSlot::Search` or `BytecodeUtils::NullSlot.Search` are all erroneous. 
I don't think this is worth the hassle, since the attempt to use a C++ enum only creates more pain, unless you tell me how to use it properly. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2254525459 From rriggs at openjdk.org Tue Aug 5 14:56:03 2025 From: rriggs at openjdk.org (Roger Riggs) Date: Tue, 5 Aug 2025 14:56:03 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v2] In-Reply-To: References: <7f8HeMWpFC1Y1JWDmdi5lMU0kgT9z-yQFLDOXuSUucU=.40edcd18-9311-41a1-b007-81bf4a11bd7d@github.com> Message-ID: On Tue, 5 Aug 2025 14:28:37 GMT, Chen Liang wrote: >> src/java.base/share/classes/jdk/internal/access/JavaLangAccess.java line 637: >> >>> 635: /// Stack offset is the number of non-hidden frames to skip, pointing to the null-checking API. >>> 636: /// Search slot is the slot where the null-checked value is passed in. >>> 637: NullPointerException extendedNullPointerException(int stackOffset, int searchSlot); >> >> Method should **not** be added to SharedSecrets solely for access by tests. >> Tests can use @modules to gain access. > > This is intended to be an API directly called by future null-check APIs, like `Objects.requireNonNull` or `Checks.nullCheck`. I initially had code for rNN but decided to withhold that for a future patch since it involves CSR and other evaluations not related to this patch's efforts. Don't add APIs without uses; it is speculative and the API may not be what is really needed.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2254604180 From liach at openjdk.org Tue Aug 5 15:07:13 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 5 Aug 2025 15:07:13 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v3] In-Reply-To: References: Message-ID: > Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value). Chen Liang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 14 additional commits since the last revision: - Use c++ enum classes per jdksjolen - Merge branch 'exp/requireNonNull-message-hacks' of github.com:liachmodded/jdk into exp/requireNonNull-message-hacks - Web review Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> - Merge branch 'master' of https://github.com/openjdk/jdk into exp/requireNonNull-message-hacks - Years - Roll back Objects.rNN for now - Review remarks - Merge branch 'master' of https://github.com/openjdk/jdk into exp/requireNonNull-message-hacks - Bork - Try generify the NPE tracing API - ... 
and 4 more: https://git.openjdk.org/jdk/compare/dc4f5978...9af7ee85 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26600/files - new: https://git.openjdk.org/jdk/pull/26600/files/09233e9a..9af7ee85 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26600&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26600&range=01-02 Stats: 5643 lines in 151 files changed: 3341 ins; 1944 del; 358 mod Patch: https://git.openjdk.org/jdk/pull/26600.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26600/head:pull/26600 PR: https://git.openjdk.org/jdk/pull/26600 From liach at openjdk.org Tue Aug 5 15:07:14 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 5 Aug 2025 15:07:14 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v3] In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 14:24:37 GMT, Chen Liang wrote: >> I'd like to see the description for `slot` pushed into an enum, and make any negative number except `-1` explicitly forbidden. >> >> ```c++ >> enum class NPESlot : int { >> // docs here >> None = -1, >> // docs here >> Explicit // Anything >= 0 >> }; >> bool get_NPE_message_at(outputStream* ss, Method* method, int bci, NPESlot slot) { >> if (slot != NPESlot::None) { >> // Explicit search >> } >> } > > I don't think I know how to use C++ enums. I tried to define this enum in bytecodeUtils.hpp and use it from jvm.cpp, and don't know how in any way I can ever express `BytecodeUtils::NullSlot slot = search_slot >= 0 ? search_slot : ???;` because `NullSlot::Search` or `BytecodeUtils::NullSlot::Search` or `BytecodeUtils::NullSlot.Search` are all erroneous. I don't think this may worth the hassle since the attempt to use C++ enum only creates more pain, unless you tell me how to use it properly. I think I figured out how to use static_cast with C++ enum classes. 
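For readers following the enum discussion, here is a minimal standalone sketch of turning a plain `int` into a scoped enum with `static_cast`. The names `NullSlot`, `Search`, `to_null_slot` and `describe` are illustrative assumptions and may not match what the patch ended up with. Because the enum has a fixed underlying type (`int`), every `int` value is a valid value of the enum, which is what makes the cast of non-negative slot numbers well defined.

```cpp
#include <cassert>
#include <string>

// Hypothetical scoped enum: -1 means "search for the failing action first",
// any value >= 0 is treated as an explicit slot number.
enum class NullSlot : int {
  Search = -1
};

// Convert a raw int (for example one passed down from Java code) to the enum.
// Negative inputs are normalized to Search; non-negative values are kept as-is.
inline NullSlot to_null_slot(int raw) {
  return raw >= 0 ? static_cast<NullSlot>(raw) : NullSlot::Search;
}

// Turn the enum back into something printable; the reverse direction also
// needs an explicit static_cast because scoped enums do not convert implicitly.
inline std::string describe(NullSlot slot) {
  if (slot == NullSlot::Search) {
    return "search";
  }
  return "explicit slot " + std::to_string(static_cast<int>(slot));
}
```

With this shape, the expression from the earlier question could be written as `NullSlot slot = to_null_slot(search_slot);`, assuming the enumerator is referred to as `NullSlot::Search` (or with the enclosing class name in front if the enum is nested).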
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2254630773 From liach at openjdk.org Tue Aug 5 15:07:15 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 5 Aug 2025 15:07:15 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v3] In-Reply-To: References: <7f8HeMWpFC1Y1JWDmdi5lMU0kgT9z-yQFLDOXuSUucU=.40edcd18-9311-41a1-b007-81bf4a11bd7d@github.com> Message-ID: On Tue, 5 Aug 2025 14:53:03 GMT, Roger Riggs wrote: >> This is intended to be an API directly called by future null-check APIs, like `Objects.requireNonNull` or `Checks.nullCheck`. I initially had code for rNN but decided to withhold that for a future patch since it involves CSR and other evaluations not related to this patch's efforts. > > Don't add APIs without uses; it speculative and the API may not be what is really needed. Should I open a dependent PR to prove that this is exactly what we need to unblock [Objects::requireNonNull message enhancements](https://bugs.openjdk.org/browse/JDK-8233268)? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2254627867 From rriggs at openjdk.org Tue Aug 5 15:17:12 2025 From: rriggs at openjdk.org (Roger Riggs) Date: Tue, 5 Aug 2025 15:17:12 GMT Subject: RFR: 8364187: Make getClassAccessFlagsRaw non-native [v5] In-Reply-To: References: <84FVULnDO38-jO9EcshCFwsOx9PRZ920y8KwTt5z0xU=.6b4b8adb-b046-481f-b121-dd1ffa0d7a78@github.com> Message-ID: On Mon, 4 Aug 2025 15:00:52 GMT, Doug Simon wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Restore c2 optimization. > > src/hotspot/share/prims/jvm.cpp line 1744: > >> 1742: JVM_END >> 1743: >> 1744: JVM_ENTRY(jint, JVM_GetClassAccessFlags(JNIEnv *env, jclass cls)) > > What about the declaration in `jvm.h`? 
https://github.com/openjdk/jdk/blob/6c52b73465b0d0daeafc54c3c6cec3062bf490c5/src/hotspot/share/include/jvm.h#L610 Created issue https://bugs.openjdk.org/browse/JDK-8364750 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26517#discussion_r2254660805 From cstein at openjdk.org Tue Aug 5 16:03:07 2025 From: cstein at openjdk.org (Christian Stein) Date: Tue, 5 Aug 2025 16:03:07 GMT Subject: RFR: 8361950: Update to use jtreg 8 In-Reply-To: References: Message-ID: <1XRUup7RXkUWm_2yscYV2M-m4ooHr70QdwcVVB7zxDI=.d641c6f9-9dd9-4da7-8599-a164274b9d93@github.com> On Fri, 11 Jul 2025 09:05:40 GMT, Christian Stein wrote: > Please review the change to update to using jtreg 8. > > The primary change is to the `jib-profiles.js` file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the requiredVersion has been updated in the various `TEST.ROOT` files. Both tests seem to make hard assumptions on where binary files of `@library` classes are stored. - [TestSPISigned.java#L60-L62](https://github.com/openjdk/jdk/blob/master/test/jdk/java/security/SignedJar/spi-calendar-provider/TestSPISigned.java#L60-L62) - [resexhausted003.java#L83-L96](https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted003.java#L83-L96) The storage location changed due to https://bugs.openjdk.org/browse/CODETOOLS-7902847 - it might change again in the near future. Thus, both tests need to be updated to work correctly without relying on a specific location of compiled library classes. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26261#issuecomment-3155714592 From liach at openjdk.org Tue Aug 5 16:04:08 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 5 Aug 2025 16:04:08 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v4] In-Reply-To: References: Message-ID: > Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value). Chen Liang has updated the pull request incrementally with one additional commit since the last revision: Update NPE per roger review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26600/files - new: https://git.openjdk.org/jdk/pull/26600/files/9af7ee85..4ba1f17c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26600&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26600&range=02-03 Stats: 33 lines in 1 file changed: 12 ins; 4 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/26600.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26600/head:pull/26600 PR: https://git.openjdk.org/jdk/pull/26600 From asmehra at openjdk.org Tue Aug 5 16:48:06 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Tue, 5 Aug 2025 16:48:06 GMT Subject: RFR: 8364128: Improve gathering of cpu feature names using stringStream [v4] In-Reply-To: References: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com> Message-ID: On Tue, 5 Aug 2025 14:08:32 GMT, Johan Sj?len wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> More compile failure fix >> >> Signed-off-by: Ashutosh Mehra > > src/hotspot/share/utilities/ostream.hpp line 172: > >> 170: // Stops when gen returns null or when buffer is out of space. 
>> 171: template >> 172: void join(Generator gen, const char* separator) { > > We can move this to the `outputStream` class instead! It is in the `outputStream` class. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26515#discussion_r2254829472 From asmehra at openjdk.org Tue Aug 5 18:08:22 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Tue, 5 Aug 2025 18:08:22 GMT Subject: RFR: 8364128: Improve gathering of cpu feature names using stringStream [v5] In-Reply-To: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com> References: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com> Message-ID: > This PR implements the code for the cpu feature names string using ~FormatBuffer~ stringStream. ~It also improves and extends FormatBuffer with an additional method that appends comma separated strings to the buffer.~ It also adds a method to stringStream that appends strings separated by a delimiter to the buffer. This is useful in creating the cpu feature names string. > This code will also be useful in Leyden to implement the cpu feature check for AOTCodeCache [0]. > Platforms affected: x86-64 and aarch64 > Other platforms can be done if and when Leyden changes are ported to them.
> > [0] https://github.com/openjdk/leyden/pull/84 Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: Address review comments Signed-off-by: Ashutosh Mehra ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26515/files - new: https://git.openjdk.org/jdk/pull/26515/files/21383a7d..cca14916 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26515&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26515&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26515.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26515/head:pull/26515 PR: https://git.openjdk.org/jdk/pull/26515 From asmehra at openjdk.org Tue Aug 5 18:08:23 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Tue, 5 Aug 2025 18:08:23 GMT Subject: RFR: 8364128: Improve gathering of cpu feature names using stringStream [v4] In-Reply-To: References: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com> Message-ID: <6nXR6F1y_HxreFrbgXEacnewfw3ZGrd4B7yl3I36Lis=.95f01947-9df5-4beb-b9e2-31765327bdb9@github.com> On Tue, 5 Aug 2025 14:08:54 GMT, Johan Sj?len wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> More compile failure fix >> >> Signed-off-by: Ashutosh Mehra > > This looks good to me! Added a few comments on the `join` method. @jdksjolen new commit pushed to address your comments. > src/hotspot/share/utilities/ostream.hpp line 170: > >> 168: >> 169: // Append strings returned by gen, separating each with separator. >> 170: // Stops when gen returns null or when buffer is out of space. > > We never really stop when the buffer runs out of space. Updated the comment. 
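As a rough standalone illustration of the `join` idea under review, the sketch below appends generator-produced strings to a `std::string` instead of HotSpot's `stringStream`. The function `join_strings`, the helper `example_features` and the feature names are made up for this example and are not part of the patch.

```cpp
#include <cassert>
#include <string>

// Appends the strings produced by gen to out, placing separator between
// consecutive items. gen is called repeatedly until it returns nullptr.
template <typename Generator>
void join_strings(std::string& out, Generator gen, const char* separator) {
  const char* str = gen();
  bool first = true;
  while (str != nullptr) {
    if (!first) {
      out += separator;
    }
    out += str;
    first = false;
    str = gen();
  }
}

// Example use: join a fixed, hypothetical list of CPU feature names.
inline std::string example_features() {
  static const char* const names[] = {"sse", "avx", "avx2"};
  int i = 0;
  std::string out;
  join_strings(out, [&]() -> const char* {
    return i < 3 ? names[i++] : nullptr;
  }, ", ");
  return out;
}
```

Passing the producer as a callable keeps the joining loop independent of how the names are stored, which is presumably why a generator-style parameter is convenient when the names come from a feature bitmap.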
> src/hotspot/share/utilities/ostream.hpp line 181: > >> 179: str = gen(); >> 180: } >> 181: return; > > Let's delete this return Done ------------- PR Comment: https://git.openjdk.org/jdk/pull/26515#issuecomment-3156103224 PR Review Comment: https://git.openjdk.org/jdk/pull/26515#discussion_r2254997542 PR Review Comment: https://git.openjdk.org/jdk/pull/26515#discussion_r2254997589 From ayang at openjdk.org Tue Aug 5 18:26:18 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 5 Aug 2025 18:26:18 GMT Subject: RFR: 8364767: G1: Remove use of CollectedHeap::_soft_ref_policy Message-ID: Use gc-cause checking in `VM_G1CollectFull` to decide soft-ref policy. Test: tier1-3 ------------- Commit messages: - g1-soft-refs Changes: https://git.openjdk.org/jdk/pull/26648/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26648&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8364767 Stats: 42 lines in 6 files changed: 4 ins; 35 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/26648.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26648/head:pull/26648 PR: https://git.openjdk.org/jdk/pull/26648 From coleenp at openjdk.org Tue Aug 5 19:30:12 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 5 Aug 2025 19:30:12 GMT Subject: RFR: 8364187: Make getClassAccessFlagsRaw non-native [v5] In-Reply-To: References: <84FVULnDO38-jO9EcshCFwsOx9PRZ920y8KwTt5z0xU=.6b4b8adb-b046-481f-b121-dd1ffa0d7a78@github.com> Message-ID: On Tue, 5 Aug 2025 15:14:06 GMT, Roger Riggs wrote: >> src/hotspot/share/prims/jvm.cpp line 1744: >> >>> 1742: JVM_END >>> 1743: >>> 1744: JVM_ENTRY(jint, JVM_GetClassAccessFlags(JNIEnv *env, jclass cls)) >> >> What about the declaration in `jvm.h`? https://github.com/openjdk/jdk/blob/6c52b73465b0d0daeafc54c3c6cec3062bf490c5/src/hotspot/share/include/jvm.h#L610 > > Created issue https://bugs.openjdk.org/browse/JDK-8364750 Sorry I missed the declaration. Will fix it. Thanks for filing the bug Roger. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26517#discussion_r2255153426 From dlong at openjdk.org Tue Aug 5 21:36:05 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 5 Aug 2025 21:36:05 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v2] In-Reply-To: References: <_Ye19u_7PlqlsoRSuR0dNeAGbeuHyN_oqD1ZS4q9Nvk=.b94fd29d-d43e-4561-9926-7f5a46434d8e@github.com> Message-ID: <1YTjfZh0C7UWqTCXWcNS-Q68kBOnhlRJA_-sS_vZsio=.6389ba30-8f3a-48c2-b0e2-40f5877f4a36@github.com> On Tue, 5 Aug 2025 08:52:20 GMT, Manuel Hässig wrote: >> src/hotspot/share/compiler/compilerThread.cpp line 97: >> >>> 95: switch (signo) { >>> 96: case TIMEOUT_SIGNAL: { >>> 97: assert(!Atomic::load_acquire(&_timeout_armed), "compile task timed out"); >> >> Why do we need acquire? Only the current thread is ever going to be looking at this value, right? > > The compiler thread setting and unsetting the flag and the signal handler reading the flag are racing each other as soon as the timer is set, since signals are preemptive. This prevents a few false positive timeouts on architectures with weak memory models, but does not have any effect on x86 for example. The signal is delivered to the same compiler thread, so I don't think acquire and release help here. There shouldn't be a weak memory model issue between a thread and its signal handler on the same thread. Can you explain the sequence of events that could cause a false positive?
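To make the question concrete, here is a small standard C++ sketch of a flag shared between a thread and a signal handler that runs on that same thread. It is not the HotSpot code (which uses its own `Atomic` API); `timeout_armed`, `current_task`, `observed_task`, `on_timeout` and `fire_timeout` are hypothetical names. The sketch assumes the handler only ever runs on the thread that armed the flag, in which case `std::atomic_signal_fence` is the tool standard C++ offers: it restricts compiler reordering between the thread and its handler without issuing hardware memory-barrier instructions.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Hypothetical stand-ins for the state shared between the thread and its handler.
static int current_task = 0;                      // plain data published to the handler
static std::atomic<bool> timeout_armed{false};    // "timer is armed" flag
static std::atomic<int> observed_task{-1};        // what the handler saw, -1 if nothing

// Signal handler; in this sketch it always runs on the thread that raised the signal.
extern "C" void on_timeout(int) {
  if (timeout_armed.load(std::memory_order_relaxed)) {
    // Compiler-only fence: keeps the read of current_task after the flag check.
    std::atomic_signal_fence(std::memory_order_acquire);
    observed_task.store(current_task, std::memory_order_relaxed);
  }
}

// Publishes a task id, arms or disarms the flag, delivers the signal
// synchronously to the current thread, and returns what the handler observed.
inline int fire_timeout(int task, bool armed) {
  std::signal(SIGTERM, on_timeout);
  observed_task.store(-1, std::memory_order_relaxed);
  current_task = task;
  // Compiler-only fence: keeps the write of current_task before the flag store.
  std::atomic_signal_fence(std::memory_order_release);
  timeout_armed.store(armed, std::memory_order_relaxed);
  std::raise(SIGTERM);  // the handler runs before raise returns
  return observed_task.load(std::memory_order_relaxed);
}
```

If the signal could instead be handled on a different thread, this reasoning would no longer apply and real inter-thread ordering (acquire/release on the flag itself) would be needed; which of the two situations holds for the compiler-thread timer is exactly what the review comment is asking.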
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26023#discussion_r2255395595 From dlong at openjdk.org Tue Aug 5 21:43:04 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 5 Aug 2025 21:43:04 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() In-Reply-To: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: <62G2O_C4dHe0yNocFL7vmma7XWoyr9rTGO2p7Y6i_Z0=.15588516-31e5-4d8c-9003-92cd45fe1dd4@github.com> On Tue, 5 Aug 2025 10:30:13 GMT, Fei Gao wrote: > In the existing implementation, the static call stub typically emits a sequence like: > `isb; movk; movz; movz; movk; movz; movz; br`. > > This patch reimplements it using a more compact and patch-friendly sequence: > > ldr x12, Label_data > ldr x8, Label_entry > br x8 > Label_data: > 0x00000000 > 0x00000000 > Label_entry: > 0x00000000 > 0x00000000 > > The new approach places the target addresses adjacent to the code and loads them dynamically. This allows us to update the call target by modifying only the data in memory, without changing any instructions. This avoids the need for I-cache flushes or issuing an `isb`[1], which are both relatively expensive operations. > > While emitting direct branches in static stubs for small code caches can save 2 instructions compared to the new implementation, modifying those branches still requires I-cache flushes or an `isb`. This patch unifies the code generation by emitting the same static stubs for both small and large code caches. > > A microbenchmark (StaticCallStub.java) demonstrates a performance uplift of approximately 43%. 
>
>
> Benchmark                        (length)  Mode  Cnt    Master     Patch  Units
> StaticCallStubFar.callCompiled       1000  avgt    5    39.346    22.474  us/op
> StaticCallStubFar.callCompiled      10000  avgt    5    390.05   218.478  us/op
> StaticCallStubFar.callCompiled     100000  avgt    5  3869.264  2174.001  us/op
> StaticCallStubNear.callCompiled      1000  avgt    5    39.093    22.582  us/op
> StaticCallStubNear.callCompiled     10000  avgt    5   387.319   217.398  us/op
> StaticCallStubNear.callCompiled    100000  avgt    5  3855.825  2206.923  us/op
>
>
> All tests in Tier1 to Tier3, under both release and debug builds, have passed.
>
> [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads

In the past we would have scheduled the load for x8 earlier to avoid a latency hit for using the value in the next instruction:

    ldr x8, Label_entry
    ldr x12, Label_data
    br x8

Is this still helpful on modern aarch64 hardware?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26638#issuecomment-3156732901

From dlong at openjdk.org Tue Aug 5 21:50:04 2025
From: dlong at openjdk.org (Dean Long)
Date: Tue, 5 Aug 2025 21:50:04 GMT
Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub()
In-Reply-To:
References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com>
Message-ID:

On Tue, 5 Aug 2025 11:45:18 GMT, Fei Gao wrote:

> Unaligned memory access may be non-atomic, although in this case, other threads aren't modifying the data.

I thought other threads are modifying the data, which is the reason why we needed the ISB.
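As a loose C++ analogy of the data-patching idea in this PR (the real stub is AArch64 machine code, not C++), the sketch below keeps the calling code fixed and retargets it by writing only a data slot. `CallStub`, `interpreter_entry` and `compiled_entry` are hypothetical names chosen for illustration; the atomic slot is an assumption standing in for the naturally aligned data words next to the stub, which other threads may read while one thread updates them.

```cpp
#include <atomic>
#include <cassert>

using Target = int (*)(int);

// Two hypothetical call targets the stub can point at.
inline int interpreter_entry(int x) { return x + 1; }
inline int compiled_entry(int x) { return x * 2; }

// The "code" (call) never changes; retargeting only writes the data slot.
struct CallStub {
  std::atomic<Target> entry;

  explicit CallStub(Target initial) : entry(initial) {}

  // Load the current target from the data slot and branch to it.
  int call(int arg) const {
    Target t = entry.load(std::memory_order_acquire);
    return t(arg);
  }

  // Patch the stub by updating data only; no instructions are rewritten.
  void patch(Target new_target) {
    entry.store(new_target, std::memory_order_release);
  }
};
```

In this analogy a caller always observes either the old or the new target in full, which is the property the aligned data words are meant to provide for the stub.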
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26638#discussion_r2255414513 From tschatzl at openjdk.org Tue Aug 5 21:59:00 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 5 Aug 2025 21:59:00 GMT Subject: RFR: 8342382: Implementation of JEP G1: Improve Application Throughput with a More Efficient Write-Barrier [v46] In-Reply-To: References: Message-ID: <8Ja1d6zTuDEThq0VR2UQfpY94bqnQql-mqIc-fa8ico=.0909c352-1208-481e-954b-8356f45e07ad@github.com> > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. > * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. 
> > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 63 commits: - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * remove unused G1DetachedRefinementStats_lock - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into pull/23739 - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - ... and 53 more: https://git.openjdk.org/jdk/compare/d906e450...188fc811 ------------- Changes: https://git.openjdk.org/jdk/pull/23739/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=45 Stats: 7114 lines in 112 files changed: 2583 ins; 3587 del; 944 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From amenkov at openjdk.org Tue Aug 5 22:34:46 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 5 Aug 2025 22:34:46 GMT Subject: RFR: 8361103: java_lang_Thread::async_get_stack_trace does not properly protect JavaThread [v7] In-Reply-To: References: Message-ID: <6MCWqdixHmPtZNQd86-_KSz80enDwWQW9AJEJsgnClI=.e27a474a-dc2d-4e7e-8436-789e65af3031@github.com> > The fix updates `java_lang_Thread::async_get_stack_trace()` (used by `java.lang.Thread.getStackTrace()` to get stack trace for platform and mounted virtual threads) to correctly use `ThreadListHandle` for thread protection. 
> > Testing: tier1..5 Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: comment to async_get_stack_trace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26119/files - new: https://git.openjdk.org/jdk/pull/26119/files/30275945..811db312 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26119&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26119&range=05-06 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26119.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26119/head:pull/26119 PR: https://git.openjdk.org/jdk/pull/26119 From amenkov at openjdk.org Tue Aug 5 22:34:47 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 5 Aug 2025 22:34:47 GMT Subject: RFR: 8361103: java_lang_Thread::async_get_stack_trace does not properly protect JavaThread [v6] In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 05:22:33 GMT, David Holmes wrote: >> Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: >> >> fix broken build > > src/hotspot/share/classfile/javaClasses.cpp line 1885: > >> 1883: } >> 1884: >> 1885: oop java_lang_Thread::async_get_stack_trace(jobject jthread, TRAPS) { > > Please add a comment describing how the method works for virtual threads, and what returning null implies. Done. > src/hotspot/share/classfile/javaClasses.cpp line 1939: > >> 1937: // Target thread has been unmounted. >> 1938: return; >> 1939: } > > Shouldn't we set `carrier = true` if we don't return here? Or am I misunderstanding what "carrier" means here? "carrier" means target is a platform thread with mounted virtual thread. When `carrier == true` `vframeStream` skips frames of the mounted virtual thread. 
Here target is a virtual thread, so we need frames starting from the last frame ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26119#discussion_r2255469604 PR Review Comment: https://git.openjdk.org/jdk/pull/26119#discussion_r2255466533 From kvn at openjdk.org Tue Aug 5 23:18:03 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 5 Aug 2025 23:18:03 GMT Subject: RFR: 8364558: Test compiler/startup/StartupOutput.java failed with native OOM: CodeCache: no room for StubRoutines In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 14:08:51 GMT, Andrew Dinn wrote: >> This PR avoids bringing down the VM when a code cache allocation for the stubgen compiler blob is requested by the compiler thread and fails because the code cache is full. Instead the VM is allowed to carry on with the compiler disabled. > > @vnkozlov This is still intermittent as it depends on a race. However, I ran this repeatedly with the initial and reserved cache sizes that caused the problem and with stubs logging enabled and observed execution to continue in cases where the race would previously have caused a crash. Here is the outpout from such a run. Look for the new stubs log warning message in the following output: > > [adinn at clarapandy JDK-8364558]$ build/linux-aarch64-server-release/images/jdk/bin/java -XX:InitialCodeCacheSize=873K -XX:ReservedCodeCacheSize=995k -Xlog:stubs -version > [0.001s][info][stubs] StubRoutines (preuniverse stubs) not generated > [0.003s][info][stubs] StubRoutines (initial stubs) [0x0000ffff5bfb4080, 0x0000ffff5bfb6a10] used: 5388, free: 5252 > [0.003s][info][stubs] StubRoutines (continuation stubs) [0x0000ffff5bfb7000, 0x0000ffff5bfb7a50] used: 600, free: 2040 > [0.004s][info][stubs] StubRoutines (final stubs) [0x0000ffff5bff58c0, 0x0000ffff5c00f568] used: 11568, free: 94072 > [0.007s][warning][codecache] CodeCache is full. Compiler has been disabled. 
> [0.007s][warning][codecache] Try increasing the code cache size using -XX:ReservedCodeCacheSize= > OpenJDK 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled. > OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize= > OpenJDK 64-Bit Server VM warning: C1 initialization failed. Shutting down all compilers > CodeCache: size=1008Kb used=444Kb max_used=1008Kb free=563Kb > bounds [0x0000ffff5bfb4000, 0x0000ffff5c0b0000, 0x0000ffff5c0b0000] > total_blobs=215, nmethods=0, adapters=177, full_count=2 > Compilation: disabled (not enough contiguous free space left), stopped_count=1, restarted_count=0 > [0.007s][warning][stubs ] Ignoring failed allocation of blob Stub Generator compiler_blob under compiler thread > OpenJDK 64-Bit Server VM warning: C2 initialization failed. Shutting down all compilers > openjdk version "26-internal" 2026-03-17 > OpenJDK Runtime Environment (build 26-internal-adhoc.adinn.JDK-8364558) > OpenJDK 64-Bit Server VM (build 26-internal-adhoc.adinn.JDK-8364558, mixed mode, sharing) > > > That ought to have fixed the problem with test StartOutput. However, after running more times I spotted this (very much less frequent) error exit: > > > [adinn at clarapandy JDK-8364558]$ build/linux-aarch64-server-release/images/jdk/bin/java -XX:InitialCodeCacheSize=873K -XX:ReservedCodeCacheSize=995k -Xlog:stubs -version > [0.001s][info][stubs] StubRoutines (preuniverse stubs) not g... Thank you, @adinn, for working on this. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26642#issuecomment-3156897321 From kvn at openjdk.org Tue Aug 5 23:22:02 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 5 Aug 2025 23:22:02 GMT Subject: RFR: 8364558: Test compiler/startup/StartupOutput.java failed with native OOM: CodeCache: no room for StubRoutines In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 13:21:01 GMT, Andrew Dinn wrote: > This PR avoids bringing down the VM when a code cache allocation for the stubgen compiler blob is requested by the compiler thread and fails because the code cache is full. Instead the VM is allowed to carry on with the compiler disabled. src/hotspot/share/runtime/stubRoutines.cpp line 192: > 190: assert(blob_id == BlobId::stubgen_compiler_id, "sanity"); > 191: assert(DelayCompilerStubsGeneration, "sanity"); > 192: log_warning(stubs)("Ignoring failed allocation of blob %s under compiler thread", StubInfo::name(blob_id)); I think it should match other stubs output, for example: [0.001s][info][stubs] StubRoutines (preuniverse stubs) not generated May be: [0.001s][warning][stubs] StubRoutines (compiler stubs) not generated: no space left in CodeCache ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26642#discussion_r2255523961 From kvn at openjdk.org Tue Aug 5 23:25:03 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 5 Aug 2025 23:25:03 GMT Subject: RFR: 8364558: Test compiler/startup/StartupOutput.java failed with native OOM: CodeCache: no room for StubRoutines In-Reply-To: References: Message-ID: <1xrYsWuO2i0clJuaenUnLY9U24K4IvY22ATv3WiHQuk=.0687a2e4-7723-432a-bd3b-b98451b3c376@github.com> On Tue, 5 Aug 2025 13:43:12 GMT, Francesco Andreuzzi wrote: >> This PR avoids bringing down the VM when a code cache allocation for the stubgen compiler blob is requested by the compiler thread and fails because the code cache is full. Instead the VM is allowed to carry on with the compiler disabled. 
> > src/hotspot/share/opto/c2compiler.cpp line 95: > >> 93: // If there was an error generating the blob then UseCompiler will >> 94: // have been unset and we need to skip the remaining initialization >> 95: if (UseCompiler) { > > Inverting the if condition here would result in slightly more readable code: you could early return `false`, avoid the additional indentation, and the comment at L93-94 would refer to the true branch of the `if` statement. Should we also check it before calling `compiler_stubs_init()`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26642#discussion_r2255527228 From kvn at openjdk.org Tue Aug 5 23:43:01 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 5 Aug 2025 23:43:01 GMT Subject: RFR: 8364558: Test compiler/startup/StartupOutput.java failed with native OOM: CodeCache: no room for StubRoutines In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 14:08:51 GMT, Andrew Dinn wrote: > Any ideas how to proceed? Shall I still continue with the current patch? Don't worry about adapters. It is a different issue which can happen at any time with a very small CodeCache. > Also, do we need a better title for the JIRA and PR? Yes, please update it. Maybe: "Failure to generate compiler stubs from compiler thread should not crash VM when compilation disabled due to full CodeCache" ------------- PR Comment: https://git.openjdk.org/jdk/pull/26642#issuecomment-3156933123 From dlong at openjdk.org Wed Aug 6 00:17:26 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 6 Aug 2025 00:17:26 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v5] In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 10:13:03 GMT, Martin Doerr wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> one unconditional release should be enough > > Thanks for implementing nice code for PPC64! I appreciate it!
The shared code and the other platforms look fine, too. > Maybe atomic bitwise operations could be used, but I'm happy with your current solution. Thanks @TheRealMDoerr . I didn't even consider atomic bitwise operations, but that's a good idea. I'm not in a hurry to push this, so if you could provide an atomic bitwise patch for ppc64, I would be happy to include it. In the mean time, I'm still investigating the ZGC regression. If I can figure it out, I might want to include a fix for ZGC in this PR as well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26399#issuecomment-3157006874 From dlong at openjdk.org Wed Aug 6 01:29:58 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 6 Aug 2025 01:29:58 GMT Subject: RFR: 8278874: tighten VerifyStack constraints [v10] In-Reply-To: References: Message-ID: > The VerifyStack logic in Deoptimization::unpack_frames() attempts to check the expression stack size of the interpreter frame against what GenerateOopMap computes. To do this, it needs to know if the state at the current bci represents the "before" state, meaning the bytecode will be reexecuted, or the "after" state, meaning we will advance to the next bytecode. The old code didn't know how to determine exactly what state we were in, so it checked both. This PR cleans that up, so we only have to compute the oopmap once. It also removes old SPARC support. Dean Long has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 13 additional commits since the last revision: - Merge branch 'openjdk:master' into 8278874-verifystack - Merge branch 'openjdk:master' into 8278874-verifystack - more cleanup - simplify is_top_frame - readability suggestion - reviewer suggestions - Update src/hotspot/share/runtime/vframeArray.cpp Co-authored-by: Manuel Hässig - Update src/hotspot/share/runtime/vframeArray.cpp Co-authored-by: Manuel Hässig - better name for frame index - Update src/hotspot/share/runtime/deoptimization.cpp Co-authored-by: Manuel Hässig - ... and 3 more: https://git.openjdk.org/jdk/compare/2689ae8e...e04fc720 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26121/files - new: https://git.openjdk.org/jdk/pull/26121/files/6bfda158..e04fc720 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26121&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26121&range=08-09 Stats: 7351 lines in 213 files changed: 4752 ins; 2089 del; 510 mod Patch: https://git.openjdk.org/jdk/pull/26121.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26121/head:pull/26121 PR: https://git.openjdk.org/jdk/pull/26121 From kbarrett at openjdk.org Wed Aug 6 02:57:16 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 6 Aug 2025 02:57:16 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v5] In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 10:13:03 GMT, Martin Doerr wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> one unconditional release should be enough > > Thanks for implementing nice code for PPC64! I appreciate it! The shared code and the other platforms look fine, too. > Maybe atomic bitwise operations could be used, but I'm happy with your current solution. > Thanks @TheRealMDoerr . I didn't even consider atomic bitwise operations, but that's a good idea.
I'm not in a hurry to push this, so if you could provide an atomic bitwise patch for ppc64, I would be happy to include it. In the mean time, I'm still investigating the ZGC regression. If I can figure it out, I might want to include a fix for ZGC in this PR as well. Not a review, just a drive-by comment. We've had Atomic bitops for a while now. Atomic::fetch_then_{and,or,xor}(ptr, bits [, order]) Atomic::{and,or,xor}_then_fetch(ptr, bits [, order]) They haven't been optimized for most (any?) platforms, being based on cmpxchg. (See all the "Specialize atomic bitset functions for ..." related to https://bugs.openjdk.org/browse/JDK-8293117.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/26399#issuecomment-3157237736 From dholmes at openjdk.org Wed Aug 6 03:57:10 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Aug 2025 03:57:10 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v25] In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 08:27:08 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value types for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static bool available_memory(size_t& value); >> static julong used_memory(); --> static bool used_memory(size_t& value); >> static julong free_memory(); --> static bool free_memory(size_t& value); >> static jlong total_swap_space(); --> static bool total_swap_space(size_t& value); >> static jlong free_swap_space(); --> static bool free_swap_space(size_t& value); >> static julong physical_memory(); --> static size_t physical_memory(size_t& value); >> >> >> The return boolean value, where available, indicates success, whereas the actual value is assigned to the input argument. The following recommended usage pattern is introduced: where applicable, an unsuccessful call is logged. 
>> >> `ATTRIBUTE_NODISCARD` macro is added as a placeholder for `[[nodiscard]]`, which will be available with C++17. >> >> Tested in GHA and Tiers 1-5. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8357086: Addressed reviewer's comments src/hotspot/os/linux/os_linux.cpp line 299: > 297: jlong memory_and_swap_limit_in_bytes = OSContainer::memory_and_swap_limit_in_bytes(); > 298: jlong memory_limit_in_bytes = OSContainer::memory_limit_in_bytes(); > 299: value = static_cast<size_t>(memory_and_swap_limit_in_bytes - memory_limit_in_bytes); This isn't what I meant. You want to save into the locals before each of the `if` statements, else you are still calling these functions twice. src/hotspot/os/windows/os_windows.cpp line 869: > 867: return true; > 868: } else { > 869: errno = ::GetLastError(); Why are you setting `errno` here and elsewhere? I see a couple of places where we do that, though I do not know why, but generally we just want to save the `::GetLastError()` value (here we do need a local) and print it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2255787665 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2255792577 From dholmes at openjdk.org Wed Aug 6 03:57:12 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Aug 2025 03:57:12 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v24] In-Reply-To: <7685_Xh8MjRqxoEx2BWfpTcdPXq5-aDuo0XZfTcco5w=.00416a73-5237-4381-a79d-8973f47b19ed@github.com> References: <7685_Xh8MjRqxoEx2BWfpTcdPXq5-aDuo0XZfTcco5w=.00416a73-5237-4381-a79d-8973f47b19ed@github.com> Message-ID: On Tue, 5 Aug 2025 08:26:30 GMT, Anton Artemov wrote: >> src/hotspot/os/linux/os_linux.cpp line 328: >> >>> 326: bool host_free_swap_ok = host_free_swap_f(host_free_swap); >>> 327: if (!total_swap_space_ok || !host_free_swap_ok) { >>> 328: assert(false, "sysinfo failed ?
"); >> >> Pre-existing: the assert should be pushed down to where the `sysinfo` calls are so you could see which one failed - not that they can actually fail if called correctly. > > I think this would change the usage pattern again. Do we want a failing assert in every memory function in case of failure? I thought the consensus is that a false values returned by the function is indicating that. In this particular case it will be easy to trace which method failed as their results are stored as local variables. The assert should go after the syscall that we don't expect to fail, but which we have to tolerate the possibility of failing. That is the place where we can also report why the call failed. I'm not sure what you mean about changing the usage pattern. >> src/hotspot/os/linux/os_linux.cpp line 330: >> >>> 328: assert(false, "sysinfo failed ? "); >>> 329: return false; >>> 330: } >> >> Suggestion: >> >> size_t total_swap_space = 0; >> size_t host_free_swap = 0; >> if (!os::total_swap_space(total_swap_space) || !host_free_swap_f(host_free_swap)) { >> assert(false, "sysinfo failed ? "); >> return false; >> } > > I suggest to keep results as locals, see my previous comment. Sorry I don't see a connection to previous comment about the assert. You only need a local if you need to refer to something more than once. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2255794954 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2255796528 From swen at openjdk.org Wed Aug 6 05:27:14 2025 From: swen at openjdk.org (Shaojin Wen) Date: Wed, 6 Aug 2025 05:27:14 GMT Subject: RFR: 8361842: Move input validation checks to Java for java.lang.StringCoding intrinsics [v13] In-Reply-To: References: <-o9qy_Jsa4tvizzy8WgngfPuz9tx8Cwcztzl9AXiB70=.9ddb14d6-ee47-448d-a5a5-9c9ac4db3e0c@github.com> Message-ID: On Fri, 25 Jul 2025 07:40:46 GMT, Volkan Yazici wrote: >> Validate input in `java.lang.StringCoding` intrinsic Java wrappers, improve their documentation, enhance the checks in the associated IR or assembly code, and adapt them to cause VM crash on invalid input. >> >> ## Implementation notes >> >> The goal of the associated umbrella issue [JDK-8156534](https://bugs.openjdk.org/browse/JDK-8156534) is to, for `java.lang.String*` classes, >> >> 1. Move `@IntrinsicCandidate`-annotated `public` methods1 (in Java code) to `private` ones, and wrap them with a `public` ["front door" method](https://github.com/openjdk/jdk/pull/24982#discussion_r2087493446) >> 2. Since we moved the `@IntrinsicCandidate` annotation to a new method, intrinsic mappings (i.e., associated `do_intrinsic()` calls in `vmIntrinsics.hpp`) need to be updated too >> 3. Add necessary input validation (range, null, etc.) checks to the newly created public front door method >> 4. Place all input validation checks in the intrinsic code (add if missing!) behind a `VerifyIntrinsicChecks` VM flag >> >> Following preliminary work needs to be carried out as well: >> >> 1. Add a new `VerifyIntrinsicChecks` VM flag >> 2. Update `generate_string_range_check` to produce a `HaltNode`. That is, crash the VM if `VerifyIntrinsicChecks` is set and a Java wrapper fails to spot an invalid input.
>> >> 1 `@IntrinsicCandidate`-annotated constructors are not subject to this change, since they are a special case. >> >> ## Functional and performance tests >> >> - `tier1` (which includes `test/hotspot/jtreg/compiler/intrinsics/string`) passes on several platforms. Further tiers will be executed after integrating reviewer feedback. >> >> - Performance impact is still actively monitored using `test/micro/org/openjdk/bench/java/lang/String{En,De}code.java`, among other tests. If you have suggestions on benchmarks, please share in the comments. >> >> ## Verification of the VM crash >> >> I've tested the VM crash scenario as follows: >> >> 1. Created the following test program: >> >> public class StrIntri { >> public static void main(String[] args) { >> Exception lastException = null; >> for (int i = 0; i < 1_000_000; i++) { >> try { >> jdk.internal.access.SharedSecrets.getJavaLangAccess().countPositives(new byte[]{1,2,3}, 2, 5); >> } catch (Exception exception) { >> lastException = exception; >> } >> } >> if (lastException != null) { >> lastException.printStackTrace... > > Volkan Yazici has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 31 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into strIntrinCheck > - Replace `requireNonNull` with implicit null checks to reduce bytecode size > - Add `@bug` tags > - Improve wording of `@param len` > - Make source array bound checks lenient too > - Cap destination array bounds > - Fix bit shifting > - Remove superseded `@throws` Javadoc > - Merge remote-tracking branch 'upstream/master' into strIntrinCheck > - Make `StringCoding` encoding intrinsics lenient > - ... 
and 21 more: https://git.openjdk.org/jdk/compare/5917e59b...c322f0e0 src/java.base/share/classes/java/lang/StringCoding.java line 99: > 97: * {@linkplain Preconditions#checkFromIndexSize(int, int, int, BiFunction) out of bounds} > 98: */ > 99: static int countPositives(byte[] ba, int off, int len) { If we name countPositives with parameter checking as countPositivesSB, this PR will have fewer changes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25998#discussion_r2255894472 From dholmes at openjdk.org Wed Aug 6 05:41:10 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Aug 2025 05:41:10 GMT Subject: RFR: 8361103: java_lang_Thread::async_get_stack_trace does not properly protect JavaThread [v7] In-Reply-To: <6MCWqdixHmPtZNQd86-_KSz80enDwWQW9AJEJsgnClI=.e27a474a-dc2d-4e7e-8436-789e65af3031@github.com> References: <6MCWqdixHmPtZNQd86-_KSz80enDwWQW9AJEJsgnClI=.e27a474a-dc2d-4e7e-8436-789e65af3031@github.com> Message-ID: On Tue, 5 Aug 2025 22:34:46 GMT, Alex Menkov wrote: >> The fix updates `java_lang_Thread::async_get_stack_trace()` (used by `java.lang.Thread.getStackTrace()` to get stack trace for platform and mounted virtual threads) to correctly use `ThreadListHandle` for thread protection. >> >> Testing: tier1..5 > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > comment to async_get_stack_trace Looks good (a couple of typos). Thanks. src/hotspot/share/classfile/javaClasses.cpp line 1887: > 1885: // Obtain stack trace for platform or mounted virtual thread. > 1886: // If jthread is a virtual thread and it has been unmounted (or remounted to different carrier) the method returns null. > 1887: // The caller (java.lang.VirtulThread) handles retuned nulls via retry. Suggestion: // The caller (java.lang.VirtualThread) handles returned nulls via retry. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26119#pullrequestreview-3090401802 PR Review Comment: https://git.openjdk.org/jdk/pull/26119#discussion_r2255852479 From stuefe at openjdk.org Wed Aug 6 05:45:06 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 6 Aug 2025 05:45:06 GMT Subject: RFR: 8340297: Use-after-free recognition for metaspace and class space [v2] In-Reply-To: References: Message-ID: On Mon, 28 Jul 2025 05:43:01 GMT, Thomas Stuefe wrote: >> This patch will give us use-after-free recognition for Metaspace and Class space. >> >> Currently, checks for Klass validity typically only perform a variation of `Metaspace::contains` and some other basic tests. These checks won't find cases where the Klass had been prematurely freed (e.g., after class redefinition), nor cases of unloaded classes if the underlying metaspace chunks have not been uncommitted, which is quite common. >> >> The patch also provides us with improved analysis methods in case we encounter problems. E.g., answering whether the Klass had been redefined or unloaded. >> >> The implementation aims to be simple, fast, and safe against false positives. There is a small but non-null chance that we could get false negatives, but that cannot be avoided. >> >> How this works: >> >> - In `class Metadata`, we introduce a 32-bit token that holds the type of the object (1). It replaces the old "is_valid" field of the same size. That one was of limited use since any non-null garbage in those four bytes would be read as valid. >> - To check a Metadata for validity, the token is checked. Checks are done with SafeFetch, so they can be done with questionable pointers (e.g. into uncommitted metaspace after class unloading) >> - When metaspace is freed (bulk free after class unloading), the released chunks are zapped, destroying all tokens in the area. >> - When metaspace is freed (prematurely, e.g., after class redefinition), the released blocks are zapped. 
>> - The new checks replace Metadata::is_valid and supplement some other metadata checks done in GCs >> >> Testing: The patch has been extensively tested manually, at Oracle, and SAP. Tests were thorough to not only catch errors in the patch, but also to see if the patch would uncover a lot of existing sleeper bugs. So far, we only found a single bug in Shenandoah. >> >> Note: I did not yet hook up the new test to c1/c2 compiled code (there are already unimplemented functions for that). That is possible, but left for a later RFE. > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: > > - merge master > - copyrights > - fix big-endian problem on AIX > - Update klass.cpp > - Update metaspace.hpp > - Update metaspace.hpp > - Update metaspace.hpp > - fix rebase error > - fix mac build > - rebase, fixes Friendly Ping ------------- PR Comment: https://git.openjdk.org/jdk/pull/25891#issuecomment-3157478659 From jsjolen at openjdk.org Wed Aug 6 06:18:13 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 6 Aug 2025 06:18:13 GMT Subject: RFR: 8364128: Improve gathering of cpu feature names using stringStream [v4] In-Reply-To: References: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com> Message-ID: On Tue, 5 Aug 2025 16:45:26 GMT, Ashutosh Mehra wrote: >> src/hotspot/share/utilities/ostream.hpp line 172: >> >>> 170: // Stops when gen returns null or when buffer is out of space. >>> 171: template >>> 172: void join(Generator gen, const char* separator) { >> >> We can move this to the `outputStream` class instead! > > It is in the `outputStream` class Oops, I thought you moved it to the stringStream class, my bad :). 
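For readers following the `join(Generator gen, const char* separator)` review comment quoted above, here is a minimal standalone sketch of the idea, assuming the generator is any callable that returns the next C string or null when it is exhausted. It builds a `std::string` rather than writing to a HotSpot `outputStream`, so the "buffer is out of space" stop condition from the quoted comment is not modelled, and the feature names in `join_demo_features` are made up for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the stream method under review: appends the
// strings produced by `gen`, separated by `separator`, and stops as soon
// as `gen` returns nullptr.
template <typename Generator>
static std::string join(Generator gen, const char* separator) {
  std::string out;
  bool first = true;
  for (const char* s = gen(); s != nullptr; s = gen()) {
    if (!first) {
      out += separator;   // separator goes between items, never at the ends
    }
    out += s;
    first = false;
  }
  return out;
}

// Example generator over a fixed list of made-up feature names.
static std::string join_demo_features() {
  static const char* const names[] = {"fp", "asimd", "aes"};
  std::size_t i = 0;
  auto gen = [&]() -> const char* {
    return i < 3 ? names[i++] : nullptr;
  };
  return join(gen, ", ");
}
```

The generator form lets the caller skip entries (for example, features that are not set) without first materializing a list.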
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26515#discussion_r2255968128 From jiefu at openjdk.org Wed Aug 6 06:59:02 2025 From: jiefu at openjdk.org (Jie Fu) Date: Wed, 6 Aug 2025 06:59:02 GMT Subject: RFR: 8364597: Replace THL A29 Limited with Tencent In-Reply-To: References: Message-ID: <08Gigv9qz5H5c80w-WLegQ_8KVI4xoY-kZ374S8t5ls=.35ded307-c917-4c1b-9ab0-032499140a67@github.com> On Mon, 4 Aug 2025 07:38:01 GMT, John Jiang wrote: > `THL A29 Limited` was a `Tencent` company but was dissolved. > So, the copyright notes just use `Tencent` directly. LGTM ------------- Marked as reviewed by jiefu (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26614#pullrequestreview-3090664823 From dholmes at openjdk.org Wed Aug 6 07:01:06 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Aug 2025 07:01:06 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v4] In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 10:16:32 GMT, Johan Sjölen wrote: >> src/hotspot/share/interpreter/bytecodeUtils.cpp line 1483: >> >>> 1481: // Is an explicit slot given? >>> 1482: bool explicit_search = slot >= 0; >>> 1483: if (explicit_search) { >> >> Suggestion: >> >> if (slot >= 0) { >> >> No need for the temporary local. > > If we don't adopt the enum I suggested, then I do prefer the naming of the condition through a variable. It gives you a hint of what the check is looking for, and not only what the check does. That is what comments are for. >> src/java.base/share/classes/java/lang/NullPointerException.java line 78: >> >>> 76: NullPointerException(int stackOffset, int searchSlot) { >>> 77: extendedMessageState = setupCustomBackTrace(stackOffset, searchSlot); >>> 78: this(); >> >> I thought `this()` had to come first in a constructor? Is this new in 25 or 26?
> > https://openjdk.org/jeps/492 And now final - https://openjdk.org/jeps/513 Thanks ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2256053422 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2256047794 From jsjolen at openjdk.org Wed Aug 6 07:48:23 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Wed, 6 Aug 2025 07:48:23 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v4] In-Reply-To: References: Message-ID: <7IGhtSwHIBGIY1j6nJCm3Sjv6l28aHypbC98PwPL_Ak=.5d270255-7419-4754-9775-9fc546dae1e2@github.com> On Wed, 6 Aug 2025 06:57:55 GMT, David Holmes wrote: >> If we don't adopt the enum I suggested, then I do prefer the naming of the condition through a variable. It gives you a hint of what the check is looking for, and not only what the check does. > > That is what comments are for. I'd rather explain things with code than with comments when possible. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2256207131 From tschatzl at openjdk.org Wed Aug 6 08:14:04 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 6 Aug 2025 08:14:04 GMT Subject: RFR: 8364767: G1: Remove use of CollectedHeap::_soft_ref_policy In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 18:18:34 GMT, Albert Mingkun Yang wrote: > Use gc-cause checking in `VM_G1CollectFull` to decide soft-ref policy. > > Test: tier1-3 The other comment may be handled separately. src/hotspot/share/gc/g1/g1VMOperations.cpp line 54: > 52: GCCauseSetter x(g1h, _gc_cause); > 53: bool clear_all_soft_refs = _gc_cause == GCCause::_metadata_GC_clear_soft_refs || > 54: _gc_cause == GCCause::_wb_full_gc; Now that three collectors use this, it might be useful to factor this out as a helper function in `CollectedHeap` and use it. ------------- Marked as reviewed by tschatzl (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/26648#pullrequestreview-3091140646 PR Review Comment: https://git.openjdk.org/jdk/pull/26648#discussion_r2256297755 From mhaessig at openjdk.org Wed Aug 6 08:34:05 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Wed, 6 Aug 2025 08:34:05 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v2] In-Reply-To: <1YTjfZh0C7UWqTCXWcNS-Q68kBOnhlRJA_-sS_vZsio=.6389ba30-8f3a-48c2-b0e2-40f5877f4a36@github.com> References: <_Ye19u_7PlqlsoRSuR0dNeAGbeuHyN_oqD1ZS4q9Nvk=.b94fd29d-d43e-4561-9926-7f5a46434d8e@github.com> <1YTjfZh0C7UWqTCXWcNS-Q68kBOnhlRJA_-sS_vZsio=.6389ba30-8f3a-48c2-b0e2-40f5877f4a36@github.com> Message-ID: On Tue, 5 Aug 2025 21:33:52 GMT, Dean Long wrote: >> The compiler thread setting and unsetting the flag and the signal handler reading the flag are racing each other as soon as the timer is set, since signals are preemptive. This prevents a few false positive timeouts on architectures with weak memory models, but does not have any effect on x86 for example. > > The signal is delivered to the same compiler thread, so I don't think acquire and release help here. There shouldn't be a weak memory model issue between a thread and its signal handler on the same thread. Can you explain the sequence of events that could cause a false positive? As far as I understand weak memory models, the concept of threads, which the CPU does not know of, is irrelevant. The CPU is merely free to reorder a subset of memory operations. The reason I added the acquire/release semantics is to prevent the reordering of a write to `_timeout_armed` that happened before a read from the same in the signal handler. This is only relevant when a compilation is done and the timeout is disarmed just before the timer fires. Granted, this is an edge case that does not create a huge amount of false positives.
Also, it might be entirely unnecessary due to the OS doing some work to call the signal handler in-between the write and the read of the value. It is merely conservative programming, so I am fine with dropping the barriers. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26023#discussion_r2256370628 From adinn at openjdk.org Wed Aug 6 08:38:06 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Wed, 6 Aug 2025 08:38:06 GMT Subject: RFR: 8364558: Failure to generate compiler stubs from compiler thread should not crash VM when compilation disabled due to full CodeCache In-Reply-To: <1xrYsWuO2i0clJuaenUnLY9U24K4IvY22ATv3WiHQuk=.0687a2e4-7723-432a-bd3b-b98451b3c376@github.com> References: <1xrYsWuO2i0clJuaenUnLY9U24K4IvY22ATv3WiHQuk=.0687a2e4-7723-432a-bd3b-b98451b3c376@github.com> Message-ID: On Tue, 5 Aug 2025 23:22:34 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/opto/c2compiler.cpp line 95: >> >>> 93: // If there was an error generating the blob then UseCompiler will >>> 94: // have been unset and we need to skip the remaining initialization >>> 95: if (UseCompiler) { >> >> Inverting the if condition here would result in slightly more readable code: you could early return `false`, avoid the additional indentation, and the comment at L93-94 would refer to the true branch of the `if` statement. > > Should we also check it before calling `compiler_stubs_init()` too? We could but I don't think it makes much difference. This is a race we could lose at any time. If we have already lost at the point of call then the called method will attempt the buffer allocate and fail almost immediately. So, checking here doesn't help avoid any real work. I will modify the logic to return early -- although not because I agree that the current code is difficult to read. But it is important to preserve the current indentation, making the cumulative change history simpler to follow. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26642#discussion_r2256385920 From aph at openjdk.org Wed Aug 6 08:45:03 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 6 Aug 2025 08:45:03 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() In-Reply-To: <62G2O_C4dHe0yNocFL7vmma7XWoyr9rTGO2p7Y6i_Z0=.15588516-31e5-4d8c-9003-92cd45fe1dd4@github.com> References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> <62G2O_C4dHe0yNocFL7vmma7XWoyr9rTGO2p7Y6i_Z0=.15588516-31e5-4d8c-9003-92cd45fe1dd4@github.com> Message-ID: <4vmsKSTm9EbwBvumxVmTBl857VH9mGdmNOvj8PeHR3s=.bc1255dd-9375-47c1-8fa3-84069835753e@github.com> On Tue, 5 Aug 2025 21:40:27 GMT, Dean Long wrote: > In the past we would have scheduled the load for x8 earlier to avoid a latency hit for using the value in the next instruction: > > ``` > ldr x8, Label_entry > ldr x12, Label_data > br x8 > ``` > > Is this still helpful on modern aarch64 hardware? Maybe on something very small, such as the Cortex-A34. It doesn't hurt larger out-of-order cores, so I agree with you that we should reorder this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26638#issuecomment-3158299392 From adinn at openjdk.org Wed Aug 6 08:50:57 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Wed, 6 Aug 2025 08:50:57 GMT Subject: RFR: 8364558: Failure to generate compiler stubs from compiler thread should not crash VM when compilation disabled due to full CodeCache [v2] In-Reply-To: References: Message-ID: > This PR avoids bringing down the VM when a code cache allocation for the stubgen compiler blob is requested by the compiler thread and fails because the code cache is full. Instead the VM is allowed to carry on with the compiler disabled. 
Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: minimize changed lines and align log messages ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26642/files - new: https://git.openjdk.org/jdk/pull/26642/files/5889ce2b..7a28bc7e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26642&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26642&range=00-01 Stats: 16 lines in 2 files changed: 7 ins; 7 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/26642.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26642/head:pull/26642 PR: https://git.openjdk.org/jdk/pull/26642 From adinn at openjdk.org Wed Aug 6 08:50:58 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Wed, 6 Aug 2025 08:50:58 GMT Subject: RFR: 8364558: Failure to generate compiler stubs from compiler thread should not crash VM when compilation disabled due to full CodeCache In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 13:21:01 GMT, Andrew Dinn wrote: > This PR avoids bringing down the VM when a code cache allocation for the stubgen compiler blob is requested by the compiler thread and fails because the code cache is full. Instead the VM is allowed to carry on with the compiler disabled. New version addresses all comments. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26642#issuecomment-3158332733 From duke at openjdk.org Wed Aug 6 08:56:10 2025 From: duke at openjdk.org (ExE Boss) Date: Wed, 6 Aug 2025 08:56:10 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v4] In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 06:55:54 GMT, David Holmes wrote: >> https://openjdk.org/jeps/492 > > And now final - https://openjdk.org/jeps/513 > > Thanks It's necessary to do this before the `this()`/`super()` call, as the `Throwable(…)` constructors call `this.fillInStackTrace()`: https://github.com/openjdk/jdk/blob/e304d37996b075b8b2b44b5762d7d242169add49/src/java.base/share/classes/java/lang/Throwable.java#L263-L264 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2256443245 From fgao at openjdk.org Wed Aug 6 09:19:51 2025 From: fgao at openjdk.org (Fei Gao) Date: Wed, 6 Aug 2025 09:19:51 GMT Subject: RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD Message-ID: If the relocation or target address is guaranteed to reside within the CodeCache, we can safely replace a `movz + movk + movk` sequence with a more compact and efficient `adrp + add` instruction pair. In `MacroAssembler::mov(Register r, Address dest)`, this replacement can be applied if any of the following rules hold: 1. The relocation type indicates that the address resides within the CodeCache and the necessary patching logic is provided in `fix_relocation_after_move()`. 2. The target address is fixed (i.e., does not require relocation) and is within the reachable range for `adrp`. The patch performs the filtering in `is_relocated_within_codecache()` and `is_adrp_reachable()` to ensure this optimization is applied safely and selectively.
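For readers unfamiliar with the `adrp` range limit mentioned in rule 2, the following is a minimal sketch of what a reachability test can look like. It relies only on the architectural fact that `adrp` encodes a signed 21-bit delta between 4 KiB pages (roughly ±4 GiB). The function names `adrp_reachable_from` and `adrp_reachable_from_range` are hypothetical and are not the `is_adrp_reachable()` from the patch, whose actual logic is not shown in this thread.

```cpp
#include <cstdint>

// Hypothetical helper (not HotSpot code): true when an ADRP placed at `pc`
// can produce the 4 KiB page of `target`. ADRP encodes a signed 21-bit
// page delta, so the delta must lie in [-2^20, 2^20).
inline bool adrp_reachable_from(std::uint64_t pc, std::uint64_t target) {
  const std::int64_t page_delta =
      static_cast<std::int64_t>(target >> 12) - static_cast<std::int64_t>(pc >> 12);
  return page_delta >= -(std::int64_t{1} << 20) &&
         page_delta < (std::int64_t{1} << 20);
}

// Assumption for illustration: code may be emitted anywhere in the half-open
// range [cache_low, cache_high). The page delta changes monotonically with
// pc, so checking both ends of the range covers every address in between.
inline bool adrp_reachable_from_range(std::uint64_t cache_low,
                                      std::uint64_t cache_high,
                                      std::uint64_t target) {
  return adrp_reachable_from(cache_low, target) &&
         adrp_reachable_from(cache_high - 1, target);
}
```

Checking against the whole range rather than a single program counter matters when the final location of the emitted code is not yet known.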
------------- Commit messages: - 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD Changes: https://git.openjdk.org/jdk/pull/26653/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26653&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8362504 Stats: 65 lines in 2 files changed: 59 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/26653.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26653/head:pull/26653 PR: https://git.openjdk.org/jdk/pull/26653 From duke at openjdk.org Wed Aug 6 09:27:29 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 6 Aug 2025 09:27:29 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v26] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value types for functions which return memory are changed, namely: > > > static julong available_memory(); --> static bool available_memory(size_t& value); > static julong used_memory(); --> static bool used_memory(size_t& value); > static julong free_memory(); --> static bool free_memory(size_t& value); > static jlong total_swap_space(); --> static bool total_swap_space(size_t& value); > static jlong free_swap_space(); --> static bool free_swap_space(size_t& value); > static julong physical_memory(); --> static size_t physical_memory(size_t& value); > > > The return boolean value, where available, indicates success, whereas the actual value is assigned to the input argument. The following recommended usage pattern is introduced: where applicable, an unsuccessful call is logged. > > `ATTRIBUTE_NODISCARD` macro is added as a placeholder for `[[nodiscard]]`, which will be available with C++17. > > Tested in GHA and Tiers 1-5. 
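The usage pattern described above (boolean result for success, value delivered through a reference, failure logged by the caller) can be sketched as follows. This is an illustration only: `sketch_free_memory` and its `simulate_failure` parameter are hypothetical stand-ins, not the `os::` functions from the PR, and `[[nodiscard]]` is written directly on the assumption of a C++17 compiler, whereas the PR uses its `ATTRIBUTE_NODISCARD` macro.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical query in the style of the new signatures: returns true on
// success and writes the result to `value`; on failure `value` is untouched.
[[nodiscard]] inline bool sketch_free_memory(std::size_t& value,
                                             bool simulate_failure) {
  if (simulate_failure) {
    return false;
  }
  value = std::size_t{512} * 1024 * 1024;  // made-up figure for illustration
  return true;
}

// Caller side: check the result, log an unsuccessful call, and fall back to
// a caller-chosen default instead of using an unset value.
inline std::size_t free_memory_or_zero(bool simulate_failure) {
  std::size_t free_mem = 0;
  if (!sketch_free_memory(free_mem, simulate_failure)) {
    std::fprintf(stderr, "warning: free memory query failed\n");
    return 0;
  }
  return free_mem;
}
```

Marking the function as not discardable makes the compiler flag callers that ignore the success indicator, which is the point of pairing the new signatures with the attribute.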
Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8357086: Addressed reviewer's comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25450/files - new: https://git.openjdk.org/jdk/pull/25450/files/fbceea99..f9d9bc4a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=25 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=24-25 Stats: 16 lines in 2 files changed: 2 ins; 3 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From duke at openjdk.org Wed Aug 6 09:27:30 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 6 Aug 2025 09:27:30 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v24] In-Reply-To: References: <7685_Xh8MjRqxoEx2BWfpTcdPXq5-aDuo0XZfTcco5w=.00416a73-5237-4381-a79d-8973f47b19ed@github.com> Message-ID: On Wed, 6 Aug 2025 03:50:43 GMT, David Holmes wrote: >> I think this would change the usage pattern again. Do we want a failing assert in every memory function in case of failure? I thought the consensus is that a false value returned by the function indicates that. In this particular case it will be easy to trace which method failed as their results are stored as local variables. > > The assert should go after the syscall that we don't expect to fail, but which we have to tolerate the possibility of failing. That is the place where we can also report why the call failed. > > I'm not sure what you mean about changing the usage pattern. What I meant is that in this PR we are focusing on adding a mechanism to indicate that something failed, not on reporting why it failed. Addressed as suggested. >> I suggest to keep results as locals, see my previous comment. > > Sorry I don't see a connection to previous comment about the assert.
You only need a local if you need to refer to something more than once. With the locals it would be visible in debugging which of those methods failed if any. Addressed as suggested. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2256537890 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2256537325 From duke at openjdk.org Wed Aug 6 09:27:30 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 6 Aug 2025 09:27:30 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v25] In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 03:42:19 GMT, David Holmes wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8357086: Addressed reviewer's comments > > src/hotspot/os/linux/os_linux.cpp line 299: > >> 297: jlong memory_and_swap_limit_in_bytes = OSContainer::memory_and_swap_limit_in_bytes(); >> 298: jlong memory_limit_in_bytes = OSContainer::memory_limit_in_bytes(); >> 299: value = static_cast(memory_and_swap_limit_in_bytes - memory_limit_in_bytes); > > This isn't what I meant. You want to save into the locals before each of the `if` statements, else you are still calling these functions twice. Right, I changed the way it is done as suggested. > src/hotspot/os/windows/os_windows.cpp line 869: > >> 867: return true; >> 868: } else { >> 869: errno = ::GetLastError(); > > Why are you setting `errno` here and elsewhere? I see a couple of places where we do do that though I do not know why, but generally we just want to save the `::GetLastError()` value (here we do need a local) and print it. You are correct, there is no relationship between `errno` and `GetLastError()`. Addressed. I created a separate issue to fix this inconsistency in other places.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2256538537 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2256536814 From aph at openjdk.org Wed Aug 6 09:33:04 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 6 Aug 2025 09:33:04 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() In-Reply-To: References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Tue, 5 Aug 2025 21:47:34 GMT, Dean Long wrote: > > Unaligned memory access may be non-atomic, although in this case, other threads aren't modifying the data. > > I thought other threads are modifying the data, which is the reason why we needed the ISB. Yes, exactly. Other threads are executing this nmethod while we are modifying its static call stub. We hold the lock while we're doing this, so any thread racing with us to _modify_ the call stub would be blocked. However, another thread could execute the call we just patched but not fully observe the changes we made to the stub, which is why we need the ISB. We have a similar situation with this PR, but with data rather than instruction memory. We need to make sure that any racing thread observes changes to the stub before it observes the patched call to the stub. There is no ordering between instruction and data memory, which is why the current stub uses only instruction memory, keeping everything right. The ISB ensures that the executing thread observes the changed stub: there is a happens-before from the `ICache::invalidate_range` after patching the stub (but before patching the call) to the `isb` at the start of the stub. For this PR to be correctly synchronized, we'd need a similar happens-before from patching the constants used in the stub to executing the stub. I can think of a way to do that, but it's fiddly and may not be worth it.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26638#discussion_r2256564579 From cnorrbin at openjdk.org Wed Aug 6 09:40:16 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Wed, 6 Aug 2025 09:40:16 GMT Subject: RFR: 8352067: Remove the NMT treap and replace its uses with the utilities red-black tree Message-ID: Hi everyone, The utilities red-black tree and the NMT treap serve similar functions. Given the red-black tree's versatility and stricter constraints, the treap can be removed in favour of it. I made some modifications to the red-black tree to make it compatible with previous treap usages: - Updated the `visit_in_order` and `visit_range_in_order` functions to require the supplied callback to return a bool, which allows us to stop traversing early. - Improved const-correctness by ensuring that invoking these functions on a const reference provides const pointers to nodes, while non-const references provide mutable pointers. Previously the two functions behaved differently. Changes to NMT include: - Modified components to align with the updated const-correctness of the red-black tree functions - Renamed structures and variables to remove "treap" from their names to reflect the new tree The treap was also used in one place in C2. I changed this to use the red-black tree and its cursor interface, which I felt was most fitting for the use case. 
------------- Commit messages: - Replace nmttreap with rbtree Changes: https://git.openjdk.org/jdk/pull/26655/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26655&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352067 Stats: 1016 lines in 14 files changed: 124 ins; 816 del; 76 mod Patch: https://git.openjdk.org/jdk/pull/26655.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26655/head:pull/26655 PR: https://git.openjdk.org/jdk/pull/26655 From duke at openjdk.org Wed Aug 6 09:48:28 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 6 Aug 2025 09:48:28 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v27] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value types for functions which return memory are changed, namely: > > > static julong available_memory(); --> static bool available_memory(size_t& value); > static julong used_memory(); --> static bool used_memory(size_t& value); > static julong free_memory(); --> static bool free_memory(size_t& value); > static jlong total_swap_space(); --> static bool total_swap_space(size_t& value); > static jlong free_swap_space(); --> static bool free_swap_space(size_t& value); > static julong physical_memory(); --> static size_t physical_memory(size_t& value); > > > The return boolean value, where available, indicates success, whereas the actual value is assigned to the input argument. The following recommended usage pattern is introduced: where applicable, an unsuccessful call is logged. > > `ATTRIBUTE_NODISCARD` macro is added as a placeholder for `[[nodiscard]]`, which will be available with C++17. > > Tested in GHA and Tiers 1-5. 
Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8357086: Fixed whitespace error ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25450/files - new: https://git.openjdk.org/jdk/pull/25450/files/f9d9bc4a..f62ed1d3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=26 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=25-26 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From jsjolen at openjdk.org Wed Aug 6 10:00:04 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Wed, 6 Aug 2025 10:00:04 GMT Subject: RFR: 8352067: Remove the NMT treap and replace its uses with the utilities red-black tree In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 09:29:05 GMT, Casper Norrbin wrote: > Hi everyone, > > The utilities red-black tree and the NMT treap serve similar functions. Given the red-black tree's versatility and stricter time complexity, the treap can be removed in favour of it. > > I made some modifications to the red-black tree to make it compatible with previous treap usages: > - Updated the `visit_in_order` and `visit_range_in_order` functions to require the supplied callback to return a bool, which allows us to stop traversing early. > - Improved const-correctness by ensuring that invoking these functions on a const reference provides const pointers to nodes, while non-const references provide mutable pointers. Previously the two functions behaved differently. > > Changes to NMT include: > - Modified components to align with the updated const-correctness of the red-black tree functions > - Renamed structures and variables to remove "treap" from their names to reflect the new tree > > The treap was also used in one place in C2.
I changed this to use the red-black tree and its cursor interface, which I felt was most fitting for the use case. This LGTM. What testing did you run? src/hotspot/share/nmt/vmatree.hpp line 196: > 194: > 195: public: > 196: using VMARBTree = RBTreeCHeap; You should be able to just write 'mtNMT', same for all other `MemTag::` prefixes. src/hotspot/share/opto/printinlining.cpp line 90: > 88: > 89: return node->val(); > 90: } Could you expand the `auto`s? It should be a code action in VSCode if you use clangd. We tend to only use auto when it's a lambda. ------------- Marked as reviewed by jsjolen (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26655#pullrequestreview-3091786865 PR Review Comment: https://git.openjdk.org/jdk/pull/26655#discussion_r2256636655 PR Review Comment: https://git.openjdk.org/jdk/pull/26655#discussion_r2256634808 From aph at openjdk.org Wed Aug 6 10:02:03 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 6 Aug 2025 10:02:03 GMT Subject: RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD In-Reply-To: References: Message-ID: <8G8Djp2bD_zzy3Jm5R3oMoTSrWrbhbCMiIHC5Kl9ufY=.eaed94c1-416a-4c95-a36e-8ba8450d7db4@github.com> On Wed, 6 Aug 2025 09:13:30 GMT, Fei Gao wrote: > If the relocation or target address is guaranteed to reside within the CodeCache, we can safely replace a `movz + movk + movk` sequence with a more compact and efficient `adrp + add` instruction pair. > > In `MacroAssembler::mov(Register r, Address dest)`, this replacement can be applied if any of the following rules hold: > > 1. The relocation type indicates that the address resides within the CodeCache and the necessary patching logic is provided in `fix_relocation_after_move()`. > 2. The target address is fixed (i.e., does not require relocation) and is within the reachable range for `adrp`. > > The patch performs the filtering in `is_relocated_within_codecache()` and `is_adrp_reachable()` to ensure this optimization is applied safely and selectively.
It's certainly smaller, but whether it's more efficient is dependent on the circumstances. For example, on Firestorm we can do two ADRPs per cycle, but eight MOVZ/MOVKs. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26653#issuecomment-3158946110 From duke at openjdk.org Wed Aug 6 10:10:14 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 6 Aug 2025 10:10:14 GMT Subject: RFR: 8364819: Cleanup handshake_timed_out_thread and safepoint_timed_out_thread Message-ID: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Hi, please consider the following changes: This is a cleanup after https://github.com/openjdk/jdk/pull/26309 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. 2) Added a missed brace in the error message. Trivial change. ------------- Commit messages: - 8364819: Cleanup Changes: https://git.openjdk.org/jdk/pull/26656/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26656&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8364819 Stats: 16 lines in 4 files changed: 0 ins; 0 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/26656.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26656/head:pull/26656 PR: https://git.openjdk.org/jdk/pull/26656 From adinn at openjdk.org Wed Aug 6 10:20:02 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Wed, 6 Aug 2025 10:20:02 GMT Subject: RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD In-Reply-To: References: Message-ID: <7aQZ9mKf7PrVcYmsTZZdYp9BgmcNU5pMdu0Tyf_GHGs=.e04966ee-2515-4e0e-9714-4ffc35025d7f@github.com> On Wed, 6 Aug 2025 09:13:30 GMT, Fei Gao wrote: > If the relocation or target address is guaranteed to reside within the CodeCache, we can safely replace a `movz + movk + movk` sequence with a more compact and efficient `adrp + add` instruction pair. 
> > In `MacroAssembler::mov(Register r, Address dest)`, this replacement can be applied if any of the following rules hold: > > 1. The relocation type indicates that the address resides within the CodeCache and the necessary patching logic is provided in `fix_relocation_after_move()`. > 2. The target address is fixed (i.e., does not require relocation) and is within the reachable range for `adrp`. > > The patch performs the filtering in `is_relocated_within_codecache()` and `is_adrp_reachable()` to ensure this optimization is applied safely and selectively. I agree with Andrew that this may well be slower even if it does save on code size. So, the trade-off being made here is not clear-cut. I also think we need to consider and allow for save and restore of adapter, stub and, in upcoming releases, nmethod code using the AOT cache. Both code cache to code cache (runtime) calls and code cache to foreign (external) calls may require a one-off offset adjustment during AOT code loading. This can happen because 1) code generated at some given address and saved from the Assembly VM may be restored in a Production VM at a different address and 2) references from generated code to a method/function address in some given external C library loaded in the Assembly VM may need adjustment to allow for the library and associated function/method being loaded at a different address. This reloc at reload should not be an issue for generated calls to the runtime since wherever the saved code is reloaded the call offset will still be less than the cache range and that will be less than 2GB. So, we can always patch an 'adrp + add' with another 'adrp + add' However, it is significant for foreign calls where restoration at a different place in the cache and/or randomization of library load addresses implies that a positive reachability decision made at Assembly time might no longer hold in a Production run. 
In that case an 'adrp + add' pair would not be able to be simply patched to a 'movz + movk + movk' triple. The solution is to make the compiler always generate the 3-instruction load when compiling in an Assembly VM and otherwise generate the 2-instruction load based on reachability i.e. AOT code won't 'benefit' from this patch but runtime generated code will (assuming it is a benefit). So, the reachability method needs to return false if `AOTCodeCache::is_on_for_dump()` returns true. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26653#issuecomment-3159151855 From adinn at openjdk.org Wed Aug 6 10:36:04 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Wed, 6 Aug 2025 10:36:04 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() In-Reply-To: References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Wed, 6 Aug 2025 09:30:07 GMT, Andrew Haley wrote: >>> Unaligned memory access may be non-atomic, although in this case, other threads aren't modifying the data. >> >> I thought other threads are modifying the data, which is the reason why we needed the ISB. > >> > Unaligned memory access may be non-atomic, although in this case, other threads aren't modifying the data. >> >> I thought other threads are modifying the data, which is the reason why we needed the ISB. > > Yes, exactly. Other threads are executing this nmethod while we are modifying its static call stub. > > We hold the lock while we're doing this, so any thread racing with us to _modify_ the call stub would be blocked. However, another thread could execute the call we just patched but not fully observe the changes we made to the stub, which is why we need the ISB. > > We have a similar situation with this PR, but with data rather than instruction memory. We need to make sure that any racing thread observes changes to the stub before it observes the patched call to the stub.
There is no ordering between instruction and data memory, which is why the current stub uses only instruction memory, keeping everything right. The ISB ensures that the executing thread observes the changed stub: there is a happens-before from the `ICache::invalidate_range` after patching the stub (but before patching the call) to the `isb` at the start of the stub. > > For this PR to be correctly synchronized, we'd need a similar happens-before from patching the constants used in the stub to executing the stub. I can think of a way to do that, but it's fiddly and may not be worth it. Yes, it's a very subtle point that the isb serves to isolate threads executing the call from partial updates by the writing thread i.e. make the change to the call site atomic. It might be useful to have that written in some comment in the patching code rather than just documented in this issue (but it helps to have it said here). I understand that an isb is relatively expensive and I don't doubt the figures provided from running the micro-benchmark. However, they are not necessarily any indication that there is a problem here. How many such call sites are there and how often do they get patched? Is that significant relative to, say, the number of times they are executed or some other measure of overall cost? Is there evidence that we really need to fix this?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26638#discussion_r2256731349 From shade at openjdk.org Wed Aug 6 10:41:02 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 6 Aug 2025 10:41:02 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 In-Reply-To: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Message-ID: <1wywkzcGrfAM2m-1EGH6cKvocvHD6hvgKo8uALsi-M8=.e6f51458-30ba-449d-ad3b-0ce5ed37bbd1@github.com> On Wed, 6 Aug 2025 10:05:08 GMT, Anton Artemov wrote: > Hi, please consider the following changes: > > This is a cleanup after https://github.com/openjdk/jdk/pull/26309 > > 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. > > 2) Added a missed brace in the error message. > > Trivial change. Also a closing parenthesis in test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java message matcher, I think? 
A better name for the PR and ticket is: "8364819: Post-integration cleanups for JDK-8359820" ------------- PR Comment: https://git.openjdk.org/jdk/pull/26656#issuecomment-3159406353 PR Comment: https://git.openjdk.org/jdk/pull/26656#issuecomment-3159417870 From duke at openjdk.org Wed Aug 6 10:48:43 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 6 Aug 2025 10:48:43 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Message-ID: <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> > Hi, please consider the following changes: > > This is a cleanup after https://github.com/openjdk/jdk/pull/26309 > > 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. > > 2) Added a missed brace in the error message. > > Trivial change. 
Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8364819: Fixed brace in the test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26656/files - new: https://git.openjdk.org/jdk/pull/26656/files/f7aa7a7c..8c2915bb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26656&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26656&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26656.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26656/head:pull/26656 PR: https://git.openjdk.org/jdk/pull/26656 From shade at openjdk.org Wed Aug 6 10:48:43 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 6 Aug 2025 10:48:43 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> Message-ID: On Wed, 6 Aug 2025 10:45:01 GMT, Anton Artemov wrote: >> Hi, please consider the following changes: >> >> This is a cleanup after https://github.com/openjdk/jdk/pull/26309 >> >> 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. >> >> 2) Added a missed brace in the error message. >> >> Trivial change. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8364819: Fixed brace in the test This looks fine to me, thanks! But @dholmes-ora should take another look, since he reviewed the original patch. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26656#pullrequestreview-3091955121 From duke at openjdk.org Wed Aug 6 10:48:43 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 6 Aug 2025 10:48:43 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 In-Reply-To: <1wywkzcGrfAM2m-1EGH6cKvocvHD6hvgKo8uALsi-M8=.e6f51458-30ba-449d-ad3b-0ce5ed37bbd1@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1wywkzcGrfAM2m-1EGH6cKvocvHD6hvgKo8uALsi-M8=.e6f51458-30ba-449d-ad3b-0ce5ed37bbd1@github.com> Message-ID: On Wed, 6 Aug 2025 10:37:32 GMT, Aleksey Shipilev wrote: > Also a closing parenthesis in test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java message matcher, I think? Thanks for spotting, yes, that one has to be changed. Done. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26656#issuecomment-3159472025 From rrich at openjdk.org Wed Aug 6 10:58:16 2025 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 6 Aug 2025 10:58:16 GMT Subject: RFR: 8360219: [AIX] assert(locals_base >= l2) failed: bad placement Message-ID: <6YAyMkyKnD_ipbw7yZMKciLoh5zEsiLeT78Pv4Crvvs=.b765ce7c-d210-4cdf-9047-6e4b18a81c98@github.com> Weaken assertion because it is too strict. While the interpreted caller sometimes has a `frame::top_ijava_frame_abi` it is sufficient to assert that the locals don't overlap with the smaller `frame::parent_ijava_frame_abi` because only that's reserved for non-top frames (aka `parent` frames). Tested on AIX and Linux ppc: Tier 1-4 of hotspot and jdk. All of langtools and jaxp. Renaissance Suite and SAP specific tests. Details: It cannot be assumed that the interpreted caller of the bottom interpreted frame (from a compiled deoptee frame) has a large `frame::top_ijava_frame_abi`. 
In an ordinary i2c call it would keep its `frame::top_ijava_frame_abi` (see [`call_from_interpreter`](https://github.com/openjdk/jdk/blob/67ba8b45dd632c40d5e6872d2a6ce24f86c22152/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp#L1217)) but when it was thawed then it'll have only a `frame::java_abi` (alias for parent_ijava_frame_abi). There are [diagrams](https://github.com/openjdk/jdk/blob/8a571ee7f2d9a46ff485fd9f3658c552e2d20817/src/hotspot/cpu/ppc/continuationFreezeThaw_ppc.inline.hpp#L395) commenting `ThawBase::new_stack_frame()` that show this. It's not easy to see it in the code though. Note that the [frame size](https://github.com/openjdk/jdk/blob/8a571ee7f2d9a46ff485fd9f3658c552e2d20817/src/hotspot/cpu/ppc/continuationFreezeThaw_ppc.inline.hpp#L496) is calculated relative to hf's unextended_sp which [includes frame::metadata_words](https://github.com/openjdk/jdk/blob/8a571ee7f2d9a46ff485fd9f3658c552e2d20817/src/hotspot/cpu/ppc/stackChunkFrameStream_ppc.inline.hpp#L80) which is the size of `java_abi`. ------------- Commit messages: - Fix comment - Weaken assertion. The bottom frame's caller can have the smaller parent_ijava_frame_abi. Changes: https://git.openjdk.org/jdk/pull/26643/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26643&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360219 Stats: 4 lines in 1 file changed: 1 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/26643.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26643/head:pull/26643 PR: https://git.openjdk.org/jdk/pull/26643 From cnorrbin at openjdk.org Wed Aug 6 11:07:42 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Wed, 6 Aug 2025 11:07:42 GMT Subject: RFR: 8352067: Remove the NMT treap and replace its uses with the utilities red-black tree [v2] In-Reply-To: References: Message-ID: > Hi everyone, > > The utilities red-black tree and the NMT treap serve similar functions. 
Given the red-black tree's versatility and stricter time complexity, the treap can be removed in favour of it. > > I made some modifications to the red-black tree to make it compatible with previous treap usages: > - Updated the `visit_in_order` and `visit_range_in_order` functions to require the supplied callback to return a bool, which allows us to stop traversing early. > - Improved const-correctness by ensuring that invoking these functions on a const reference provides const pointers to nodes, while non-const references provide mutable pointers. Previously the two functions behaved differently. > > Changes to NMT include: > - Modified components to align with the updated const-correctness of the red-black tree functions > - Renamed structures and variables to remove "treap" from their names to reflect the new tree > > The treap was also used in one place in C2. I changed this to use the red-black tree and its cursor interface, which I felt was most fitting for the use case. Casper Norrbin has updated the pull request incrementally with one additional commit since the last revision: feedback fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26655/files - new: https://git.openjdk.org/jdk/pull/26655/files/97107f5b..a495e291 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26655&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26655&range=00-01 Stats: 4 lines in 3 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/26655.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26655/head:pull/26655 PR: https://git.openjdk.org/jdk/pull/26655 From dholmes at openjdk.org Wed Aug 6 12:01:04 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Aug 2025 12:01:04 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v2] In-Reply-To: References: <_Ye19u_7PlqlsoRSuR0dNeAGbeuHyN_oqD1ZS4q9Nvk=.b94fd29d-d43e-4561-9926-7f5a46434d8e@github.com> 
<1YTjfZh0C7UWqTCXWcNS-Q68kBOnhlRJA_-sS_vZsio=.6389ba30-8f3a-48c2-b0e2-40f5877f4a36@github.com> Message-ID: On Wed, 6 Aug 2025 08:31:26 GMT, Manuel Hässig wrote: >> The signal is delivered to the same compiler thread, so I don't think acquire and release help here. There shouldn't be a weak memory model issue between a thread and its signal handler on the same thread. Can you explain the sequence of events that could cause a false positive? > > As far as I understand weak memory models, the concept of threads, which the CPU does not know of, is irrelevant. The CPU is merely free to reorder a subset of memory operations. The reason I added the acquire/release semantics is to prevent the reordering of a write to `_timeout_armed` that happened before a read from the same in the signal handler. This is only relevant when a compilation is done and the timeout is disarmed just before the timer fires. > > Granted, this is an edge case that does not create a huge amount of false positives. Also, it might be entirely unnecessary due to the OS doing some work to call the signal handler in-between the write and the read of the value. It is merely conservative programming, so I am fine with dropping the barriers. That is not an accurate characterisation IMO - a memory model defines the consistency of the views of memory by different threads of execution. Any given thread of execution must always have a self-consistent view of memory, and neither the compiler nor the CPU is allowed to violate that if code is actually going to be guaranteed to work correctly. If the signal is being handled in the same thread that is setting the flag then you cannot have a memory ordering issue. And acquire/release is not the right tool just to establish an ordering: you would generally want a storestore barrier on the writer side, and a loadload barrier on the reader side.
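The same-thread argument above can be illustrated with a small standalone sketch. It is not the HotSpot code: `timeout_armed`, `timeouts_seen`, `on_timeout`, `set_armed_and_signal` and `run_demo` are hypothetical names, and `std::raise` stands in for the POSIX timer so that delivery is deterministic. It assumes a platform where `std::raise` runs the handler on the calling thread before returning, and it uses `std::atomic_signal_fence`, which only constrains the compiler, as the only ordering device between the thread and its own handler.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Hypothetical stand-ins for the flag discussed in the review; not HotSpot code.
static volatile std::sig_atomic_t timeout_armed = 0;
static volatile std::sig_atomic_t timeouts_seen = 0;

// Runs on the thread that called std::raise(), i.e. the same thread that
// writes timeout_armed.
extern "C" void on_timeout(int /*signo*/) {
  if (timeout_armed != 0) {
    timeouts_seen = timeouts_seen + 1;
  }
}

// Arms or disarms the flag, then delivers the signal to the current thread.
static void set_armed_and_signal(int armed) {
  std::signal(SIGTERM, on_timeout);  // (re)install: some platforms reset the handler
  timeout_armed = armed;
  // Compiler-only fence: orders the store above against a signal handler
  // executing on this same thread. No inter-thread barrier is requested.
  std::atomic_signal_fence(std::memory_order_seq_cst);
  std::raise(SIGTERM);  // the handler has returned by the time raise() returns
}

int run_demo() {
  timeouts_seen = 0;
  set_armed_and_signal(1);     // armed: the handler counts one timeout
  assert(timeouts_seen == 1);
  set_armed_and_signal(0);     // disarmed: the handler sees 0 and does nothing
  std::signal(SIGTERM, SIG_DFL);
  return timeouts_seen;
}
```

In this sketch the handler always observes the most recent write to the flag, because both run in one thread of execution. A real timer signal arrives asynchronously, so the sketch only shows the ordering argument, not the timing race between disarming and the timer firing.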
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26023#discussion_r2256920642 From dholmes at openjdk.org Wed Aug 6 12:11:10 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Aug 2025 12:11:10 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v24] In-Reply-To: References: <7685_Xh8MjRqxoEx2BWfpTcdPXq5-aDuo0XZfTcco5w=.00416a73-5237-4381-a79d-8973f47b19ed@github.com> Message-ID: On Wed, 6 Aug 2025 09:22:33 GMT, Anton Artemov wrote: >> The assert should go after the syscall that we don't expect to fail, but which we have to tolerate the possibility of failing. That is the place where we can also report why the call failed. >> >> I'm not sure what you mean about changing the usage pattern. > > What I meant is that in this PR we are focusing on adding a mechanism to indicate that something failed, not on reporting why it failed. > > Addressed as suggested. If we are going to assert on failure then we should report as much information about the failure as possible. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2256943006 From dholmes at openjdk.org Wed Aug 6 12:11:11 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Aug 2025 12:11:11 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v25] In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 09:22:12 GMT, Anton Artemov wrote: >> src/hotspot/os/windows/os_windows.cpp line 869: >> >>> 867: return true; >>> 868: } else { >>> 869: errno = ::GetLastError(); >> >> Why are you setting `errno` here and elsewhere? I see a couple of places where we do that, though I do not know why, but generally we just want to save the `::GetLastError()` value (here we do need a local) and print it. > > You are correct, there is no relationship between `errno` and `GetLastError()`. Addressed. > > I created a separate issue to fix this inconsistency in other places.
I had forgotten that this oddity was also raised in the review of [JDK-8350869](https://bugs.openjdk.org/browse/JDK-8350869). If it is part of the `os::xxx` function's specification to set `errno` on failure then we would need to do something to achieve that (which I think was the intended case with something like `os::stat`), though as previously noted the sets of allowed values for `errno` and `GetLastError` are not the same. But the APIs in question here are not specified that way. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2256937209 From mhaessig at openjdk.org Wed Aug 6 12:26:04 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Wed, 6 Aug 2025 12:26:04 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v2] In-Reply-To: References: <_Ye19u_7PlqlsoRSuR0dNeAGbeuHyN_oqD1ZS4q9Nvk=.b94fd29d-d43e-4561-9926-7f5a46434d8e@github.com> <1YTjfZh0C7UWqTCXWcNS-Q68kBOnhlRJA_-sS_vZsio=.6389ba30-8f3a-48c2-b0e2-40f5877f4a36@github.com> Message-ID: On Wed, 6 Aug 2025 11:58:22 GMT, David Holmes wrote: >> As far as I understand weak memory models, the concept of threads, which the CPU does not know of, is irrelevant. The CPU is merely free to reorder a subset of memory operations. The reason I added the acquire/release semantics is to prevent the reordering of a write to `_timeout_armed` that happened before a read from the same in the signal handler. This is only relevant when a compilation is done and the timeout is disarmed just before the timer fires. >> >> Granted, this is an edge case that does not create a huge amount of false positives. Also, it might be entirely unnecessary due to the OS doing some work to call the signal handler in-between the write and the read of the value. It is merely conservative programming, so I am fine with dropping the barriers.
> > That is not an accurate characterisation IMO - a memory model defines the consistency of the views of memory by different threads of execution. Any given thread of execution must always have a self-consistent view of memory, and neither the compiler or the CPU is allowed to violate that if code is actually going to be guaranteed to work correctly. If the signal is being handled in the same thread that is setting the flag then you cannot have a memory ordering issue. > > And acquire/release is not the right tool just to establish an ordering: you would generally want a storestore barrier on the writer side, and a loadload barrier on the reader side. Ah, I see my mistake. 1) I did not think about the C++ memory model, and 2) I thought of the signal handler as a different thread, which made me a bit paranoid. Thank you both for pointing out the mistakes in my thinking. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26023#discussion_r2256979751 From dholmes at openjdk.org Wed Aug 6 12:38:06 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Aug 2025 12:38:06 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> Message-ID: <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> On Wed, 6 Aug 2025 10:48:43 GMT, Anton Artemov wrote: >> Hi, please consider the following changes: >> >> This is a cleanup after https://github.com/openjdk/jdk/pull/26309 >> >> 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. >> >> 2) Added a missed brace in the error message. >> >> Trivial change. 
> > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8364819: Fixed brace in the test Thanks for catching these @shipilev . I have to confess I thought there was a reason we stored the thread as just an id rather than a true pointer, but I was mistaken. @toxaart thanks for the speedy follow up here. I have one other nit I just noticed - @shipilev may want to weigh in on it too. src/hotspot/share/utilities/vmError.cpp line 108: > 106: const intptr_t VMError::segfault_address = pd_segfault_address; > 107: volatile Thread* VMError::_handshake_timed_out_thread = nullptr; > 108: volatile Thread* VMError::_safepoint_timed_out_thread = nullptr; I should have picked this up previously but `volatile` generally serves no purpose for inter-thread consistency. If we need to guarantee visibility between the signaller and signalee, then we need a fence to ensure that. Or we can skip the fence (and volatile) and note that this will usually, but perhaps not always, work fine. (I note that `pthread_kill` is not specified as a memory synchronizing operation, but I strongly suspect it has to have its own internal memory barriers that we would be piggy-backing on.) 
------------- PR Review: https://git.openjdk.org/jdk/pull/26656#pullrequestreview-3092261002 PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2256961025 From shade at openjdk.org Wed Aug 6 13:02:04 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 6 Aug 2025 13:02:04 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> Message-ID: <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> On Wed, 6 Aug 2025 12:15:57 GMT, David Holmes wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8364819: Fixed brace in the test > > src/hotspot/share/utilities/vmError.cpp line 108: > >> 106: const intptr_t VMError::segfault_address = pd_segfault_address; >> 107: volatile Thread* VMError::_handshake_timed_out_thread = nullptr; >> 108: volatile Thread* VMError::_safepoint_timed_out_thread = nullptr; > > I should have picked this up previously but `volatile` generally serves no purpose for inter-thread consistency. If we need to guarantee visibility between the signaller and signalee, then we need a fence to ensure that. Or we can skip the fence (and volatile) and note that this will usually, but perhaps not always, work fine. (I note that `pthread_kill` is not specified as a memory synchronizing operation, but I strongly suspect it has to have its own internal memory barriers that we would be piggy-backing on.) 
I am pretty sure the memory consistency here is irrelevant, given that: a) the `Thread` in question has likely been initialized for a long time, so it is unlikely we will lose anything release-acquire-wise; b) we only use these for pointer comparisons. But since we are here, and in the interest of clarity and avoiding future surprises, we can just summarily wrap these with `Atomic::release_store` and `Atomic::load_acquire` to be extra safe. This is a failing path, so we don't care about performance, and would like to avoid a secondary crash in the error handler some time in the future, if anyone reads anything from these threads. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2257071709 From mhaessig at openjdk.org Wed Aug 6 13:19:33 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Wed, 6 Aug 2025 13:19:33 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v3] In-Reply-To: References: Message-ID: > This PR adds `-XX:CompileTaskTimeout` on Linux to limit the amount of time a compilation task can run. The goal of this is initially to be able to find and investigate long-running compilations. > > The timeout is implemented using a POSIX timer that sends a `SIGALRM` to the compiler thread the compile task is running on. Each compiler thread registers a signal handler that triggers an assert upon receiving `SIGALRM`. This is currently only implemented for Linux, because it relies on `SIGEV_THREAD_ID` to get the signal delivered to the same thread that timed out. > > Since `SIGALRM` is now used, the test `runtime/signal/TestSigalrm.java` now requires `vm.flagless` so it will not interfere with the compiler thread signal handlers.
> > Testing: > - [ ] GitHub Actions > - [ ] tier1, tier2 on all platforms > - [ ] tier3, tier4 and Oracle internal testing on Linux fastdebug > - [ ] tier1 through tier4 with `-XX:CompileTaskTimeout=60000` (one minute timeout) to see what fails (`compiler/codegen/TestAntiDependenciesHighMemUsage2.java`, `compiler/loopopts/TestMaxLoopOptsCountReached.java`, and `compiler/c2/TestScalarReplacementMaxLiveNodes.java` fail) Manuel Hässig has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: - Merge branch 'master' into JDK-8308094-timeout - No acquire release semantics - Factor Linux specific timeout functionality out of share/ - Move timeout disarm above if - Merge branch 'master' into JDK-8308094-timeout - Fix SIGALRM test - Add timeout functionality to compiler threads ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26023/files - new: https://git.openjdk.org/jdk/pull/26023/files/5840cc2e..d50231f9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26023&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26023&range=01-02 Stats: 67038 lines in 1707 files changed: 39117 ins; 18664 del; 9257 mod Patch: https://git.openjdk.org/jdk/pull/26023.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26023/head:pull/26023 PR: https://git.openjdk.org/jdk/pull/26023 From mhaessig at openjdk.org Wed Aug 6 13:19:34 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Wed, 6 Aug 2025 13:19:34 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v2] In-Reply-To: References: <_Ye19u_7PlqlsoRSuR0dNeAGbeuHyN_oqD1ZS4q9Nvk=.b94fd29d-d43e-4561-9926-7f5a46434d8e@github.com> Message-ID: On Tue, 5 Aug 2025 04:04:01 GMT, Dean Long wrote: >> Manuel Hässig has updated the pull request with a new target base due to a
merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8308094-timeout >> - Fix SIGALRM test >> - Add timeout functionality to compiler threads > > This looks correct, but would it be possible to move the Linux-specific code out of src/hotspot/share? Thank you for reviewing this PR, @dean-long. For v2 I factored the Linux-specific code out of hotspot/share, moved the disarm above the if, and removed the acquire/release. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26023#issuecomment-3160140787 From asmehra at openjdk.org Wed Aug 6 14:05:08 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Wed, 6 Aug 2025 14:05:08 GMT Subject: RFR: 8364128: Improve gathering of cpu feature names using stringStream [v5] In-Reply-To: References: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com> Message-ID: On Tue, 5 Aug 2025 18:08:22 GMT, Ashutosh Mehra wrote: >> This PR implements the code for cpu features names string using ~FormatBuffer~ stringStream. ~It also improves and extends FormatBuffer with an additional method that appends comma separated strings to the buffer.~ It also adds a method to stringStream that that appends separated strings separated by a delimiter to the buffer. This is useful in creating the cpu features names string. >> This code will also be useful in Leyden to implement cpu feature check for AOTCodeCache [0]. >> Platforms affected: x86-64 and aarch64 >> Other platforms can be done if and when Leyden changes are ported to them. >> >> [0] https://github.com/openjdk/leyden/pull/84 > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments > > Signed-off-by: Ashutosh Mehra Can I get one more review of this please. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26515#issuecomment-3160311221 From duke at openjdk.org Wed Aug 6 14:06:19 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 6 Aug 2025 14:06:19 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v3] In-Reply-To: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Message-ID: > Hi, please consider the following changes: > > This is a cleanup after https://github.com/openjdk/jdk/pull/26309 > > 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. > > 2) Added a missed brace in the error message. > > Trivial change. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8364819: Made handling of timed out thread safe ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26656/files - new: https://git.openjdk.org/jdk/pull/26656/files/8c2915bb..594654e4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26656&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26656&range=01-02 Stats: 14 lines in 2 files changed: 10 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/26656.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26656/head:pull/26656 PR: https://git.openjdk.org/jdk/pull/26656 From duke at openjdk.org Wed Aug 6 14:06:21 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 6 Aug 2025 14:06:21 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> 
<1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> Message-ID: On Wed, 6 Aug 2025 12:15:57 GMT, David Holmes wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8364819: Fixed brace in the test > > src/hotspot/share/utilities/vmError.cpp line 108: > >> 106: const intptr_t VMError::segfault_address = pd_segfault_address; >> 107: volatile Thread* VMError::_handshake_timed_out_thread = nullptr; >> 108: volatile Thread* VMError::_safepoint_timed_out_thread = nullptr; > > I should have picked this up previously but `volatile` generally serves no purpose for inter-thread consistency. If we need to guarantee visibility between the signaller and signalee, then we need a fence to ensure that. Or we can skip the fence (and volatile) and note that this will usually, but perhaps not always, work fine. (I note that `pthread_kill` is not specified as a memory synchronizing operation, but I strongly suspect it has to have its own internal memory barriers that we would be piggy-backing on.) Thanks @dholmes-ora and @shipilev, I addressed the possible issue with atomic commands. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2257292348 From shade at openjdk.org Wed Aug 6 14:14:09 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 6 Aug 2025 14:14:09 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v3] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Message-ID: On Wed, 6 Aug 2025 14:06:19 GMT, Anton Artemov wrote: >> Hi, please consider the following changes: >> >> This is a cleanup after https://github.com/openjdk/jdk/pull/26309 >> >> 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. >> >> 2) Added a missed brace in the error message. >> >> Trivial change. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8364819: Made handling of timed out thread safe Comments... src/hotspot/share/utilities/vmError.hpp line 229: > 227: static void set_safepoint_timed_out_thread(Thread* thread); > 228: static volatile Thread* get_handshake_timed_out_thread(); > 229: static volatile Thread* get_safepoint_timed_out_thread(); `static volatile` for method declarations makes no real sense? Just `static` would suffice. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26656#pullrequestreview-3092817846 PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2257319054 From duke at openjdk.org Wed Aug 6 14:21:04 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 6 Aug 2025 14:21:04 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v3] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Message-ID: On Wed, 6 Aug 2025 14:11:22 GMT, Aleksey Shipilev wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8364819: Made handling of timed out thread safe > > src/hotspot/share/utilities/vmError.hpp line 229: > >> 227: static void set_safepoint_timed_out_thread(Thread* thread); >> 228: static volatile Thread* get_handshake_timed_out_thread(); >> 229: static volatile Thread* get_safepoint_timed_out_thread(); > > `static volatile` for method declarations makes no real sense? Just `static` would suffice. If the variable is declared volatile, the same should be done with the getter, otherwise GCC complains about conversion: error: invalid conversion from 'volatile Thread*' to 'Thread*' ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2257345662 From kvn at openjdk.org Wed Aug 6 15:04:14 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 6 Aug 2025 15:04:14 GMT Subject: RFR: 8364128: Improve gathering of cpu feature names using stringStream [v5] In-Reply-To: References: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com> Message-ID: On Tue, 5 Aug 2025 18:08:22 GMT, Ashutosh Mehra wrote: >> This PR implements the code for cpu features names string using ~FormatBuffer~ stringStream. 
~It also improves and extends FormatBuffer with an additional method that appends comma separated strings to the buffer.~ It also adds a method to stringStream that that appends separated strings separated by a delimiter to the buffer. This is useful in creating the cpu features names string. >> This code will also be useful in Leyden to implement cpu feature check for AOTCodeCache [0]. >> Platforms affected: x86-64 and aarch64 >> Other platforms can be done if and when Leyden changes are ported to them. >> >> [0] https://github.com/openjdk/leyden/pull/84 > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments > > Signed-off-by: Ashutosh Mehra I submitted our testing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26515#issuecomment-3160539796 From kvn at openjdk.org Wed Aug 6 15:19:08 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 6 Aug 2025 15:19:08 GMT Subject: RFR: 8364558: Failure to generate compiler stubs from compiler thread should not crash VM when compilation disabled due to full CodeCache [v2] In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 08:50:57 GMT, Andrew Dinn wrote: >> This PR avoids bringing down the VM when a code cache allocation for the stubgen compiler blob is requested by the compiler thread and fails because the code cache is full. Instead the VM is allowed to carry on with the compiler disabled. > > Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: > > minimize changed lines and align log messages Looks good. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26642#pullrequestreview-3093156997 From shade at openjdk.org Wed Aug 6 15:04:04 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 6 Aug 2025 15:04:04 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v3] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Message-ID: On Wed, 6 Aug 2025 14:18:48 GMT, Anton Artemov wrote: >> src/hotspot/share/utilities/vmError.hpp line 229: >> >>> 227: static void set_safepoint_timed_out_thread(Thread* thread); >>> 228: static volatile Thread* get_handshake_timed_out_thread(); >>> 229: static volatile Thread* get_safepoint_timed_out_thread(); >> >> `static volatile` for method declarations makes no real sense? Just `static` would suffice. > > If the variable is declared volatile, the same should be done with the getter, otherwise GCC complains about conversion: > > error: invalid conversion from 'volatile Thread*' to 'Thread*' Oh, that's because you probably want the: static Thread* volatile _handshake_timed_out_thread; ...not: static volatile Thread* _handshake_timed_out_thread; I.e. the volatility is about the _field_, not about the _type_. 
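The distinction drawn above between a volatile field and a volatile pointee can be checked with a small standalone sketch. It assumes nothing about HotSpot: `Thread` here is an empty stand-in, and `Holder`, `pointee_is_volatile`, `field_is_volatile`, `get_field` and `get_pointee` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct Thread {};  // hypothetical stand-in, not HotSpot's Thread class

struct Holder {
  static volatile Thread* pointee_is_volatile;  // pointer to a volatile Thread
  static Thread* volatile field_is_volatile;    // volatile pointer to a plain Thread
};
volatile Thread* Holder::pointee_is_volatile = nullptr;
Thread* volatile Holder::field_is_volatile = nullptr;

// Reading the volatile field yields a plain Thread*, so the getter needs no
// qualifier on its return type.
Thread* get_field() { return Holder::field_is_volatile; }

// Here the qualifier belongs to the pointed-to type, so it must appear in the
// return type; returning Thread* would be an invalid conversion.
volatile Thread* get_pointee() { return Holder::pointee_is_volatile; }

// The field itself is volatile only in the second declaration.
static_assert(std::is_volatile<decltype(Holder::field_is_volatile)>::value,
              "the pointer object is volatile");
static_assert(!std::is_volatile<decltype(Holder::pointee_is_volatile)>::value,
              "the pointer object is not volatile");
static_assert(std::is_volatile<
                  std::remove_pointer<decltype(Holder::pointee_is_volatile)>::type>::value,
              "the pointed-to type is volatile");
```

With the `Thread* volatile` form, the getter can return a plain `Thread*`, which is consistent with the conversion error quoted earlier going away once the qualifier is moved to the field.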
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2257487900 From duke at openjdk.org Wed Aug 6 15:20:44 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 6 Aug 2025 15:20:44 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v4] In-Reply-To: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Message-ID: <39aBWZQQdrwY0PTHhASEW1vELD_g4OMDAJ62mUGObj4=.82f28cb9-83d3-4511-b081-1e998418dd5b@github.com> > Hi, please consider the following changes: > > This is a cleanup after https://github.com/openjdk/jdk/pull/26309 > > 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. > > 2) Added a missed brace in the error message. > > Trivial change. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8364819: Fixed volatility issue on getters. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/26656/files - new: https://git.openjdk.org/jdk/pull/26656/files/594654e4..1cac9daa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26656&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26656&range=02-03 Stats: 8 lines in 2 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/26656.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26656/head:pull/26656 PR: https://git.openjdk.org/jdk/pull/26656 From duke at openjdk.org Wed Aug 6 15:20:44 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 6 Aug 2025 15:20:44 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v3] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Message-ID: On Wed, 6 Aug 2025 15:01:23 GMT, Aleksey Shipilev wrote: >> If the variable is declared volatile, the same should be done with the getter, otherwise GCC complains about conversion: >> >> error: invalid conversion from 'volatile Thread*' to 'Thread*' > > Oh, that's because you probably want the: > > > static Thread* volatile _handshake_timed_out_thread; > > > ...not: > > > static volatile Thread* _handshake_timed_out_thread; > > > I.e. the volatility is about the _field_, not about the _type_. Right, noted! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2257530740 From shade at openjdk.org Wed Aug 6 18:19:17 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 6 Aug 2025 18:19:17 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v4] In-Reply-To: <39aBWZQQdrwY0PTHhASEW1vELD_g4OMDAJ62mUGObj4=.82f28cb9-83d3-4511-b081-1e998418dd5b@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <39aBWZQQdrwY0PTHhASEW1vELD_g4OMDAJ62mUGObj4=.82f28cb9-83d3-4511-b081-1e998418dd5b@github.com> Message-ID: On Wed, 6 Aug 2025 15:20:44 GMT, Anton Artemov wrote: >> Hi, please consider the following changes: >> >> This is a cleanup after https://github.com/openjdk/jdk/pull/26309 >> >> 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. >> >> 2) Added a missed brace in the error message. >> >> Trivial change. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8364819: Fixed volatility issue on getters. Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26656#pullrequestreview-3093723210 From pchilanomate at openjdk.org Wed Aug 6 18:22:54 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 6 Aug 2025 18:22:54 GMT Subject: RFR: 8359222: [asan] jvmti/vthread/ToggleNotifyJvmtiTest/ToggleNotifyJvmtiTest triggers stack-buffer-overflow error Message-ID: Please review the following small change. On linux-aarch64, ASan reports an overflow error on a read that occurs when frames are copied from the stack to the heap, specifically during preemption on monitorenter and Object.wait when coming from compiled code.
This read is benign and comes from reading either all (freeze fast path case) or part (freeze slow path case) of the `frame::metadata_words_at_bottom` stored in the callee frame (return pc+fp) for the top frame. In these preemption cases the callee frame is the frame of the VM native method being called (e.g. `Runtime1::monitorenter`). Now, the Arm 64-bit ABI doesn't specify a location where the frame record (return pc+fp) has to be stored within a stack frame and GCC currently chooses to save it at the top of the frame (lowest address). So this causes ASan to treat the memory access in the callee as an overflow access to one of the locals stored in that frame. Accessing the callee to read `frame::metadata_words_at_bottom` is only necessary when the top frame is compiled. In that case, the callee is `Continuation.doYield` and those words contain the return pc and fp of the top frame we are freezing, where fp might contain an oop. For the preemption cases mentioned above though, the return pc used for the top frame is always the one from the anchor, not the stack. Also, there is never a callee-saved value in fp that needs to be preserved; the fp is always constructed again when thawing the frame. But because the code path is shared with the compiled frame case, we also read these words. I've added the full details on where these memory accesses occur in the freeze code, both slow and fast paths, in the bug comments. The fix is to adjust these shared paths to avoid the read in the preemption case. I was able to reproduce the issue and have verified it is now fixed. I also ran the patch through mach5 tiers1-7.
Thanks, Patricio ------------- Commit messages: - v1 Changes: https://git.openjdk.org/jdk/pull/26660/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26660&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359222 Stats: 32 lines in 4 files changed: 20 ins; 0 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/26660.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26660/head:pull/26660 PR: https://git.openjdk.org/jdk/pull/26660 From rriggs at openjdk.org Wed Aug 6 18:52:16 2025 From: rriggs at openjdk.org (Roger Riggs) Date: Wed, 6 Aug 2025 18:52:16 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v4] In-Reply-To: References: Message-ID: <0_Cs4qRmBwmfZE8qrUNZaRCtUfWqarRBdrzZjGEZvLM=.9a0b66f9-9a12-4296-9c90-304080e72886@github.com> On Tue, 5 Aug 2025 16:04:08 GMT, Chen Liang wrote: >> Provide a general facility for our null check APIs like Objects::requireNonNull or future Checks::nullCheck (void), converting the existing infrastructure to start tracking from a given stack site (depth offset) and a given stack slot (offset value). > > Chen Liang has updated the pull request incrementally with one additional commit since the last revision: > > Update NPE per roger review A hotspot and 2 core-libs reviewers should have a chance to comment. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26600#issuecomment-3161228199 From rriggs at openjdk.org Wed Aug 6 18:52:17 2025 From: rriggs at openjdk.org (Roger Riggs) Date: Wed, 6 Aug 2025 18:52:17 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v4] In-Reply-To: References: <7f8HeMWpFC1Y1JWDmdi5lMU0kgT9z-yQFLDOXuSUucU=.40edcd18-9311-41a1-b007-81bf4a11bd7d@github.com> Message-ID: On Tue, 5 Aug 2025 15:01:49 GMT, Chen Liang wrote: >> Don't add APIs without uses; it speculative and the API may not be what is really needed. 
> > Should I open a dependent PR to prove that this is exactly what we need to unblock [Objects::requireNonNull message enhancements](https://bugs.openjdk.org/browse/JDK-8233268)? Please update the description to note this is a prerequisite for: [JDK-8233268](https://bugs.openjdk.org/browse/JDK-8233268) Improve integration of Objects.requireNonNull and JEP 358 Helpful NPE ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2257996201 From amenkov at openjdk.org Wed Aug 6 19:12:31 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 6 Aug 2025 19:12:31 GMT Subject: RFR: 8361103: java_lang_Thread::async_get_stack_trace does not properly protect JavaThread [v8] In-Reply-To: References: Message-ID: <733zJmz4QYLOKTZ5lU1rp13UvWeJaQ7fesb5cBmLFr8=.8a9f7720-fec8-4c84-8cd7-7b04c6b74dd2@github.com> > The fix updates `java_lang_Thread::async_get_stack_trace()` (used by `java.lang.Thread.getStackTrace()` to get stack trace for platform and mounted virtual threads) to correctly use `ThreadListHandle` for thread protection. 
> > Testing: tier1..5 Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: Update src/hotspot/share/classfile/javaClasses.cpp Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26119/files - new: https://git.openjdk.org/jdk/pull/26119/files/811db312..3cc69d8f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26119&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26119&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26119.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26119/head:pull/26119 PR: https://git.openjdk.org/jdk/pull/26119 From dholmes at openjdk.org Wed Aug 6 21:06:16 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Aug 2025 21:06:16 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v4] In-Reply-To: <39aBWZQQdrwY0PTHhASEW1vELD_g4OMDAJ62mUGObj4=.82f28cb9-83d3-4511-b081-1e998418dd5b@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <39aBWZQQdrwY0PTHhASEW1vELD_g4OMDAJ62mUGObj4=.82f28cb9-83d3-4511-b081-1e998418dd5b@github.com> Message-ID: On Wed, 6 Aug 2025 15:20:44 GMT, Anton Artemov wrote: >> Hi, please consider the following changes: >> >> This is a cleanup after https://github.com/openjdk/jdk/pull/26309 >> >> 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. >> >> 2) Added a missed brace in the error message. >> >> Trivial change. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8364819: Fixed volatility issue on getters. Changes requested by dholmes (Reviewer). 
src/hotspot/share/utilities/vmError.cpp line 108: > 106: const intptr_t VMError::segfault_address = pd_segfault_address; > 107: Thread* volatile VMError::_handshake_timed_out_thread = nullptr; > 108: Thread* volatile VMError::_safepoint_timed_out_thread = nullptr; `volatile` is still not useful or needed here. src/hotspot/share/utilities/vmError.cpp line 1346: > 1344: > 1345: void VMError::set_safepoint_timed_out_thread(Thread* thread) { > 1346: Atomic::release_store(&_safepoint_timed_out_thread, thread); Acquire/release is not what is needed here. You are not trying to synchronize with previous memory accesses. If you want to force this to be visible to the other thread a `fence()` is required. ------------- PR Review: https://git.openjdk.org/jdk/pull/26656#pullrequestreview-3094270934 PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2258292221 PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2258295173 From dlong at openjdk.org Thu Aug 7 01:14:25 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 7 Aug 2025 01:14:25 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v3] In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 13:19:33 GMT, Manuel Hässig wrote: >> This PR adds `-XX:CompileTaskTimeout` on Linux to limit the amount of time a compilation task can run. The goal of this is initially to be able to find and investigate long-running compilations. >> >> The timeout is implemented using a POSIX timer that sends a `SIGALRM` to the compiler thread the compile task is running on. Each compiler thread registers a signal handler that triggers an assert upon receiving `SIGALRM`. This is currently only implemented for Linux, because it relies on `SIGEV_THREAD_ID` to get the signal delivered to the same thread that timed out. 
>> >> Since `SIGALRM` is now used, the test `runtime/signal/TestSigalrm.java` now requires `vm.flagless` so it will not interfere with the compiler thread signal handlers. >> >> Testing: >> - [ ] Github Actions >> - [ ] tier1, tier2 on all platforms >> - [ ] tier3, tier4 and Oracle internal testing on Linux fastdebug >> - [ ] tier1 through tier4 with `-XX:CompileTaskTimeout=60000` (one minute timeout) to see what fails (`compiler/codegen/TestAntiDependenciesHighMemUsage2.java`, `compiler/loopopts/TestMaxLoopOptsCountReached.java`, and `compiler/c2/TestScalarReplacementMaxLiveNodes.java` fail) > > Manuel Hässig has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: > > - Merge branch 'master' into JDK-8308094-timeout > - No acquire release semantics > - Factor Linux specific timeout functionality out of share/ > - Move timeout disarm above if > - Merge branch 'master' into JDK-8308094-timeout > - Fix SIGALRM test > - Add timeout functionality to compiler threads Looks good. However, you might want to simply remove _timeout_armed, or put it inside a #ifdef ASSERT, since it is only used in an assert. ------------- Marked as reviewed by dlong (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26023#pullrequestreview-3094752418 From dlong at openjdk.org Thu Aug 7 02:00:13 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 7 Aug 2025 02:00:13 GMT Subject: RFR: 8360219: [AIX] assert(locals_base >= l2) failed: bad placement In-Reply-To: <6YAyMkyKnD_ipbw7yZMKciLoh5zEsiLeT78Pv4Crvvs=.b765ce7c-d210-4cdf-9047-6e4b18a81c98@github.com> References: <6YAyMkyKnD_ipbw7yZMKciLoh5zEsiLeT78Pv4Crvvs=.b765ce7c-d210-4cdf-9047-6e4b18a81c98@github.com> Message-ID: On Tue, 5 Aug 2025 14:53:30 GMT, Richard Reingruber wrote: > Weaken assertion because it is too strict. 
While the interpreted caller sometimes has a `frame::top_ijava_frame_abi` it is sufficient to assert that the locals don't overlap with the smaller `frame::parent_ijava_frame_abi` because only that's reserved for non-top frames (aka `parent` frames). > > Tested on AIX and Linux ppc: Tier 1-4 of hotspot and jdk. All of langtools and jaxp. Renaissance Suite and SAP specific tests. > > Details: > > It cannot be assumed that the interpreted caller of the bottom interpreted frame (from a compiled deoptee frame) has a large `frame::top_ijava_frame_abi`. In an ordinary i2c call it would keep its `frame::top_ijava_frame_abi` (see [`call_from_interpreter`](https://github.com/openjdk/jdk/blob/67ba8b45dd632c40d5e6872d2a6ce24f86c22152/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp#L1217)) but when it was thawed then it'll have only a `frame::java_abi` (alias for parent_ijava_frame_abi). > > There are [diagrams](https://github.com/openjdk/jdk/blob/8a571ee7f2d9a46ff485fd9f3658c552e2d20817/src/hotspot/cpu/ppc/continuationFreezeThaw_ppc.inline.hpp#L395) commenting `ThawBase::new_stack_frame()` that show this. > > It's not easy to see it in the code though. Note that the [frame size](https://github.com/openjdk/jdk/blob/8a571ee7f2d9a46ff485fd9f3658c552e2d20817/src/hotspot/cpu/ppc/continuationFreezeThaw_ppc.inline.hpp#L496) is calculated relative to hf's unextended_sp which [includes frame::metadata_words](https://github.com/openjdk/jdk/blob/8a571ee7f2d9a46ff485fd9f3658c552e2d20817/src/hotspot/cpu/ppc/stackChunkFrameStream_ppc.inline.hpp#L80) which is the size of `java_abi`. It seems like relaxing the assert would allow us to silently overwrite part of a frame::top_ijava_frame_abi, which is only harmless if we are guaranteed to always treat that area as the smaller "parent" ABI past this point. Is there any way to determine if the ABI frame flavor is "top" or "parent" without adding something like a new "abi_frame_type" slot? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26643#issuecomment-3162163970 From kvn at openjdk.org Thu Aug 7 05:34:16 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 7 Aug 2025 05:34:16 GMT Subject: RFR: 8364128: Improve gathering of cpu feature names using stringStream [v5] In-Reply-To: References: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com> Message-ID: On Tue, 5 Aug 2025 18:08:22 GMT, Ashutosh Mehra wrote: >> This PR implements the code for cpu features names string using ~FormatBuffer~ stringStream. ~It also improves and extends FormatBuffer with an additional method that appends comma separated strings to the buffer.~ It also adds a method to stringStream that appends strings separated by a delimiter to the buffer. This is useful in creating the cpu features names string. >> This code will also be useful in Leyden to implement cpu feature check for AOTCodeCache [0]. >> Platforms affected: x86-64 and aarch64 >> Other platforms can be done if and when Leyden changes are ported to them. >> >> [0] https://github.com/openjdk/leyden/pull/84 > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments > > Signed-off-by: Ashutosh Mehra My testing passed. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26515#pullrequestreview-3095366477 From jsjolen at openjdk.org Thu Aug 7 07:15:15 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 7 Aug 2025 07:15:15 GMT Subject: RFR: 8352067: Remove the NMT treap and replace its uses with the utilities red-black tree [v2] In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 11:07:42 GMT, Casper Norrbin wrote: >> Hi everyone, >> >> The utilities red-black tree and the NMT treap serve similar functions. 
Given the red-black tree's versatility and stricter time complexity, the treap can be removed in favour of it. >> >> I made some modifications to the red-black tree to make it compatible with previous treap usages: >> - Updated the `visit_in_order` and `visit_range_in_order` functions to require the supplied callback to return a bool, which allows us to stop traversing early. >> - Improved const-correctness by ensuring that invoking these functions on a const reference provides const pointers to nodes, while non-const references provide mutable pointers. Previously the two functions behaved differently. >> >> Changes to NMT include: >> - Modified components to align with the updated const-correctness of the red-black tree functions >> - Renamed structures and variables to remove "treap" from their names to reflect the new tree >> >> The treap was also used in one place in C2. I changed this to use the red-black tree and its cursor interface, which I felt was most fitting for the use case. >> >> Testing: >> - Oracle tiers 1-3 > > Casper Norrbin has updated the pull request incrementally with one additional commit since the last revision: > > feedback fixes Still OK ------------- Marked as reviewed by jsjolen (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26655#pullrequestreview-3095690422 From duke at openjdk.org Thu Aug 7 07:34:50 2025 From: duke at openjdk.org (Anton Artemov) Date: Thu, 7 Aug 2025 07:34:50 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v5] In-Reply-To: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Message-ID: > Hi, please consider the following changes: > > This is a cleanup after https://github.com/openjdk/jdk/pull/26309 > > 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. > > 2) Added a missed brace in the error message. > > Trivial change. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8364819: Addressed reviewer's comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26656/files - new: https://git.openjdk.org/jdk/pull/26656/files/1cac9daa..094eb14a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26656&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26656&range=03-04 Stats: 10 lines in 2 files changed: 2 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/26656.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26656/head:pull/26656 PR: https://git.openjdk.org/jdk/pull/26656 From duke at openjdk.org Thu Aug 7 07:34:51 2025 From: duke at openjdk.org (Anton Artemov) Date: Thu, 7 Aug 2025 07:34:51 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v4] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <39aBWZQQdrwY0PTHhASEW1vELD_g4OMDAJ62mUGObj4=.82f28cb9-83d3-4511-b081-1e998418dd5b@github.com> Message-ID: On Wed, 6 Aug 2025 21:02:16 GMT, David Holmes wrote: >> Anton Artemov has updated 
the pull request incrementally with one additional commit since the last revision: >> >> 8364819: Fixed volatility issue on getters. > > src/hotspot/share/utilities/vmError.cpp line 108: > >> 106: const intptr_t VMError::segfault_address = pd_segfault_address; >> 107: Thread* volatile VMError::_handshake_timed_out_thread = nullptr; >> 108: Thread* volatile VMError::_safepoint_timed_out_thread = nullptr; > > `volatile` is still not useful or needed here. Removed. > src/hotspot/share/utilities/vmError.cpp line 1346: > >> 1344: >> 1345: void VMError::set_safepoint_timed_out_thread(Thread* thread) { >> 1346: Atomic::release_store(&_safepoint_timed_out_thread, thread); > > Acquire/release is not what is needed here. You are not trying to synchronize with previous memory accesses. If you want to force this to be visible to the other thread a `fence()` is required. Yes, that is what we want, thanks. Addressed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2259389904 PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2259389450 From dholmes at openjdk.org Thu Aug 7 08:31:19 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 7 Aug 2025 08:31:19 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> Message-ID: On Wed, 6 Aug 2025 12:59:01 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/utilities/vmError.cpp line 108: >> >>> 106: const intptr_t VMError::segfault_address = pd_segfault_address; >>> 
107: volatile Thread* VMError::_handshake_timed_out_thread = nullptr; >>> 108: volatile Thread* VMError::_safepoint_timed_out_thread = nullptr; >> >> I should have picked this up previously but `volatile` generally serves no purpose for inter-thread consistency. If we need to guarantee visibility between the signaller and signalee, then we need a fence to ensure that. Or we can skip the fence (and volatile) and note that this will usually, but perhaps not always, work fine. (I note that `pthread_kill` is not specified as a memory synchronizing operation, but I strongly suspect it has to have its own internal memory barriers that we would be piggy-backing on.) > > I am pretty sure the memory consistency here is irrelevant, given that: a) the `Thread` in question is likely already initialized for a long time, so it is unlikely we will lose anything release-acquire-wise; b) we only use these for pointer comparisons. > > But since we are here, and in the interest of clarity and avoiding future surprises, we can just summarily wrap these with `Atomic::release_store` and `Atomic::load_acquire` to be extra paranoidly safe. This is a failing path, so we don't care about performance, and would like to avoid a secondary crash in error handler some time in the future, if anyone reads anything from these threads. @shipilev The memory consistency is about seeing the writes to these fields in the thread that is signalled. acquire/release has no meaning in this context. I would not add acquire/release just in case someone in the future actually tried to follow those pointers - they are only for identifying purposes. If you are worried about that then lets go back to changing them to a value (intptr_t) so it will never be an issue. 
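For readers following the `volatile` versus fence discussion above, here is a minimal sketch of the "plain store followed by a full fence, then compare the pointer only" pattern. It uses standard C++ `std::atomic` as a stand-in for HotSpot's `Atomic` and `OrderAccess` facilities, and the names `Thread`, `timed_out_thread`, `set_timed_out_thread` and `is_timed_out_thread` are hypothetical, not taken from the PR. It is not meant to settle which ordering the patch should use.

```cpp
#include <atomic>
#include <cassert>

// Stand-in for HotSpot's Thread class; only its address is used here.
struct Thread {};

// Hypothetical marker field: identifies the thread that timed out.
// The pointer is only ever compared, never dereferenced.
static std::atomic<Thread*> timed_out_thread{nullptr};

// Writer side: publish the marker, then issue a full fence before the
// caller goes on to signal the target thread.
void set_timed_out_thread(Thread* t) {
  timed_out_thread.store(t, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Reader side (e.g. the signalled thread): pointer comparison only.
bool is_timed_out_thread(const Thread* t) {
  return timed_out_thread.load(std::memory_order_relaxed) == t;
}
```

Whether the fence alone is sufficient depends on how the signal delivery itself orders memory, which is the open point in the thread.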
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2259542119 From rrich at openjdk.org Thu Aug 7 08:38:16 2025 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 7 Aug 2025 08:38:16 GMT Subject: RFR: 8360219: [AIX] assert(locals_base >= l2) failed: bad placement In-Reply-To: References: <6YAyMkyKnD_ipbw7yZMKciLoh5zEsiLeT78Pv4Crvvs=.b765ce7c-d210-4cdf-9047-6e4b18a81c98@github.com> Message-ID: On Thu, 7 Aug 2025 01:57:22 GMT, Dean Long wrote: > It seems like relaxing the assert would allow us to silently overwrite part of a frame::top_ijava_frame_abi, which is only harmless if we are guaranteed to always treat that area as the smaller "parent" ABI past this point. It is harmless. The interpreted and nmethod calling conventions only use the smaller frame::java_abi (alias for parent_ijava_frame_abi). Calling the VM or other native code requires the full ABI. > Is there any way to determine if the ABI frame flavor is "top" or "parent" without adding something like a new "abi_frame_type" slot? I would not be aware of any way to determine the ABI type easily. I'm sure many people were looking for it since bugs caused by using the wrong type are painful. Even with a dedicated slot in dbg builds it would be hard to keep the correct type since there are many places where frames get resized. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26643#issuecomment-3163100058 From dlong at openjdk.org Thu Aug 7 08:47:19 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 7 Aug 2025 08:47:19 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v5] In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 02:54:22 GMT, Kim Barrett wrote: >> Thanks for implementing nice code for PPC64! I appreciate it! The shared code and the other platforms look fine, too. >> Maybe atomic bitwise operations could be used, but I'm happy with your current solution. > >> Thanks @TheRealMDoerr . 
I didn't even consider atomic bitwise operations, but that's a good idea. I'm not in a hurry to push this, so if you could provide an atomic bitwise patch for ppc64, I would be happy to include it. In the mean time, I'm still investigating the ZGC regression. If I can figure it out, I might want to include a fix for ZGC in this PR as well. > > Not a review, just a drive-by comment. > We've had Atomic bitops for a while now. > Atomic::fetch_then_{and,or,xor}(ptr, bits [, order]) > Atomic::{and,or,xor}_then_fetch(ptr, bits [, order]) > They haven't been optimized for most (any?) platforms, being based on cmpxchg. > (See all the "Specialize atomic bitset functions for ..." related to > https://bugs.openjdk.org/browse/JDK-8293117.) Thanks @kimbarrett. I see that the Atomic::fetch_then_XXX() implementation is very similar to what I came up with. The operation I'm doing is sometimes setting a single bit, so fetch_then_or() could be used, but sometimes the operation is setting the other 31 bits, so a new fetch_then_set_with_mask() would need to be added. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26399#issuecomment-3163131928 From dlong at openjdk.org Thu Aug 7 08:51:13 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 7 Aug 2025 08:51:13 GMT Subject: RFR: 8360219: [AIX] assert(locals_base >= l2) failed: bad placement In-Reply-To: <6YAyMkyKnD_ipbw7yZMKciLoh5zEsiLeT78Pv4Crvvs=.b765ce7c-d210-4cdf-9047-6e4b18a81c98@github.com> References: <6YAyMkyKnD_ipbw7yZMKciLoh5zEsiLeT78Pv4Crvvs=.b765ce7c-d210-4cdf-9047-6e4b18a81c98@github.com> Message-ID: <57AXuIteOPIZfG8MYQd1_Vp7UPO10wXDoqC8f7lzlYw=.81d965eb-fb7a-4c13-8e2a-10f840b3e8ee@github.com> On Tue, 5 Aug 2025 14:53:30 GMT, Richard Reingruber wrote: > Weaken assertion because it is too strict. 
While the interpreted caller sometimes has a `frame::top_ijava_frame_abi` it is sufficient to assert that the locals don't overlap with the smaller `frame::parent_ijava_frame_abi` because only that's reserved for non-top frames (aka `parent` frames). > > Tested on AIX and Linux ppc: Tier 1-4 of hotspot and jdk. All of langtools and jaxp. Renaissance Suite and SAP specific tests. > > Details: > > It cannot be assumed that the interpreted caller of the bottom interpreted frame (from a compiled deoptee frame) has a large `frame::top_ijava_frame_abi`. In an ordinary i2c call it would keep its `frame::top_ijava_frame_abi` (see [`call_from_interpreter`](https://github.com/openjdk/jdk/blob/67ba8b45dd632c40d5e6872d2a6ce24f86c22152/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp#L1217)) but when it was thawed then it'll have only a `frame::java_abi` (alias for parent_ijava_frame_abi). > > There are [diagrams](https://github.com/openjdk/jdk/blob/8a571ee7f2d9a46ff485fd9f3658c552e2d20817/src/hotspot/cpu/ppc/continuationFreezeThaw_ppc.inline.hpp#L395) commenting `ThawBase::new_stack_frame()` that show this. > > It's not easy to see it in the code though. Note that the [frame size](https://github.com/openjdk/jdk/blob/8a571ee7f2d9a46ff485fd9f3658c552e2d20817/src/hotspot/cpu/ppc/continuationFreezeThaw_ppc.inline.hpp#L496) is calculated relative to hf's unextended_sp which [includes frame::metadata_words](https://github.com/openjdk/jdk/blob/8a571ee7f2d9a46ff485fd9f3658c552e2d20817/src/hotspot/cpu/ppc/stackChunkFrameStream_ppc.inline.hpp#L80) which is the size of `java_abi`. Marked as reviewed by dlong (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/26643#pullrequestreview-3096083318 From dlong at openjdk.org Thu Aug 7 08:51:14 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 7 Aug 2025 08:51:14 GMT Subject: RFR: 8360219: [AIX] assert(locals_base >= l2) failed: bad placement In-Reply-To: References: <6YAyMkyKnD_ipbw7yZMKciLoh5zEsiLeT78Pv4Crvvs=.b765ce7c-d210-4cdf-9047-6e4b18a81c98@github.com> Message-ID: On Thu, 7 Aug 2025 08:35:33 GMT, Richard Reingruber wrote: >> It seems like relaxing the assert would allow us to silently overwrite part of a frame::top_ijava_frame_abi, which is only harmless if we are guaranteed to always treat that area as the smaller "parent" ABI past this point. Is there any way to determine if the ABI frame flavor is "top" or "parent" without adding something like a new "abi_frame_type" slot? > >> It seems like relaxing the assert would allow us to silently overwrite part of a frame::top_ijava_frame_abi, which is only harmless if we are guaranteed to always treat that area as the smaller "parent" ABI past this point. > > It is harmless. The interpreted and nmethod calling conventions only use the smaller frame::java_abi (alias for parent_ijava_frame_abi). Calling the VM or other native code requires the full ABI. > >> Is there any way to determine if the ABI frame flavor is "top" or "parent" without adding something like a new "abi_frame_type" slot? > > I would not be aware of any way to determine the ABI type easily. I'm sure many people were looking for it since bugs caused by using the wrong type are painful. Even with a dedicated slot in dbg builds it would be hard to keep the correct type since there are many places where frames get resized. Thanks @reinrich . It looks good. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26643#issuecomment-3163150172 From duke at openjdk.org Thu Aug 7 09:49:12 2025 From: duke at openjdk.org (Jonas Norlinder) Date: Thu, 7 Aug 2025 09:49:12 GMT Subject: RFR: 8364638: Refactor and make accumulated GC CPU time code generic [v4] In-Reply-To: References: Message-ID: > Hi all, > > This PR refactors the newly added GC CPU time code from [JDK-8359110](https://bugs.openjdk.org/browse/JDK-8359110). > > As a stepping-stone to enable consolidation of CPU time tracking in e.g. hsperf counters and GCTraceCPUTime and to have a unified interface for tracking CPU time of various components in Hotspot this code can be refactored. This PR introduces a new interface to retrieve CPU time for various Hotspot components and it currently supports: > > CPUTimeUsage::GC::total() // the sum of gc_threads(), vm_thread(), stringdedup() > > CPUTimeUsage::GC::gc_threads() > CPUTimeUsage::GC::vm_thread() > CPUTimeUsage::GC::stringdedup() > > CPUTimeUsage::Runtime::vm_thread() > > > I moved `CPUTimeUsage` to `src/hotspot/share/services` since it seemed fitting as it housed similar performance tracking code like `RuntimeService`, as this is no longer a class that is only specific to GC. > > I also made a minor improvement in the CPU time logging during exit. 
Since `CPUTimeUsage` supports more components than just GC I changed the logging flag to from `gc,cpu` to `cpu` and created a detailed table: > > > [12.517s][info][cpu] === CPU time Statistics ============================================================= > [12.517s][info][cpu] CPUs > [12.517s][info][cpu] s % utilized > [12.517s][info][cpu] Process > [12.517s][info][cpu] Total 175.7628 100.00 14.0 > [12.517s][info][cpu] VM Thread 7.0000 3.98 0.6 > [12.517s][info][cpu] Garbage Collection 72.0000 40.96 5.8 > [12.517s][info][cpu] GC Threads 70.0000 39.83 5.6 > [12.517s][info][cpu] VM Thread 1.0000 0.57 0.1 > [12.518s][info][cpu] String Deduplication 0.0000 0.00 0.0 > [12.518s][info][cpu] ===================================================================================== > > > Additionally, if CPU time retrieval fails it should not be the caller's responsibility to log warnings as this would bloat the code unnecessarily. I've noticed that `os` does log a warning for some methods if they fail so I continued on this path. 
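The two derived columns in the table above can be reproduced with simple arithmetic. The sketch below assumes that `%` is the component's CPU seconds relative to the process total, and that "CPUs utilized" is CPU seconds divided by elapsed wall-clock time (taken here as the 12.517s log timestamp). Both are assumptions about the output, not statements about how the PR computes them, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Share of the process total, in percent. Returns 0 if the total is not positive.
double percent_of_total(double component_s, double total_s) {
  return total_s > 0.0 ? 100.0 * component_s / total_s : 0.0;
}

// Average number of CPUs kept busy over the elapsed wall-clock time.
double cpus_utilized(double component_s, double elapsed_s) {
  return elapsed_s > 0.0 ? component_s / elapsed_s : 0.0;
}
```

Under those assumptions the "Garbage Collection" row works out to roughly 72.0 / 175.7628 = 40.96 % and 72.0 / 12.517 = 5.8 CPUs, which matches the logged values.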
Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision: Improve robustness ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26621/files - new: https://git.openjdk.org/jdk/pull/26621/files/216ba811..3f552362 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26621&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26621&range=02-03 Stats: 29 lines in 3 files changed: 23 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/26621.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26621/head:pull/26621 PR: https://git.openjdk.org/jdk/pull/26621 From duke at openjdk.org Thu Aug 7 09:53:00 2025 From: duke at openjdk.org (Yuri Gaevsky) Date: Thu, 7 Aug 2025 09:53:00 GMT Subject: RFR: 8324124: RISC-V: implement _vectorizedMismatch intrinsic [v3] In-Reply-To: References: Message-ID: > Hello All, > > Please review these changes to enable the __vectorizedMismatch_ intrinsic on RISC-V platform with RVV instructions supported. > > Thank you, > -Yuri Gaevsky > > **Correctness checks:** > hotspot/jtreg/compiler/{intrinsic/c1/c2}/ under QEMU-8.1 with RVV v1.0.0 and -XX:TieredStopAtLevel=1/2/3/4. Yuri Gaevsky has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains three commits: - Merge master - Merge master - 8324124: RISC-V: implement _vectorizedMismatch intrinsic ------------- Changes: https://git.openjdk.org/jdk/pull/17750/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17750&range=02 Stats: 163 lines in 5 files changed: 162 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17750.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17750/head:pull/17750 PR: https://git.openjdk.org/jdk/pull/17750 From shade at openjdk.org Thu Aug 7 10:51:14 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 7 Aug 2025 10:51:14 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> Message-ID: On Thu, 7 Aug 2025 08:28:40 GMT, David Holmes wrote: >> I am pretty sure the memory consistency here is irrelevant, given that: a) the `Thread` in question is likely already initialized for a long time, so it is unlikely we will lose anything release-acquire-wise; b) we only use these for pointer comparisons. >> >> But since we are here, and in the interest of clarity and avoiding future surprises, we can just summarily wrap these with `Atomic::release_store` and `Atomic::load_acquire` to be extra paranoidly safe. This is a failing path, so we don't care about performance, and would like to avoid a secondary crash in error handler some time in the future, if anyone reads anything from these threads. > > @shipilev The memory consistency is about seeing the writes to these fields in the thread that is signalled. acquire/release has no meaning in this context. 
I would not add acquire/release just in case someone in the future actually tried to follow those pointers - they are only for identifying purposes. If you are worried about that then lets go back to changing them to a value (intptr_t) so it will never be an issue. It always feels dubious to me to add fences for the single-variable "visibility" reasons. The fences order things relative to other things, not just "flush" or "make this single variable visible". But that's a theoretical quibble. A more pressing problem is reading the thread marker without any acquire/relaxed semantics; _that_ might miss updates as well. Given how messy C++ memory model is, and how data races are UB-scary there, I think we should strive to do `Atomic`-s as the matter of course for anything that transfers data between threads. Adjusting `Atomic::release_store` -> `Atomic::release_store_fence` would have satisfied both flavors of concurrency paranoia we are having, I think: it is equivalent to having a fence after the store, and it clearly makes Atomic release->acquire chain as well :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2259902716 From fgao at openjdk.org Thu Aug 7 11:46:13 2025 From: fgao at openjdk.org (Fei Gao) Date: Thu, 7 Aug 2025 11:46:13 GMT Subject: RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD In-Reply-To: <8G8Djp2bD_zzy3Jm5R3oMoTSrWrbhbCMiIHC5Kl9ufY=.eaed94c1-416a-4c95-a36e-8ba8450d7db4@github.com> References: <8G8Djp2bD_zzy3Jm5R3oMoTSrWrbhbCMiIHC5Kl9ufY=.eaed94c1-416a-4c95-a36e-8ba8450d7db4@github.com> Message-ID: On Wed, 6 Aug 2025 09:59:11 GMT, Andrew Haley wrote: >> If the relocation or target address is guaranteed to reside within the CodeCache, we can safely replace a `movz + movk + movk` sequence with a more compact and efficient `adrp + add` instruction pair. >> >> In `MacroAssembler::mov(Register r, Address dest)`, this replacement can be applied if any of the following rules hold: >> >> 1. 
The relocation type indicates that the address resides within the CodeCache and the necessary patching logic is provided in `fix_relocation_after_move()`. >> 2. The target address is fixed (i.e., does not require relocation) and is within the reachable range for `adrp`. >> >> The patch performs the filtering in `is_relocated_within_codecache()` and `is_adrp_reachable()` to ensure this optimization is applied safely and selectively. > > It's certainly smaller, but whether it's more efficient is dependent on the circumstances. For example, on Firestorm we can do two ADRPs per cycle, but eight MOVZ/MOVKs. Thanks for your review @theRealAph @adinn. > It's certainly smaller, but whether it's more efficient is dependent on the circumstances. For example, on Firestorm we can do two ADRPs per cycle, but eight MOVZ/MOVKs. @theRealAph That makes a lot of sense. On Neoverse N1/N2/V1, `ADRP` and `MOVZ/MOVK` have comparable latency and throughput and share the same pipeline, so we can expect these microarchitectures to benefit from this patch. I'm planning to update the patch with an `aarch64` option to enable or disable this replacement based on the target microarchitecture. What do you think? > The solution is to make the compiler always generate the 3-instruction load when compiling in an Assembly VM and otherwise generate the 2-instruction load based on reachability i.e. AOT code won't 'benefit' from this patch but runtime generated code will (assuming it is a benefit). > So, the reachability method needs to return false if `AOTCodeCache::is_on_for_dump()` returns true. @adinn thanks for pointing this out. I'll add this constraint in the next commit.
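For readers following the reachability argument, here is a minimal sketch of the range check it relies on. It is not the patch's `is_adrp_reachable()`; the names `adrp_can_reach`, `adrp_add_low12` and `use_adrp_add` are hypothetical. The only facts assumed are architectural: `ADRP` adds a signed 21-bit immediate, scaled by 4 KiB, to the page base of the PC, and the following `ADD` supplies the low 12 bits. The AOT-dump condition is modelled as a plain boolean parameter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not HotSpot code): true when `target` is within the
// range a single ADRP issued at `pc` can address. The immediate is a signed
// 21-bit page count, so the reach is about +/-4 GiB in whole 4 KiB pages.
inline bool adrp_can_reach(std::uint64_t pc, std::uint64_t target) {
  const std::int64_t pc_page     = static_cast<std::int64_t>(pc >> 12);
  const std::int64_t target_page = static_cast<std::int64_t>(target >> 12);
  const std::int64_t page_delta  = target_page - pc_page;
  const std::int64_t limit       = std::int64_t{1} << 20;  // 2^20 pages
  return page_delta >= -limit && page_delta < limit;
}

// Low 12 bits of the target, which the ADD after the ADRP would supply.
inline std::uint32_t adrp_add_low12(std::uint64_t target) {
  return static_cast<std::uint32_t>(target & 0xFFF);
}

// Hypothetical policy wrapper: fall back to the 3-instruction MOVZ/MOVK
// sequence when dumping AOT code (assumption: the caller passes that state in),
// otherwise decide by reachability.
inline bool use_adrp_add(std::uint64_t pc, std::uint64_t target,
                         bool dumping_aot_code) {
  return !dumping_aot_code && adrp_can_reach(pc, target);
}
```

With this model, a call such as `use_adrp_add(pc, target, true)` always selects the longer sequence, which matches the constraint requested for AOT code in the discussion.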
------------- PR Comment: https://git.openjdk.org/jdk/pull/26653#issuecomment-3163784211 From dholmes at openjdk.org Thu Aug 7 12:12:16 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 7 Aug 2025 12:12:16 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v5] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Message-ID: On Thu, 7 Aug 2025 07:34:50 GMT, Anton Artemov wrote: >> Hi, please consider the following changes: >> >> This is a cleanup after https://github.com/openjdk/jdk/pull/26309 >> >> 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. >> >> 2) Added a missed brace in the error message. >> >> Trivial change. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8364819: Addressed reviewer's comments The updates look fine to me. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26656#pullrequestreview-3096859282 From dholmes at openjdk.org Thu Aug 7 12:09:16 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 7 Aug 2025 12:09:16 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> Message-ID: On Thu, 7 Aug 2025 10:49:06 GMT, Aleksey Shipilev wrote: >> @shipilev The memory consistency is about seeing the writes to these fields in the thread that is signalled. acquire/release has no meaning in this context. 
I would not add acquire/release just in case someone in the future actually tried to follow those pointers - they are only for identifying purposes. If you are worried about that then lets go back to changing them to a value (intptr_t) so it will never be an issue. > > It always feels dubious to me to add fences for the single-variable "visibility" reasons. The fences order things relative to other things, not just "flush" or "make this single variable visible". But that's a theoretical quibble. A more pressing problem is reading the thread marker without any acquire/relaxed semantics; _that_ might miss updates as well. > > Given how messy C++ memory model is, and how data races are UB-scary there, I think we should strive to do `Atomic`-s as the matter of course for anything that transfers data between threads. Adjusting `Atomic::release_store` -> `Atomic::release_store_fence` would have satisfied both flavors of concurrency paranoia we are having, I think: it is equivalent to having a fence after the store, and it clearly makes Atomic release->acquire chain as well :) > that might miss updates as well. Might miss updates to what??? We never follow the pointer. The signalling thread doesn't do anything that the signallee has to look at, the signallee just gets the signal, checks if it was the timeout target and aborts. We use fence() for "single variable visibility reasons" in a number of places because it is the only thing that actually forces memory synchronization across all threads. acquire/release just says "if you see this then you must see that", it doesn't do anything to help you "see this" in the first place.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2260118952 From shade at openjdk.org Thu Aug 7 12:37:16 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 7 Aug 2025 12:37:16 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> Message-ID: On Thu, 7 Aug 2025 12:06:36 GMT, David Holmes wrote: > Might miss updates to what??? We never follow the pointer. Following the pointer is not relevant. We can miss updates to the thread pointer. Note that reader needs at least coherence to break out of potential loops like: while (get_handshake_thread() == nullptr); // burn Acquire gives that, among other things. Overlooking these properties is easy, and this is why we should be sticking to use atomics anywhere we transfer data between threads. > We use fence() for "single variable visibility reasons" in a number of places because it is the only thing that actually forces memory synchronization across all threads. Well, yes, this part always felt dubious to me. I am restraining myself from going into theoretical argument here, because it would take a lot of time, and would not be helpful for this work. Since no performance is on the table, we should just do the most obvious/safe thing. I believe it is a proper style to use `Atomic::load_acquire` and `Atomic::release_store_fence`, instead of one-sided `fence`. 
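The pattern argued for above can be sketched in standard C++. This is only an illustration under stated assumptions: `std::atomic` stands in for HotSpot's `Atomic` wrappers, the mapping of `Atomic::release_store_fence` to "release store followed by a full fence" follows the description given in this thread, and `ThreadModel`, `timed_out_thread`, `mark_timed_out`, `is_timeout_target` and `wait_for_timed_out_thread` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

struct ThreadModel {};  // hypothetical stand-in for a VM thread object

// The marker is used for identity comparison only; it is never dereferenced.
static std::atomic<ThreadModel*> timed_out_thread{nullptr};

// Writer side: a release store followed by a full fence (assumed equivalent
// of the release_store_fence mentioned in the discussion).
inline void mark_timed_out(ThreadModel* t) {
  timed_out_thread.store(t, std::memory_order_release);
  std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Reader side: an acquire load, then a pointer comparison.
inline bool is_timeout_target(const ThreadModel* self) {
  return timed_out_thread.load(std::memory_order_acquire) == self;
}

// The kind of polling loop mentioned above. Because the load is atomic, the
// compiler may not hoist it out of the loop, so the loop can observe the store.
inline ThreadModel* wait_for_timed_out_thread() {
  ThreadModel* t = nullptr;
  while ((t = timed_out_thread.load(std::memory_order_acquire)) == nullptr) {
    // burn
  }
  return t;
}
```

Whether the extra fence is needed for the signal-delivery case is exactly the open question in the exchange; the sketch shows only what the atomic variant looks like, not which side is right.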
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2260184827 From jbhateja at openjdk.org Thu Aug 7 12:37:17 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 7 Aug 2025 12:37:17 GMT Subject: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v2] In-Reply-To: References: Message-ID: <98s1CQXUZuyBYN83myqlz01lNsEw3o7-v1DdVb3cNv4=.705802ff-2a68-4258-8f2b-fe5885ce32c5@github.com> On Fri, 25 Jul 2025 20:09:40 GMT, Jatin Bhateja wrote: >> Patch optimizes Vector. slice operation with constant index using x86 ALIGNR instruction. >> It also adds a new hybrid call generator to facilitate lazy intrinsification or else perform procedural inlining to prevent call overhead and boxing penalties in case the fallback implementation expects to operate over vectors. The existing vector API-based slice implementation is now the fallback code that gets inlined in case intrinsification fails. >> >> Idea here is to add infrastructure support to enable intrinsification of fast path for selected vector APIs, else enable inlining of fall-back implementation if it's based on vector APIs. Existing call generators like PredictedCallGenerator, used to handle bi-morphic inlining, already make use of multiple call generators to handle hit/miss scenarios for a particular receiver type. The newly added hybrid call generator is lazy and called during incremental inlining optimization. It also relieves the inline expander to handle slow paths, which can easily be implemented library side (Java). >> >> Vector API jtreg tests pass at AVX level 2, remaining validation in progress. 
>> >> Performance numbers: >> >> >> System : 13th Gen Intel(R) Core(TM) i3-1315U >> >> Baseline: >> Benchmark (size) Mode Cnt Score Error Units >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex1 1024 thrpt 2 9444.444 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex2 1024 thrpt 2 10009.319 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithVariableIndex 1024 thrpt 2 9081.926 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex1 1024 thrpt 2 6085.825 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex2 1024 thrpt 2 6505.378 ops/ms >> VectorSliceBenchmark.intVectorSliceWithVariableIndex 1024 thrpt 2 6204.489 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex1 1024 thrpt 2 1651.334 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex2 1024 thrpt 2 1642.784 ops/ms >> VectorSliceBenchmark.longVectorSliceWithVariableIndex 1024 thrpt 2 1474.808 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex1 1024 thrpt 2 10399.394 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex2 1024 thrpt 2 10502.894 ops/ms >> VectorSliceB... > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Updating predicate checks Adding additional notes on implementation: A) Slice:- 1. New inline expander and C2 IR node VectorSlice for leaf level intrinsic corresponding to Vector.slice(int) 2. Other interfaces of slice APIs. - Vector.slice(int, Vector) - The second vector argument is the background vector, which replaces the zero broadcasted vector of the base version of API. - API internally calls the same intrinsic entry point as the base version. - Vector.slice(int, Vector, VectorMask) - This version of the API internally calls the above slice API with index and vector arguments, followed by an explicit blend with a broadcasted zero vector. Thus, current support implicitly covers all three 3 variants of slice APIs. 
B) Similar extensions to optimize Unslice with constant index:- 1. Similar to slice, unslice also has three interfaces. 2. Leaf-level interface only accepts an index argument. 3. Other variants of unslice accept unslice index, background vector, and part number. 4. We can assume the receiver vector to be sliding over two contiguously placed background vectors. 5. It's possible to implement all three variants of unslice using slice operations as follows. jshell> // Input jshell> vec vec ==> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16] jshell> vec2 vec2 ==> [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160] jshell> bzvec bzvec ==> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] jshell> // Case 1: jshell> vec.unslice(4) $79 ==> [0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] jshell> bzvec.slice(vec.length() - 4, vec) $80 ==> [0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] jshell> // Case 2: jshell> vec.unslice(4, vec2, 0) $81 ==> [10, 20, 30, 40, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] jshell> vec2.blend(vec2.slice(vec2.length() - 4, vec), VectorMask.fromLong(IntVector.SPECIES_512, ((1L << (vec.length() - 4)) - 1) << 4)) $82 ==> [10, 20, 30, 40, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] jshell> // Case 3: jshell> vec.unslice(4, vec2, 1) $83 ==> [13, 14, 15, 16, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160] jshell> vec2.blend(vec.slice(vec.length() - 4, vec2), VectorMask.fromLong(IntVector.SPECIES_512, ((1L << 4) - 1))) $84 ==> [13, 14, 15, 16, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160] jshell> // Case 4: jshell> vec.unslice(4, vec2, 0, VectorMask.fromLong(IntVector.SPECIES_512, 0xFF)) $85 ==> [10, 20, 30, 40, 1, 2, 3, 4, 5, 6, 7, 8, 130, 140, 150, 160] jshell> // Current Java fallback implementation for this version is based on slice and unslice operations. To ease the review process, I plan to optimize the unslice API with a constant index by extending the newly added expander in a follow-up patch. 
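The unslice-via-slice identities in the jshell session can be checked with a scalar model. The sketch below is plain C++ over 16-lane arrays, not the Vector API and not the C2 expander; `Vec`, `make_vec`, `slice`, `unslice` and `blend` are hypothetical helpers whose lane semantics are inferred from the outputs shown above, with `origin` assumed to lie in `[0, N]`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t N = 16;  // lane count of a 512-bit int vector
using Vec = std::array<int, N>;

// Builds [first, first + step, first + 2*step, ...].
inline Vec make_vec(int first, int step) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i) r[i] = first + static_cast<int>(i) * step;
  return r;
}

// v.slice(origin, w): lanes of v from `origin` onward, then leading lanes of w.
inline Vec slice(const Vec& v, std::size_t origin, const Vec& w) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i) {
    const std::size_t j = i + origin;
    r[i] = (j < N) ? v[j] : w[j - N];
  }
  return r;
}

// v.unslice(origin, w, part): v slid by `origin` lanes over background w.
// part 0 keeps the lanes that land in the first background vector,
// part 1 keeps the lanes that spill into the second one.
inline Vec unslice(const Vec& v, std::size_t origin, const Vec& w, int part) {
  Vec r = w;
  if (part == 0) {
    for (std::size_t i = origin; i < N; ++i) r[i] = v[i - origin];
  } else {
    for (std::size_t i = 0; i < origin; ++i) r[i] = v[N - origin + i];
  }
  return r;
}

// a.blend(b, mask): lane i comes from b when bit i of mask is set, else from a.
inline Vec blend(const Vec& a, const Vec& b, std::uint32_t mask) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i) r[i] = ((mask >> i) & 1u) ? b[i] : a[i];
  return r;
}
```

With `vec = make_vec(1, 1)`, `vec2 = make_vec(10, 10)` and a zero vector, cases 1 to 3 correspond to `unslice(vec, 4, zero, 0) == slice(zero, N - 4, vec)`, `unslice(vec, 4, vec2, 0) == blend(vec2, slice(vec2, N - 4, vec), 0xFFF0)` and `unslice(vec, 4, vec2, 1) == blend(vec2, slice(vec, N - 4, vec2), 0x000F)`.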
------------- PR Comment: https://git.openjdk.org/jdk/pull/24104#issuecomment-3163994472 From shade at openjdk.org Thu Aug 7 12:56:17 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 7 Aug 2025 12:56:17 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v5] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Message-ID: On Thu, 7 Aug 2025 07:34:50 GMT, Anton Artemov wrote: >> Hi, please consider the following changes: >> >> This is a cleanup after https://github.com/openjdk/jdk/pull/26309 >> >> 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. >> >> 2) Added a missed brace in the error message. >> >> Trivial change. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8364819: Addressed reviewer's comments I still think these should be wrapped with `Atomic`-s, but I don't have energy to argue in favor of them anymore. So I would defer a second review to someone else. ------------- PR Review: https://git.openjdk.org/jdk/pull/26656#pullrequestreview-3097025636 From asmehra at openjdk.org Thu Aug 7 13:29:22 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Thu, 7 Aug 2025 13:29:22 GMT Subject: Integrated: 8364128: Improve gathering of cpu feature names using stringStream In-Reply-To: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com> References: <0OB4WdhsTe7RDAYLe8tLHVn43gaUMa-NPXPi0RKoBHU=.33fa828d-ef24-4047-8080-9d033654d9a3@github.com> Message-ID: On Mon, 28 Jul 2025 18:16:16 GMT, Ashutosh Mehra wrote: > This PR implements the code for cpu features names string using ~FormatBuffer~ stringStream. 
~It also improves and extends FormatBuffer with an additional method that appends comma separated strings to the buffer.~ It also adds a method to stringStream that appends strings separated by a delimiter to the buffer. This is useful in creating the cpu features names string. > This code will also be useful in Leyden to implement cpu feature check for AOTCodeCache [0]. > Platforms affected: x86-64 and aarch64 > Other platforms can be done if and when Leyden changes are ported to them. > > [0] https://github.com/openjdk/leyden/pull/84 This pull request has now been integrated. Changeset: bc3d8656 Author: Ashutosh Mehra URL: https://git.openjdk.org/jdk/commit/bc3d86564042208cee5119abe11905e747a5ef4c Stats: 155 lines in 9 files changed: 66 ins; 25 del; 64 mod 8364128: Improve gathering of cpu feature names using stringStream Co-authored-by: Johan Sjölen Reviewed-by: kvn, jsjolen ------------- PR: https://git.openjdk.org/jdk/pull/26515 From adinn at openjdk.org Thu Aug 7 13:57:50 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 7 Aug 2025 13:57:50 GMT Subject: RFR: 8364269: Simplify code cache API by storing adapter entry offsets in blob [v3] In-Reply-To: References: Message-ID: > The AOT stub blob save and restore API handles Adapter blobs differently to other single-stub blobs by passing the associated entries in a separate auxiliary array. Storing the entry offsets in the blob, as happens with other generated blobs, simplifies the current save/restore API and implementation. It also makes it easier to define and implement an API extension supporting save and restore of multi-stub (StubGen) blobs.
Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: propagate BufferBlob subclass sizes up constructor chain ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26532/files - new: https://git.openjdk.org/jdk/pull/26532/files/ad4a1a27..a7925825 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26532&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26532&range=01-02 Stats: 7 lines in 2 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/26532.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26532/head:pull/26532 PR: https://git.openjdk.org/jdk/pull/26532 From adinn at openjdk.org Thu Aug 7 13:57:52 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 7 Aug 2025 13:57:52 GMT Subject: RFR: 8364269: Simplify code cache API by storing adapter entry offsets in blob [v2] In-Reply-To: References: Message-ID: On Tue, 29 Jul 2025 13:28:15 GMT, Andrew Dinn wrote: >> The AOT stub blob save and restore API handles Adapter blobs differently to other single-stub blobs by passing the associated entries in a separate auxiliary array. Storing the entry offsets in the blob, as happens with other generated blobs, simplifies the current save/restore API and implementation. It also makes it easier to define and implement an API extension supporting save and restore of multi-stub (StubGen) blobs. > > Andrew Dinn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into JDK-8364269 > - simplify code cache API by storing adapter entry offsets in blob @vnkozlov @ashu-mehra @shipilev Ok, problem fixed, I believe, but I'm still gob-smacked that this built, ran and passed tier1 plus cds/appcds tests on Linux/aarch64 given the failure to run the exploded image on Linux/x86 and MacOS/aarch64. The problem was with the blob layout. Adding extra fields to AdapterBlob requires propagating the adjusted blob size (i.e. header_size) up the Blob constructor chain, through BufferBlob into RuntimeBlob. Which I have now done -- previously, all subclasses were sized using sizeof(BufferBlob) as none of them had extra fields. Without that fix the content_start address for the Adapter blob was being reported in the tail of the blob proper, which meant that in the broken builds data fields were being treated as code and we ended up in hyperspace when we first used an i2c adapter to jump to Object.init(). Yet not on Linux/aarch64? Go figure. Should be ready to review now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26532#issuecomment-3136689701 From asmehra at openjdk.org Thu Aug 7 13:57:52 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Thu, 7 Aug 2025 13:57:52 GMT Subject: RFR: 8364269: Simplify code cache API by storing adapter entry offsets in blob [v2] In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 08:34:53 GMT, Andrew Dinn wrote: > think maybe in follow-up RFE ;-) Yeah, I can take a stab at that. Meanwhile this PR is still in draft. Do you have any other changes to be made? Seems ready to me.
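The blob-layout failure described earlier in this thread (a subclass gains fields but the base is still told the old header size) can be reproduced in a toy model. This is a sketch only: `BlobModel`, `BufferBlobModel` and `AdapterBlobModel` are hypothetical classes that mimic the constructor chain, the 8-byte content alignment is a made-up value, and none of it is the real `RuntimeBlob` code.

```cpp
#include <cassert>
#include <cstddef>

inline std::size_t align_up(std::size_t n, std::size_t a) {
  return (n + a - 1) / a * a;
}

// Base of the chain: records where the content (code) starts, computed from
// the header size the most-derived class reports.
class BlobModel {
 public:
  explicit BlobModel(std::size_t header_size)
      : _content_offset(align_up(header_size, 8)) {}
  std::size_t content_offset() const { return _content_offset; }
 private:
  std::size_t _content_offset;
};

class BufferBlobModel : public BlobModel {
 public:
  BufferBlobModel() : BlobModel(sizeof(BufferBlobModel)) {}
 protected:
  // Lets subclasses with extra fields pass their own size up the chain.
  explicit BufferBlobModel(std::size_t header_size) : BlobModel(header_size) {}
};

class AdapterBlobModel : public BufferBlobModel {
 public:
  // propagate_size == false models the bug: the base is sized as if the
  // subclass had no extra fields.
  explicit AdapterBlobModel(bool propagate_size)
      : BufferBlobModel(propagate_size ? sizeof(AdapterBlobModel)
                                       : sizeof(BufferBlobModel)) {}
  // True when the reported content start falls inside this object's fields.
  bool content_overlaps_header() const {
    return content_offset() < sizeof(AdapterBlobModel);
  }
 private:
  int _entry_offsets[4] = {0, 0, 0, 0};  // the extra per-adapter fields
};
```

In this model `AdapterBlobModel(false).content_overlaps_header()` is true, which corresponds to data fields being treated as code, while `AdapterBlobModel(true)` places the content after the extra fields.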
------------- PR Comment: https://git.openjdk.org/jdk/pull/26532#issuecomment-3164204093 From asmehra at openjdk.org Thu Aug 7 13:57:52 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Thu, 7 Aug 2025 13:57:52 GMT Subject: RFR: 8364269: Simplify code cache API by storing adapter entry offsets in blob [v2] In-Reply-To: References: Message-ID: On Tue, 29 Jul 2025 13:28:15 GMT, Andrew Dinn wrote: >> The AOT stub blob save and restore API handles Adapter blobs differently to other single-stub blobs by passing the associated entries in a separate auxiliary array. Storing the entry offsets in the blob, as happens with other generated blobs, simplifies the current save/restore API and implementation. It also makes it easier to define and implement an API extension supporting save and restore of multi-stub (StubGen) blobs. > > Andrew Dinn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into JDK-8364269 > - simplify code cache API by storing adapter entry offsets in blob If the entry offsets are stored in `AdapterBlob`, `AdapterHandlerEntry` can then just store a pointer to the blob and get rid of the offsets. `SharedRuntime::generate_i2c2i_adapters` can be updated to return the AdapterBlob. And when storing the AdapterHandlerEntry, pointer to the blob can be cleared and restored in `AdapterHandlerLibrary::lookup_aot_cache`. This assumes the AdapterBlob won't be moved around in the code cache. @adinn does this make sense? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26532#issuecomment-3137074976 From asmehra at openjdk.org Thu Aug 7 13:57:52 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Thu, 7 Aug 2025 13:57:52 GMT Subject: RFR: 8364269: Simplify code cache API by storing adapter entry offsets in blob [v2] In-Reply-To: References: Message-ID: <7rfxaYIUOqeaQMjXcu6ny194UhT9mFUgb8UyGmAW8Qo=.73b7c7b3-1417-469f-9f50-672c6bd36b40@github.com> On Thu, 31 Jul 2025 08:56:00 GMT, Andrew Dinn wrote: > So, in this one case we have a Frankenstein handler assembled from 3 disparate entry points (+ nullptr) which are not derived from an associated adapter blob. That's true. I missed this special adapter handler. That special handler is a pain to accommodate. I looked at how these entry points are accessed and it seems they are mostly accessed by the routines in `Method`. So it should be feasible to handle the case of abstract methods there and get rid of this special adapter completely. @adinn wdyt? ------------- PR Comment: https://git.openjdk.org/jdk/pull/26532#issuecomment-3140236584 From adinn at openjdk.org Thu Aug 7 13:57:52 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 7 Aug 2025 13:57:52 GMT Subject: RFR: 8364269: Simplify code cache API by storing adapter entry offsets in blob [v2] In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 08:56:00 GMT, Andrew Dinn wrote: >> If the entry offsets are stored in `AdapterBlob`, `AdapterHandlerEntry` can then just store a pointer to the blob and get rid of the offsets. `SharedRuntime::generate_i2c2i_adapters` can be updated to return the AdapterBlob. And when storing the AdapterHandlerEntry, pointer to the blob can be cleared and restored in `AdapterHandlerLibrary::lookup_aot_cache`. >> This assumes the AdapterBlob won't be moved around in the code cache. >> @adinn does this make sense? 
> > @ashu-mehra >> If the entry offsets are stored in AdapterBlob, AdapterHandlerEntry can then just store a pointer to the blob and get rid of the offsets. > > I planned that originally but it doesn't work because of one special case. Method `AdapterHandlerLibrary::create_abstract_method_handler` does this: > > address wrong_method_abstract = SharedRuntime::get_handle_wrong_method_abstract_stub(); > _abstract_method_handler = AdapterHandlerLibrary::new_entry(AdapterFingerPrint::allocate(0, nullptr)); > _abstract_method_handler->set_entry_points(SharedRuntime::throw_AbstractMethodError_entry(), > wrong_method_abstract, > wrong_method_abstract, > nullptr); > > So, in this one case we have a Frankenstein handler assembled from 3 disparate entry points (+ nullptr) which are not derived from an associated adapter blob. > I looked at how these entry points are accessed and it seems they are mostly accessed by the routines in Method. So it should be feasible to handle the case of abstract methods there and get rid of this special adapter completely. @adinn wdyt? I think maybe in follow-up RFE ;-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/26532#issuecomment-3163097384 From adinn at openjdk.org Thu Aug 7 13:57:52 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 7 Aug 2025 13:57:52 GMT Subject: RFR: 8364269: Simplify code cache API by storing adapter entry offsets in blob [v2] In-Reply-To: References: Message-ID: On Wed, 30 Jul 2025 16:36:13 GMT, Ashutosh Mehra wrote: >> Andrew Dinn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8364269 >> - simplify code cache API by storing adapter entry offsets in blob > > If the entry offsets are stored in `AdapterBlob`, `AdapterHandlerEntry` can then just store a pointer to the blob and get rid of the offsets. `SharedRuntime::generate_i2c2i_adapters` can be updated to return the AdapterBlob. And when storing the AdapterHandlerEntry, pointer to the blob can be cleared and restored in `AdapterHandlerLibrary::lookup_aot_cache`. > This assumes the AdapterBlob won't be moved around in the code cache. > @adinn does this make sense? @ashu-mehra > If the entry offsets are stored in AdapterBlob, AdapterHandlerEntry can then just store a pointer to the blob and get rid of the offsets. I planned that originally but it doesn't work because of one special case. Method `AdapterHandlerLibrary::create_abstract_method_handler` does this: address wrong_method_abstract = SharedRuntime::get_handle_wrong_method_abstract_stub(); _abstract_method_handler = AdapterHandlerLibrary::new_entry(AdapterFingerPrint::allocate(0, nullptr)); _abstract_method_handler->set_entry_points(SharedRuntime::throw_AbstractMethodError_entry(), wrong_method_abstract, wrong_method_abstract, nullptr); So, in this one case we have a Frankenstein handler assembled from 3 disparate entry points (+ nullptr) which are not derived from an associated adapter blob. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26532#issuecomment-3139109309 From adinn at openjdk.org Thu Aug 7 14:01:24 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 7 Aug 2025 14:01:24 GMT Subject: RFR: 8364558: Failure to generate compiler stubs from compiler thread should not crash VM when compilation disabled due to full CodeCache In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 23:39:59 GMT, Vladimir Kozlov wrote: >> @vnkozlov This is still intermittent as it depends on a race. 
However, I ran this repeatedly with the initial and reserved cache sizes that caused the problem and with stubs logging enabled and observed execution to continue in cases where the race would previously have caused a crash. Here is the output from such a run. Look for the new stubs log warning message in the following output: >> >> [adinn at clarapandy JDK-8364558]$ build/linux-aarch64-server-release/images/jdk/bin/java -XX:InitialCodeCacheSize=873K -XX:ReservedCodeCacheSize=995k -Xlog:stubs -version >> [0.001s][info][stubs] StubRoutines (preuniverse stubs) not generated >> [0.003s][info][stubs] StubRoutines (initial stubs) [0x0000ffff5bfb4080, 0x0000ffff5bfb6a10] used: 5388, free: 5252 >> [0.003s][info][stubs] StubRoutines (continuation stubs) [0x0000ffff5bfb7000, 0x0000ffff5bfb7a50] used: 600, free: 2040 >> [0.004s][info][stubs] StubRoutines (final stubs) [0x0000ffff5bff58c0, 0x0000ffff5c00f568] used: 11568, free: 94072 >> [0.007s][warning][codecache] CodeCache is full. Compiler has been disabled. >> [0.007s][warning][codecache] Try increasing the code cache size using -XX:ReservedCodeCacheSize= >> OpenJDK 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled. >> OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize= >> OpenJDK 64-Bit Server VM warning: C1 initialization failed. Shutting down all compilers >> CodeCache: size=1008Kb used=444Kb max_used=1008Kb free=563Kb >> bounds [0x0000ffff5bfb4000, 0x0000ffff5c0b0000, 0x0000ffff5c0b0000] >> total_blobs=215, nmethods=0, adapters=177, full_count=2 >> Compilation: disabled (not enough contiguous free space left), stopped_count=1, restarted_count=0 >> [0.007s][warning][stubs ] Ignoring failed allocation of blob Stub Generator compiler_blob under compiler thread >> OpenJDK 64-Bit Server VM warning: C2 initialization failed.
Shutting down all compilers >> openjdk version "26-internal" 2026-03-17 >> OpenJDK Runtime Environment (build 26-internal-adhoc.adinn.JDK-8364558) >> OpenJDK 64-Bit Server VM (build 26-internal-adhoc.adinn.JDK-8364558, mixed mode, sharing) >> >> >> That ought to have fixed the problem with test StartOutput. However, after running more times I spotted this (very much less frequent) error exit: >> >> >> [adinn at clarapandy JDK-8364558]$ build/linux-aarch64-server-release/images/jdk/bin/java -XX:InitialCodeCacheSize=873K -XX:ReservedCodeCacheSize=995k -Xlog:stubs -version >> [0.00... > >> Any ideas how to proceed? Shall I still continue with the current patch? > > Don't worry about adapters. It is a different issue which can happen at any time with a very small CodeCache. > >> Also, do we need a better title for the JIRA and PR? > > Yes, please update it. Maybe: "Failure to generate compiler stubs from compiler thread should not crash VM when compilation disabled due to full CodeCache" @vnkozlov Thanks for the review. Can I get a second one?
@shipilev @ashu-mehra ------------- PR Comment: https://git.openjdk.org/jdk/pull/26642#issuecomment-3164325141 From mablakatov at openjdk.org Thu Aug 7 14:34:16 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Thu, 7 Aug 2025 14:34:16 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v3] In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> <_Mianizit7riu3uwfRRkJUUMa7CDFW-HPtbjHSYXbYI=.9f1e4dd5-d3a6-4e06-aedf-152f2224ca04@github.com> Message-ID: On Mon, 4 Aug 2025 17:11:59 GMT, Evgeny Astigeevich wrote: > So you actually have: > ``` > key -> value > {trampoline_id1} -> {get_resolve_static_call_stub, {call_B_offset1, call_B_offset2}} > {trampoline_id2} -> {get_resolve_static_call_stub, {call_C_offset1, call_C_offset2, call_C_offset3}} > {trampoline_id3} -> {some_runtime_call, {call_RT_offset1, call_RT_offset2, call_RT_offset3}} > ``` Am I right in assuming you're suggesting to simplify the logic in `auto emit = [&](address dest, const CodeBuffer::Offsets &offsets)` (so the compiler doesn't need to distinguish between runtime and static calls to update the `dest` for the latter) by storing the effective `dest` addresses in the `_shared_trampoline_requests` dictionary as a part of the value? It seems to me this should work. It unifies emission logic by storing more data up front, when a shared trampoline is requested. However, I'm not sure there's a clearly better approach here. This design leads to some duplication - for instance, `trampoline_id3` should match `some_runtime_call`, if I understand correctly. Taking this into account, I'd suggest keeping the current approach and not proceeding with the alternative implementation described above. > IMO based on `SharedRuntime::resolve_helper`, we can have shared trampolines for `relocInfo::opt_virtual_call_type` and `relocInfo::virtual_call_type`.
Their trampolines have the corresponding resolver set. I don't see we write anything specific to a call site into a trampoline. I can see this being implemented as a follow-up PR. For now, I suggest establishing the core mechanism by supporting static call trampolines only. If your assumption holds, a follow-up patch should turn out rather easy to implement and review. Let me know if you spot anything in the proposed design that may hinder future extensions. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2260516201 From shade at openjdk.org Thu Aug 7 14:51:16 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 7 Aug 2025 14:51:16 GMT Subject: RFR: 8364558: Failure to generate compiler stubs from compiler thread should not crash VM when compilation disabled due to full CodeCache [v2] In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 08:50:57 GMT, Andrew Dinn wrote: >> This PR avoids bringing down the VM when a code cache allocation for the stubgen compiler blob is requested by the compiler thread and fails because the code cache is full. Instead the VM is allowed to carry on with the compiler disabled. > > Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: > > minimize changed lines and align log messages This looks like a kludge, but it is a sign that init code is a mess, not that the patch has a problem. From that vantage point, it looks reasonable. ------------- Marked as reviewed by shade (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/26642#pullrequestreview-3097507955 From mhaessig at openjdk.org Thu Aug 7 14:52:48 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Thu, 7 Aug 2025 14:52:48 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v4] In-Reply-To: References: Message-ID: <7SKfaULZBs_ccRipoMMWXKUAASHIhq9um43xaxToBKE=.83db680e-fc44-4be9-8f15-0030e764b4f8@github.com> > This PR adds `-XX:CompileTaskTimeout` on Linux to limit the amount of time a compilation task can run. The goal of this is initially to be able to find and investigate long-running compilations. > > The timeout is implemented using a POSIX timer that sends a `SIGALRM` to the compiler thread the compile task is running on. Each compiler thread registers a signal handler that triggers an assert upon receiving `SIGALRM`. This is currently only implemented for Linux, because it relies on `SIGEV_THREAD_ID` to get the signal delivered to the same thread that timed out. > > Since `SIGALRM` is now used, the test `runtime/signal/TestSigalrm.java` now requires `vm.flagless` so it will not interfere with the compiler thread signal handlers.
> > Testing: > - [ ] Github Actions > - [ ] tier1, tier2 on all platforms > - [ ] tier3, tier4 and Oracle internal testing on Linux fastdebug > - [ ] tier1 through tier4 with `-XX:CompileTaskTimeout=60000` (one minute timeout) to see what fails (`compiler/codegen/TestAntiDependenciesHighMemUsage2.java`, `compiler/loopopts/TestMaxLoopOptsCountReached.java`, and `compiler/c2/TestScalarReplacementMaxLiveNodes.java` fail) Manuel Hässig has updated the pull request incrementally with one additional commit since the last revision: ASSERT ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26023/files - new: https://git.openjdk.org/jdk/pull/26023/files/d50231f9..212afb4d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26023&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26023&range=02-03 Stats: 18 lines in 2 files changed: 12 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/26023.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26023/head:pull/26023 PR: https://git.openjdk.org/jdk/pull/26023 From mhaessig at openjdk.org Thu Aug 7 14:52:48 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Thu, 7 Aug 2025 14:52:48 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v3] In-Reply-To: References: Message-ID: <3dpJoVc2WphcDziUngOFHUf6Th74DAizG9lIclUwPCc=.655a1ba0-9fae-495b-9431-f8a2a43c0073@github.com> On Thu, 7 Aug 2025 01:11:14 GMT, Dean Long wrote: > However, you might want to simply remove _timeout_armed, or put it inside a #ifdef ASSERT, since it is only used in an assert. Good point, thank you. v3 does just that.
------------- PR Comment: https://git.openjdk.org/jdk/pull/26023#issuecomment-3164517696 From kvn at openjdk.org Thu Aug 7 15:05:19 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 7 Aug 2025 15:05:19 GMT Subject: RFR: 8364269: Simplify code cache API by storing adapter entry offsets in blob [v3] In-Reply-To: References: Message-ID: <8anPQBaOB8Rg6Y0-PWU3afCmybFFvpCpJ2zPInDF1-4=.a11f6b77-f4c6-4cb1-b9ec-99a4841ec068@github.com> On Thu, 7 Aug 2025 13:57:50 GMT, Andrew Dinn wrote: >> The AOT stub blob save and restore API handles Adapter blobs differently to other single-stub blobs by passing the associated entries in a separate auxiliary array. Storing the entry offsets in the blob, as happens with other generated blobs, simplifies the current save/restore API and implementation. It also makes it easier to define and implement an API extension supporting save and restore of multi-stub (StubGen) blobs. > > Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: > > propagate BufferBlob subclass sizes up constructor chain Looks good. Let me test it. ------------- PR Review: https://git.openjdk.org/jdk/pull/26532#pullrequestreview-3097570748 From qamai at openjdk.org Thu Aug 7 15:38:20 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 7 Aug 2025 15:38:20 GMT Subject: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v2] In-Reply-To: References: Message-ID: On Fri, 25 Jul 2025 20:09:40 GMT, Jatin Bhateja wrote: >> Patch optimizes Vector. slice operation with constant index using x86 ALIGNR instruction. >> It also adds a new hybrid call generator to facilitate lazy intrinsification or else perform procedural inlining to prevent call overhead and boxing penalties in case the fallback implementation expects to operate over vectors. The existing vector API-based slice implementation is now the fallback code that gets inlined in case intrinsification fails. 
>> >> Idea here is to add infrastructure support to enable intrinsification of fast path for selected vector APIs, else enable inlining of fall-back implementation if it's based on vector APIs. Existing call generators like PredictedCallGenerator, used to handle bi-morphic inlining, already make use of multiple call generators to handle hit/miss scenarios for a particular receiver type. The newly added hybrid call generator is lazy and called during incremental inlining optimization. It also relieves the inline expander to handle slow paths, which can easily be implemented library side (Java). >> >> Vector API jtreg tests pass at AVX level 2, remaining validation in progress. >> >> Performance numbers: >> >> >> System : 13th Gen Intel(R) Core(TM) i3-1315U >> >> Baseline: >> Benchmark (size) Mode Cnt Score Error Units >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex1 1024 thrpt 2 9444.444 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex2 1024 thrpt 2 10009.319 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithVariableIndex 1024 thrpt 2 9081.926 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex1 1024 thrpt 2 6085.825 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex2 1024 thrpt 2 6505.378 ops/ms >> VectorSliceBenchmark.intVectorSliceWithVariableIndex 1024 thrpt 2 6204.489 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex1 1024 thrpt 2 1651.334 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex2 1024 thrpt 2 1642.784 ops/ms >> VectorSliceBenchmark.longVectorSliceWithVariableIndex 1024 thrpt 2 1474.808 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex1 1024 thrpt 2 10399.394 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex2 1024 thrpt 2 10502.894 ops/ms >> VectorSliceB... 
> > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Updating predicate checks src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 7138: > 7136: // res[255:128] = {src2[127:0] , src1[255:128]} >> SHIFT > 7137: vperm2i128(xtmp, src1, src2, 0x21); > 7138: vpalignr(dst, xtmp, src1, origin, Assembler::AVX_256bit); If the slice amount is exactly 16, I think the `vpalignr` is unnecessary. src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 7159: > 7157: void C2_MacroAssembler::vector_slice_64B_op(XMMRegister dst, XMMRegister src1, XMMRegister src2, > 7158: XMMRegister xtmp, int origin, int vlen_enc) { > 7159: if (origin <= 16) { If `origin` is divisible by `4`, then a single `valignd` is enough, am I right? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2260699543 PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2260702518 From eastigeevich at openjdk.org Thu Aug 7 15:55:19 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 7 Aug 2025 15:55:19 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v3] In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> <_Mianizit7riu3uwfRRkJUUMa7CDFW-HPtbjHSYXbYI=.9f1e4dd5-d3a6-4e06-aedf-152f2224ca04@github.com> Message-ID: <-2nFSgxvTQDnTjG1yW-Lq1zfj5plegc9RCcplg3SWKs=.fb5feaf6-10ab-413e-bf64-eef5f3f5e8c2@github.com> On Thu, 7 Aug 2025 14:31:09 GMT, Mikhail Ablakatov wrote: > Am I right assuming you're suggesting to simplify ... Yes. > However, I'm not sure there's a clearly better approach here. This design leads to some duplication - for instance, trampoline_id3 should match some_runtime_call, if I understand correctly. There should not be any duplication. Why do you think so? Let's revisit the example. This time `A` has three calls of `Arrays.hashCode`. 
For large arrays we will use the stub `StubRoutines::aarch64::large_arrays_hashcode`.

Your implementation:

key -> value
{B} -> {relocInfo::static_call_type, {call_B_offset1, call_B_offset2}}
{C} -> {relocInfo::static_call_type, {call_C_offset1, call_C_offset2, call_C_offset3}}
{StubRoutines::aarch64::large_arrays_hashcode} -> {relocInfo::runtime_call_type, {call_RT_offset1, call_RT_offset2, call_RT_offset3}}

My proposal:

key -> value
{B} -> {get_resolve_static_call_stub, {call_B_offset1, call_B_offset2}}
{C} -> {get_resolve_static_call_stub, {call_C_offset1, call_C_offset2, call_C_offset3}}
{StubRoutines::aarch64::large_arrays_hashcode} -> {StubRoutines::aarch64::large_arrays_hashcode, {call_RT_offset1, call_RT_offset2, call_RT_offset3}}

I don't see any duplication.

> Taking this into account, I'd suggest keeping the current approach and not proceeding with the alternative implementation described above.

I don't understand what you mean by "the current approach". Do you mean using `Pair` and checking `relocInfo::relocType`? `Pair` obfuscates what is stored inside it. IMO this reduces readability of the program. `SharedTrampolineRequest` clearly states its purpose and the purpose of its members.

> I can see this being implemented as a follow-up PR.

I am ok with this.
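
The second layout above can be sketched as a small standalone C++ model. This is only an illustration of the data shape under discussion, not the code in the PR: addresses are modeled as strings, and `request_trampoline`, `emit_all`, `example_requests` and the numeric offsets are hypothetical names and values chosen for the example. Only `SharedTrampolineRequest` is a name taken from the message.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: addresses are plain strings so the example stays readable.
using Address = std::string;

// Value stored per trampoline: the effective destination plus every call
// site (offset into the code buffer) that will share the trampoline.
struct SharedTrampolineRequest {
  Address dest;
  std::vector<int> call_offsets;
};

using Requests = std::map<Address, SharedTrampolineRequest>;

// Record one call site. The destination is fixed when the first request
// for a key arrives; later requests only append their offset.
void request_trampoline(Requests& reqs, const Address& key,
                        const Address& dest, int call_offset) {
  auto it = reqs.find(key);
  if (it == reqs.end()) {
    it = reqs.emplace(key, SharedTrampolineRequest{dest, {}}).first;
  }
  it->second.call_offsets.push_back(call_offset);
}

// Emission needs no static/runtime distinction: it reads `dest` as stored.
// Returns (destination, number of call sites) per emitted trampoline.
std::vector<std::pair<Address, int>> emit_all(const Requests& reqs) {
  std::vector<std::pair<Address, int>> emitted;
  for (const auto& entry : reqs) {
    emitted.emplace_back(entry.second.dest,
                         static_cast<int>(entry.second.call_offsets.size()));
  }
  return emitted;
}

// The example from the message: two calls to B, three to C, three to the stub.
// The offsets are made-up numbers.
Requests example_requests() {
  Requests reqs;
  const Address resolver = "resolve_static_call_stub";
  const Address stub = "large_arrays_hashcode";
  for (int off : {10, 20})     request_trampoline(reqs, "B", resolver, off);
  for (int off : {30, 40, 50}) request_trampoline(reqs, "C", resolver, off);
  for (int off : {60, 70, 80}) request_trampoline(reqs, stub, stub, off);
  return reqs;
}
```

In this model the runtime-call entry uses the stub address both as key and as destination, while the two static-call entries share one resolver destination under different keys, which is the shape the proposal describes.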
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2260752243 From aph at openjdk.org Thu Aug 7 16:20:14 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 7 Aug 2025 16:20:14 GMT Subject: RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD In-Reply-To: <8G8Djp2bD_zzy3Jm5R3oMoTSrWrbhbCMiIHC5Kl9ufY=.eaed94c1-416a-4c95-a36e-8ba8450d7db4@github.com> References: <8G8Djp2bD_zzy3Jm5R3oMoTSrWrbhbCMiIHC5Kl9ufY=.eaed94c1-416a-4c95-a36e-8ba8450d7db4@github.com> Message-ID: On Wed, 6 Aug 2025 09:59:11 GMT, Andrew Haley wrote: >> If the relocation or target address is guaranteed to reside within the CodeCache, we can safely replace a `movz + movk + movk` sequence with a more compact and efficient `adrp + add` instruction pair. >> >> In `MacroAssembler::mov(Register r, Address dest)`, this replacement can be applied if any of the following rules hold: >> >> 1. The relocation type indicates that the address resides within the CodeCache and the necessary patching logic is provided in `fix_relocation_after_move()`. >> 2. The target address is fixed (i.e., does not require relocation) and is within the reachable range for `adrp`. >> >> The patch performs the filtering in `is_relocated_within_codecache()` and `is_adrp_reachable()` to ensure this optimization is applied safely and selectively. > > It's certainly smaller, but whether it's more efficient is dependent on the circumstances. For example, on Firestorm we can do two ADRPs per cycle, but eight MOVZ/MOVKs. > Thanks for your review @theRealAph @adinn . > > > It's certainly smaller, but whether it's more efficient is dependent on the circumstances. For example, on Firestorm we can do two ADRPs per cycle, but eight MOVZ/MOVKs. > > @theRealAph That makes a lot of sense. On Neoverse N1/N2/V1, `ADRP` and `MOVZ/MOVK` have comparable latency and throughput and share the same pipeline, so we can expect these microarchitectures to benefit from this patch. 
I'm planning to update the patch with an `aarch64` option to enable or disable this replacement based on the target microarchitecture. What do you think? No, the difference isn't worth any added complexity. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26653#issuecomment-3164894932 From adinn at openjdk.org Thu Aug 7 16:26:22 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 7 Aug 2025 16:26:22 GMT Subject: RFR: 8364558: Failure to generate compiler stubs from compiler thread should not crash VM when compilation disabled due to full CodeCache [v2] In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 14:49:06 GMT, Aleksey Shipilev wrote: > This looks a kludge, but it is a sign that init code is a mess, not that the patch has a problem. From that vantage point, it looks reasonable. Yes, it's far from perfect and still leaves us with the possibility of a VM exit from a racing C1 thread that wants an adapter. But it is undoubtedly less bad than the status quo. Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26642#issuecomment-3164912193 From adinn at openjdk.org Thu Aug 7 16:26:23 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 7 Aug 2025 16:26:23 GMT Subject: Integrated: 8364558: Failure to generate compiler stubs from compiler thread should not crash VM when compilation disabled due to full CodeCache In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 13:21:01 GMT, Andrew Dinn wrote: > This PR avoids bringing down the VM when a code cache allocation for the stubgen compiler blob is requested by the compiler thread and fails because the code cache is full. Instead the VM is allowed to carry on with the compiler disabled. This pull request has now been integrated.
Changeset: 90ea42f7 Author: Andrew Dinn URL: https://git.openjdk.org/jdk/commit/90ea42f716770fd567e4e3b3bf7466fa93964f07 Stats: 17 lines in 2 files changed: 17 ins; 0 del; 0 mod 8364558: Failure to generate compiler stubs from compiler thread should not crash VM when compilation disabled due to full CodeCache Reviewed-by: kvn, shade ------------- PR: https://git.openjdk.org/jdk/pull/26642 From fgao at openjdk.org Thu Aug 7 16:47:19 2025 From: fgao at openjdk.org (Fei Gao) Date: Thu, 7 Aug 2025 16:47:19 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() In-Reply-To: References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Wed, 6 Aug 2025 09:30:07 GMT, Andrew Haley wrote: >>> Unaligned memory access may be non-atomic, although in this case, other threads aren't modifying the data. >> >> I thought other threads are modifying the data, which is the reason why we needed the ISB. > >> > Unaligned memory access may be non-atomic, although in this case, other threads aren't modifying the data. >> >> I thought other threads are modifying the data, which is the reason why we needed the ISB. > > Yes, exactly. Other threads are executing this nmethod while we are modifying its static call stub. > > We hold the lock while we're doing this, so any thread racing with us to _modify_ the call stub would be blocked. However, another thread could execute the call we just patched but not fully observe the changes we made to the stub, which is why we need the ISB. > > We have a similar situation with this PR, but with data rather than instruction memory. We need to make sure that any racing thread observes changes to the stub before it observes the patched call to the stub. There is no ordering between instruction and data memory, which is why the current stub uses only instruction memory, keeping everything right.
The ISB ensures that the executing thread observes the changed stub: there is a happens-before from the `ICache::invalidate_range` after patching the stub (but before patching the call) to the `isb` at the start of the stub. > > For this PR to be correctly synchronized, we'd need a similar happens-before from patching the constants used in the stub to executing the stub. I can think of a way to do that, but it's fiddly and may not be worth it. Thanks for reviewing, @theRealAph @adinn @dean-long . Also thanks a lot for the explanation. It's really helpful and clarified why the `isb` is necessary in the current implementation. This patch is inspired by the trampoline stub design, and I'm trying to understand why that approach works as it does. Happy to be corrected if I've got anything wrong. Initially, we generate a trampoline stub to redirect control flow from the caller to the static call resolver. Once the static stub is patched, we also update the trampoline stub, though in some cases the call may bypass the trampoline entirely and jump directly from the caller to the static stub. When the callee is eventually compiled, we reset the trampoline to once again direct the call back to the resolver. After resolution completes, we then update the trampoline to point to the compiled callee. My question is: Why isn't any explicit memory synchronization required before executing the trampoline stub? How do we ensure that any data loaded by the trampoline stub is fully synchronized across multiple threads? I'd appreciate any insights. Thank you very much!
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26638#discussion_r2260874295 From aph at openjdk.org Thu Aug 7 16:52:14 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 7 Aug 2025 16:52:14 GMT Subject: RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD In-Reply-To: References: <8G8Djp2bD_zzy3Jm5R3oMoTSrWrbhbCMiIHC5Kl9ufY=.eaed94c1-416a-4c95-a36e-8ba8450d7db4@github.com> Message-ID: On Thu, 7 Aug 2025 16:17:09 GMT, Andrew Haley wrote: > Thanks for your review @theRealAph @adinn . > > > It's certainly smaller, but whether it's more efficient is dependent on the circumstances. For example, on Firestorm we can do two ADRPs per cycle, but eight MOVZ/MOVKs. > > @theRealAph That makes a lot of sense. On Neoverse N1/N2/V1, `ADRP` and `MOVZ/MOVK` have comparable latency and throughput and share the same pipeline, I've done some modelling using llvm-mca and it looks like `adrp; add` is a win on recent Apple processors as well as on Arm processors, so go ahead with making this the default. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26653#issuecomment-3164998150 From duke at openjdk.org Thu Aug 7 16:55:48 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Thu, 7 Aug 2025 16:55:48 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency Message-ID: In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. 

linux-x64 GCC
master       real   81.39  user 3352.15  sys 287.49
JDK-8365053  real   81.94  user 3030.24  sys 295.82

linux-x64 Clang
master       real   43.44  user 2082.93  sys 130.70
JDK-8365053  real   38.44  user 1723.80  sys 117.68

linux-aarch64 GCC
master       real 1188.08  user 2015.22  sys 175.53
JDK-8365053  real 1019.85  user 1667.45  sys 171.86

------------- Commit messages: - refresh Changes: https://git.openjdk.org/jdk/pull/26681/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8365053 Stats: 41 lines in 1 file changed: 1 ins; 28 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/26681.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26681/head:pull/26681 PR: https://git.openjdk.org/jdk/pull/26681 From aph at openjdk.org Thu Aug 7 17:04:17 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 7 Aug 2025 17:04:17 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() In-Reply-To: References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Thu, 7 Aug 2025 16:44:11 GMT, Fei Gao wrote: > My question is: Why isn't any explicit memory synchronization required before executing the trampoline stub? How do we ensure that any data loaded by the trampoline stub is fully synchronized across multiple threads? It isn't. Other threads may call the static call resolver multiple times even after it's been patched. The difference is that for a trampoline call there is only one address to load so it can be patched atomically by a single store, and it doesn't matter if the address is stale. We could read/write the { code; method } pair inside a critical section, thus guaranteeing that the operation is correctly synchronized. It might well be a bit faster, but it would be fiddly, especially on Apple silicon.
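
As a loose analogy for the single-address case described above, here is a minimal C++ sketch using `std::atomic`. It is not how HotSpot patches trampolines and says nothing about AArch64 instruction or data cache behaviour; `TrampolineSlot`, `resolver` and `compiled_callee` are hypothetical names. It only shows why one pointer-sized destination can be replaced with a single store, so that a reader gets either the old or the new target, never a mix.

```cpp
#include <atomic>
#include <cassert>

// A call target, modeled as an ordinary function pointer.
using Target = int (*)();

int resolver()        { return 0; }   // stands in for the static call resolver
int compiled_callee() { return 42; }  // stands in for the compiled method

// One pointer-sized slot. A single atomic store replaces the whole
// destination, so a concurrent caller loads either the previous target or
// the new one. A stale target is still a valid function to call.
class TrampolineSlot {
 public:
  explicit TrampolineSlot(Target initial) : _dest(initial) {}

  // Patch the destination with one store (default sequentially consistent
  // ordering, chosen here only to keep the model simple).
  void patch(Target t) { _dest.store(t); }

  // Load the current destination and call through it.
  int call() const { return _dest.load()(); }

 private:
  std::atomic<Target> _dest;
};
```

A two-field `{ code; method }` pair has no equivalent single store in this model, which is why the message mentions a critical section for that case.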
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26638#discussion_r2260916337 From shade at openjdk.org Thu Aug 7 17:06:16 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 7 Aug 2025 17:06:16 GMT Subject: RFR: 8364269: Simplify code cache API by storing adapter entry offsets in blob [v3] In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 13:57:50 GMT, Andrew Dinn wrote: >> The AOT stub blob save and restore API handles Adapter blobs differently to other single-stub blobs by passing the associated entries in a separate auxiliary array. Storing the entry offsets in the blob, as happens with other generated blobs, simplifies the current save/restore API and implementation. It also makes it easier to define and implement an API extension supporting save and restore of multi-stub (StubGen) blobs. > > Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: > > propagate BufferBlob subclass sizes up constructor chain This looks fine to me. API got indeed simpler. src/hotspot/share/code/codeBlob.hpp line 410: > 408: static const int ENTRY_COUNT = 4; > 409: private: > 410: AdapterBlob(int size, CodeBuffer* cb, int entry_offset[ENTRY_COUNT]); I was today years old when I realized you can pass const-sized arrays as arguments. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26532#pullrequestreview-3097985317 PR Review Comment: https://git.openjdk.org/jdk/pull/26532#discussion_r2260891380 From shade at openjdk.org Thu Aug 7 17:10:15 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 7 Aug 2025 17:10:15 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 16:47:04 GMT, Francesco Andreuzzi wrote: > In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. 
The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. > > These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. > > > linux-x64 GCC > master real 81.39 user 3352.15 sys 287.49 > JDK-8365053 real 81.94 user 3030.24 sys 295.82 > > linux-x64 Clang > master real 43.44 user 2082.93 sys 130.70 > JDK-8365053 real 38.44 user 1723.80 sys 117.68 > > linux-aarch64 GCC > master real 1188.08 user 2015.22 sys 175.53 > JDK-8365053 real 1019.85 user 1667.45 sys 171.86 src/hotspot/share/precompiled/precompiled.hpp line 29: > 27: > 28: // These header files are included in at least 130 C++ files, as of > 29: // measurements made in November 2018. This list excludes files named Keep this comment, just update the date (seeing how threshold is the same?). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2260927049 From shade at openjdk.org Thu Aug 7 17:35:17 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 7 Aug 2025 17:35:17 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 16:47:04 GMT, Francesco Andreuzzi wrote: > In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. > > These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. 
> > > linux-x64 GCC > master real 81.39 user 3352.15 sys 287.49 > JDK-8365053 real 81.94 user 3030.24 sys 295.82 > > linux-x64 Clang > master real 43.44 user 2082.93 sys 130.70 > JDK-8365053 real 38.44 user 1723.80 sys 117.68 > > linux-aarch64 GCC > master real 1188.08 user 2015.22 sys 175.53 > JDK-8365053 real 1019.85 user 1667.45 sys 171.86 Also, maybe check in the generation script as well? I think a subfolder here would be fine: https://github.com/openjdk/jdk/tree/master/src/utils. It would be nice if the output of that script could be simply piped into `precompiled.hpp`, so we can use it later without extra work. We can, technically, hook it up a similar way SortIncludes.java is currently done, but I think it is unnecessary at this point. In a perfect world we would not be needing precompiled headers. In less ideal world, having a jtreg test that warns us that a new popular header appeared, or that older header is not as popular anymore, would be handy. Again, this is something out of scope for this PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3165135998 From shade at openjdk.org Thu Aug 7 18:03:14 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 7 Aug 2025 18:03:14 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 16:47:04 GMT, Francesco Andreuzzi wrote: > In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. > > These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. 
> > > linux-x64 GCC > master real 81.39 user 3352.15 sys 287.49 > JDK-8365053 real 81.94 user 3030.24 sys 295.82 > > linux-x64 Clang > master real 43.44 user 2082.93 sys 130.70 > JDK-8365053 real 38.44 user 1723.80 sys 117.68 > > linux-aarch64 GCC > master real 1188.08 user 2015.22 sys 175.53 > JDK-8365053 real 1019.85 user 1667.45 sys 171.86 On my M1, with some AV software that always get in the way, I am seeing 15% faster build, which saves about half a minute. This is great! CONF=macosx-aarch64-server-fastdebug LOG=info make hotspot 1245.30s user 230.07s system 640% cpu 3:50.49 total CONF=macosx-aarch64-server-fastdebug LOG=info make hotspot 1064.59s user 203.24s system 618% cpu 3:20.09 total ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3165209211 From duke at openjdk.org Thu Aug 7 18:51:04 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Thu, 7 Aug 2025 18:51:04 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v2] In-Reply-To: References: Message-ID: > In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. > > These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. 
> > > linux-x64 GCC > master real 81.39 user 3352.15 sys 287.49 > JDK-8365053 real 81.94 user 3030.24 sys 295.82 > > linux-x64 Clang > master real 43.44 user 2082.93 sys 130.70 > JDK-8365053 real 38.44 user 1723.80 sys 117.68 > > linux-aarch64 GCC > master real 1188.08 user 2015.22 sys 175.53 > JDK-8365053 real 1019.85 user 1667.45 sys 171.86 Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: keep comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26681/files - new: https://git.openjdk.org/jdk/pull/26681/files/9e0cb7e8..a75494f1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=00-01 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26681.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26681/head:pull/26681 PR: https://git.openjdk.org/jdk/pull/26681 From duke at openjdk.org Thu Aug 7 18:51:05 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Thu, 7 Aug 2025 18:51:05 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v2] In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 17:07:17 GMT, Aleksey Shipilev wrote: >> Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: >> >> keep comment > > src/hotspot/share/precompiled/precompiled.hpp line 29: > >> 27: >> 28: // These header files are included in at least 130 C++ files, as of >> 29: // measurements made in November 2018. This list excludes files named > > Keep this comment, just update the date (seeing how threshold is the same?). The part about `.inline.hpp` is not relevant anymore though, I've seen significant improvements after including a couple of them. I'll keep the rest. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2261151719 From duke at openjdk.org Thu Aug 7 18:51:06 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Thu, 7 Aug 2025 18:51:06 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v2] In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 18:44:29 GMT, Francesco Andreuzzi wrote: >> src/hotspot/share/precompiled/precompiled.hpp line 29: >> >>> 27: >>> 28: // These header files are included in at least 130 C++ files, as of >>> 29: // measurements made in November 2018. This list excludes files named >> >> Keep this comment, just update the date (seeing how threshold is the same?). > > The part about `.inline.hpp` is not relevant anymore though, I've seen significant improvements after including a couple of them. I'll keep the rest. a75494f10f47a032f01e7e967c3b937166d8780d ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2261153641 From dlong at openjdk.org Thu Aug 7 18:59:36 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 7 Aug 2025 18:59:36 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v4] In-Reply-To: <7SKfaULZBs_ccRipoMMWXKUAASHIhq9um43xaxToBKE=.83db680e-fc44-4be9-8f15-0030e764b4f8@github.com> References: <7SKfaULZBs_ccRipoMMWXKUAASHIhq9um43xaxToBKE=.83db680e-fc44-4be9-8f15-0030e764b4f8@github.com> Message-ID: On Thu, 7 Aug 2025 14:52:48 GMT, Manuel Hässig wrote: >> This PR adds `-XX:CompileTaskTimeout` on Linux to limit the amount of time a compilation task can run. The goal of this is initially to be able to find and investigate long-running compilations. >> >> The timeout is implemented using a POSIX timer that sends a `SIGALRM` to the compiler thread the compile task is running on. Each compiler thread registers a signal handler that triggers an assert upon receiving `SIGALRM`.
This is currently only implemented for Linux, because it relies on `SIGEV_THREAD_ID` to get the signal delivered to the same thread that timed out. >> >> Since `SIGALRM` is now used, the test `runtime/signal/TestSigalrm.java` now requires `vm.flagless` so it will not interfere with the compiler thread signal handlers. >> >> Testing: >> - [ ] Github Actions >> - [ ] tier1, tier2 on all platforms >> - [ ] tier3, tier4 and Oracle internal testing on Linux fastdebug >> - [ ] tier1 through tier4 with `-XX:CompileTaskTimeout=60000` (one minute timeout) to see what fails (`compiler/codegen/TestAntiDependenciesHighMemUsage2.java`, `compiler/loopopts/TestMaxLoopOptsCountReached.java`, and `compiler/c2/TestScalarReplacementMaxLiveNodes.java` fail) > > Manuel Hässig has updated the pull request incrementally with one additional commit since the last revision: > > ASSERT Thinking about _timeout_armed a little more, the fact that the signal handler received TIMEOUT_SIGNAL should be enough. The value of _timeout_armed should be redundant, and your assert could be changed to: assert(false, "compile task timed out"); and _timeout_armed could be removed. It's just an inexact mirror of the timer state. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26023#issuecomment-3165377812 From asmehra at openjdk.org Thu Aug 7 19:04:13 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Thu, 7 Aug 2025 19:04:13 GMT Subject: RFR: 8364269: Simplify code cache API by storing adapter entry offsets in blob [v3] In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 13:57:50 GMT, Andrew Dinn wrote: >> The AOT stub blob save and restore API handles Adapter blobs differently to other single-stub blobs by passing the associated entries in a separate auxiliary array. Storing the entry offsets in the blob, as happens with other generated blobs, simplifies the current save/restore API and implementation.
It also makes it easier to define and implement an API extension supporting save and restore of multi-stub (StubGen) blobs. > > Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: > > propagate BufferBlob subclass sizes up constructor chain lgtm ------------- Marked as reviewed by asmehra (Committer). PR Review: https://git.openjdk.org/jdk/pull/26532#pullrequestreview-3098416533 From erikj at openjdk.org Thu Aug 7 19:54:13 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Thu, 7 Aug 2025 19:54:13 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v2] In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 18:51:04 GMT, Francesco Andreuzzi wrote: >> In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. >> >> These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. >> >> >> linux-x64 GCC >> master real 81.39 user 3352.15 sys 287.49 >> JDK-8365053 real 81.94 user 3030.24 sys 295.82 >> >> linux-x64 Clang >> master real 43.44 user 2082.93 sys 130.70 >> JDK-8365053 real 38.44 user 1723.80 sys 117.68 >> >> linux-aarch64 GCC >> master real 1188.08 user 2015.22 sys 175.53 >> JDK-8365053 real 1019.85 user 1667.45 sys 171.86 >> >> linux-aarch64 clang >> master real 981.77 user 1645.05 sys 118.60 >> JDK-8365053 real 791.96 user 1262.92 sys 101.50 > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > keep comment I tried it on Windows. Build time for hotspot was basically the same before and after, so no regression at least. So I'm fine with this change. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3165525209 From duke at openjdk.org Thu Aug 7 20:00:14 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Thu, 7 Aug 2025 20:00:14 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 17:32:35 GMT, Aleksey Shipilev wrote: > Also, maybe check in the generation script as well? I think a subfolder here would be fine: https://github.com/openjdk/jdk/tree/master/src/utils. It would be nice if the output of that script could be simply piped into `precompiled.hpp`, so we can use it later without extra work. > > We can, technically, hook it up a similar way SortIncludes.java is currently done, but I think it is unnecessary at this point. In a perfect world we would not be needing precompiled headers. In less ideal world, having a jtreg test that warns us that a new popular header appeared, or that older header is not as popular anymore, would be handy. Again, this is something out of scope for this PR. I'll add the generation script, it looks like something that could be useful at some point in the future. I think there should be a human curator though, because the process is not completely automatic. For example, when the threshold is set to 50 I get this compilation error: /jdk/src/hotspot/share/compiler/compilerEvent.cpp:59:36: error: redefinition of 'phase_names' with a different type: 'GrowableArray *' vs 'const char *[100]' static GrowableArray* phase_names = nullptr; ^ /jdk/src/hotspot/share/opto/phasetype.hpp:147:20: note: previous definition is here static const char* phase_names[] = { It could happen with any threshold, even 130 as the codebase evolves. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3165535039 From kvn at openjdk.org Thu Aug 7 21:54:12 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 7 Aug 2025 21:54:12 GMT Subject: RFR: 8364269: Simplify code cache API by storing adapter entry offsets in blob [v3] In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 13:57:50 GMT, Andrew Dinn wrote: >> The AOT stub blob save and restore API handles Adapter blobs differently to other single-stub blobs by passing the associated entries in a separate auxiliary array. Storing the entry offsets in the blob, as happens with other generated blobs, simplifies the current save/restore API and implementation. It also makes it easier to define and implement an API extension supporting save and restore of multi-stub (StubGen) blobs. > > Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: > > propagate BufferBlob subclass sizes up constructor chain My testing passed. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26532#pullrequestreview-3098867813 From dholmes at openjdk.org Thu Aug 7 22:24:11 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 7 Aug 2025 22:24:11 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> Message-ID: On Thu, 7 Aug 2025 12:34:21 GMT, Aleksey Shipilev wrote: >>> that might miss updates as well. >> >> Might miss updates to what??? We never follow the pointer. 
The signalling thread doesn't do anything that the signallee has to look at, the signallee just gets the signal, checks if it was the timeout target and aborts. >> >> We use fence() for "single variable visibility reasons" in a number of places because it is the only thing that actually forces memory synchronization across all threads. acquire/release just says "if you see this then you must see that", it doesn't do anything to help you "see this" in the first place. > >> Might miss updates to what??? We never follow the pointer. > > Following the pointer is not relevant. We can miss updates to the thread pointer. Note that reader needs at least coherence to break out of potential loops like: > > > while (get_handshake_thread() == nullptr); // burn > > > Acquire gives that, among other things. Overlooking these properties is easy, and this is why we should be sticking to use atomics anywhere we transfer data between threads. > >> We use fence() for "single variable visibility reasons" in a number of places because it is the only thing that actually forces memory synchronization across all threads. > > Well, yes, this part always felt dubious to me. I am restraining myself from going into theoretical argument here, because it would take a lot of time, and would not be helpful for this work. > > Since no performance is on the table, we should just do the most obvious/safe thing. I believe it is a proper style to use `Atomic::load_acquire` and `Atomic::release_store_fence`, instead of one-sided `fence`. Blindly applying acquire/release is not at all the obvious/safe thing to me. I do not understand your reference to `get_handshake_thread()` above - this is not about handshakes. Here we have a very simple situation: - Thread 1: set field; signal target thread - Target thread: receive signal; read field All we need to "transfer" here is the value of "field". acquire/release does not achieve that it only says what happens _if_ you see the new value of field.
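A minimal sketch of the two-step pattern described above (set field, notify, read field), written with standard C++ `std::atomic` rather than HotSpot's `Atomic` API. The names `g_field`, `g_signalled`, `set_and_signal`, `wait_and_read` and `run_once` are hypothetical, and an atomic flag stands in for the real signal, so this says nothing about whether signal delivery itself orders memory. It only illustrates the release/acquire form: a reader that observes the flag is then guaranteed to observe the field.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-ins; none of these names come from HotSpot.
static std::atomic<int>  g_field{0};          // the value being "transferred"
static std::atomic<bool> g_signalled{false};  // models raising/delivering the signal

// Thread 1: set field, then "signal". The release store orders the
// field write before the flag write.
void set_and_signal(int value) {
    g_field.store(value, std::memory_order_relaxed);
    g_signalled.store(true, std::memory_order_release);
}

// Target thread: wait for the "signal", then read the field. The acquire
// load pairs with the release store in set_and_signal().
int wait_and_read() {
    while (!g_signalled.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return g_field.load(std::memory_order_relaxed);
}

// Runs one round on two threads and returns what the target observed.
int run_once(int value) {
    g_field.store(0);
    g_signalled.store(false);
    int observed = -1;
    std::thread target([&] { observed = wait_and_read(); });
    std::thread sender([&] { set_and_signal(value); });
    sender.join();
    target.join();  // join() makes the write to `observed` visible here
    return observed;
}
```

Under these assumptions `run_once(v)` returns `v`. Whether a real signal can replace the flag without any extra ordering is exactly the point being debated in the thread, and the sketch does not settle it.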
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2261569500 From dholmes at openjdk.org Thu Aug 7 22:29:13 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 7 Aug 2025 22:29:13 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> Message-ID: On Thu, 7 Aug 2025 22:21:29 GMT, David Holmes wrote: >>> Might miss updates to what??? We never follow the pointer. >> >> Following the pointer is not relevant. We can miss updates to the thread pointer. Note that reader needs at least coherence to break out of potential loops like: >> >> >> while (get_handshake_thread() == nullptr); // burn >> >> >> Acquire gives that, among other things. Overlooking these properties is easy, and this is why we should be sticking to use atomics anywhere we transfer data between threads. >> >>> We use fence() for "single variable visibility reasons" in a number of places because it is the only thing that actually forces memory synchronization across all threads. >> >> Well, yes, this part always felt dubious to me. I am restraining myself from going into theoretical argument here, because it would take a lot of time, and would not be helpful for this work. >> >> Since no performance is on the table, we should just do the most obvious/safe thing. I believe it is a proper style to use `Atomic::load_acquire` and `Atomic::release_store_fence`, instead of one-sided `fence`. > > Blindly applying acquire/release is not at all the obvious/safe thing to me. 
I do not understand your reference to `get_handshake_thread()` above - this is not about handshakes. Here we have a very simple situation: > - Thread 1: set field; signal target thread > - Target thread: receive signal; read field > All we need to "transfer" here is the value of "field". acquire/release does not achieve that it only says what happens _if_ you see the new value of field. Though as I said elsewhere, although the spec does not require it, I would be very surprised if signal raising did not, in practice, "happen-before" signal delivery. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2261575130 From dholmes at openjdk.org Thu Aug 7 23:40:15 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 7 Aug 2025 23:40:15 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> Message-ID: <25VDLj8CrrdVmydx-vou5S_-vxT_uqh5c0OVPubW7XM=.d515ad77-1b2c-4d26-ab78-554e3fe7b940@github.com> On Thu, 7 Aug 2025 22:26:49 GMT, David Holmes wrote: >> Blindly applying acquire/release is not at all the obvious/safe thing to me. I do not understand your reference to `get_handshake_thread()` above - this is not about handshakes. Here we have a very simple situation: >> - Thread 1: set field; signal target thread >> - Target thread: receive signal; read field >> All we need to "transfer" here is the value of "field". acquire/release does not achieve that it only says what happens _if_ you see the new value of field. 
> > Though as I said elsewhere, although the spec does not require it, I would be very surprised if signal raising did not, in practice, "happen-before" signal delivery. > acquire/release does not achieve that it only says what happens if you see the new value of field Note I am describing the semantic model for acquire/release in a general sense - as we describe in orderAccess.hpp. An actual implementation may ensure memory operations actually complete before the `release` in a similar way to a `fence` (e.g. X86 `mfence`). A "fence" may only ensure ordering in its abstract description but that suffices, as the store to the field must happen before any of the stores related to raising the signal, which must happen before the signal can actually be delivered, which happens before we load the field. Hence by transitivity we are guaranteed to see the fields written value, by using the fence. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2261646124 From duke at openjdk.org Fri Aug 8 00:43:27 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 00:43:27 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: References: Message-ID: > In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. > > These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. 
> > > linux-x64 GCC > master real 81.39 user 3352.15 sys 287.49 > JDK-8365053 real 81.94 user 3030.24 sys 295.82 > > linux-x64 Clang > master real 43.44 user 2082.93 sys 130.70 > JDK-8365053 real 38.44 user 1723.80 sys 117.68 > > linux-aarch64 GCC > master real 1188.08 user 2015.22 sys 175.53 > JDK-8365053 real 1019.85 user 1667.45 sys 171.86 > > linux-aarch64 clang > master real 981.77 user 1645.05 sys 118.60 > JDK-8365053 real 791.96 user 1262.92 sys 101.50 Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: script. update ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26681/files - new: https://git.openjdk.org/jdk/pull/26681/files/a75494f1..d6cd068b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=01-02 Stats: 167 lines in 3 files changed: 167 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26681.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26681/head:pull/26681 PR: https://git.openjdk.org/jdk/pull/26681 From jjiang at openjdk.org Fri Aug 8 02:31:15 2025 From: jjiang at openjdk.org (John Jiang) Date: Fri, 8 Aug 2025 02:31:15 GMT Subject: Integrated: 8364597: Replace THL A29 Limited with Tencent In-Reply-To: References: Message-ID: On Mon, 4 Aug 2025 07:38:01 GMT, John Jiang wrote: > `THL A29 Limited` was a `Tencent` company but was dissolved. > So, the copyright notes just use `Tencent` directly. This pull request has now been integrated. 
Changeset: 4c9eadda Author: John Jiang URL: https://git.openjdk.org/jdk/commit/4c9eaddaef83c6ba30e27ae3e0d16caeeec206cb Stats: 67 lines in 58 files changed: 0 ins; 9 del; 58 mod 8364597: Replace THL A29 Limited with Tencent Reviewed-by: jiefu ------------- PR: https://git.openjdk.org/jdk/pull/26614 From qamai at openjdk.org Fri Aug 8 03:10:12 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 8 Aug 2025 03:10:12 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: References: Message-ID: <4T5YaeiJd_xlg_vaHnee8S4jgAoXuZYvq8bCigtLIJk=.0838e9fd-3b8d-47f7-9e99-320917fcfa0e@github.com> On Fri, 8 Aug 2025 00:43:27 GMT, Francesco Andreuzzi wrote: >> In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. >> >> These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. >> >> >> linux-x64 GCC >> master real 81.39 user 3352.15 sys 287.49 >> JDK-8365053 real 81.94 user 3030.24 sys 295.82 >> >> linux-x64 Clang >> master real 43.44 user 2082.93 sys 130.70 >> JDK-8365053 real 38.44 user 1723.80 sys 117.68 >> >> linux-aarch64 GCC >> master real 1188.08 user 2015.22 sys 175.53 >> JDK-8365053 real 1019.85 user 1667.45 sys 171.86 >> >> linux-aarch64 clang >> master real 981.77 user 1645.05 sys 118.60 >> JDK-8365053 real 791.96 user 1262.92 sys 101.50 > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > script. 
update src/utils/PrecompiledHeaders/PrecompiledHeaders.java line 73: > 71: paths.filter(Files::isRegularFile) > 72: .filter(path -> { > 73: String name = path.getFileName().toString(); IIUC this means that we count the number of cpp or hpp files that directly include a file, but should we count the number of cpp files that directly or transitively include a file instead? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2261838083 From dholmes at openjdk.org Fri Aug 8 04:39:11 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 8 Aug 2025 04:39:11 GMT Subject: RFR: 8359222: [asan] jvmti/vthread/ToggleNotifyJvmtiTest/ToggleNotifyJvmtiTest triggers stack-buffer-overflow error In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 16:52:23 GMT, Patricio Chilano Mateo wrote: > Please review the following small change. On linux-aarch64, ASan reports an overflow error on a read that occurs when frames are copied from the stack to the heap, specifically during preemption on monitorenter and Object.wait when coming from compiled code. This read is benign and comes from reading either all (freeze fast path case) or part (freeze slow path case) of the `frame::metadata_words_at_bottom` stored in the callee frame (return pc+fp) for the top frame. In these preemption cases the callee frame is the frame of the VM native method being called (e.g. `Runtime1::monitorenter`). Now, the Arm 64-bit ABI doesn't specify a location where the frame record (returnpc+fp) has to be stored within a stack frame and GCC currently chooses to save it at the top of the frame (lowest address). So this causes ASan to treat the memory access in the callee as an overflow access to one of the locals stored in that frame. > > Accessing the callee to read `frame::metadata_words_at_bottom` is only necessary when the top frame is compiled.
In that case, the callee is `Continuation.doYield` and those words contain the return pc and fp of the top frame we are freezing, where possibly fp might contain an oop. For the preemption cases mentioned above though, the return pc used for the top frame is always the one from the anchor, not the stack. Also, there is never a callee saved value in fp that needs to be preserved, the fp is always constructed again when thawing the frame. But because of sharing the code path with the compiled frame case we also read these words. I've added the full details on where these memory accesses occur in the freeze code, both slow and fast paths, in the bug comments. > > The fix is to adjust these shared paths to avoid the read in the preemption case. I was able to reproduce the issue and have verified it is now fixed. I also ran the patch through mach5 tiers1-7. > > Thanks, > Patricio Seems reasonable - though I dislike that we have to add ASAN and CPU specific code this way. A couple of minor nits. Thanks. src/hotspot/cpu/aarch64/continuationFreezeThaw_aarch64.inline.hpp line 106: > 104: // For a compiled frame we need to re-read fp out of the frame because it may be an > 105: // oop and we might have had a safepoint in finalize_freeze, after constructing f. > 106: // For stub/native frames the value is not used while freezed, and will be constructed Suggestion: // For stub/native frames the value is not used while frozen, and will be constructed src/hotspot/cpu/x86/continuationFreezeThaw_x86.inline.hpp line 103: > 101: // For a compiled frame we need to re-read fp out of the frame because it may be an > 102: // oop and we might have had a safepoint in finalize_freeze, after constructing f. > 103: // For stub/native frames the value is not used while freezed, and will be constructed Suggestion: // For stub/native frames the value is not used while frozen, and will be constructed src/hotspot/share/runtime/continuationFreezeThaw.cpp line 796: > 794: // frame (lowest address).
ASan treats this memory access in the callee as > 795: // an overflow access to one of the locals stored in that frame. For these > 796: // preemption case we don't need to read these words anyways so we avoid it. Suggestion: // preemption cases we don't need to read these words anyways so we avoid it. src/hotspot/share/runtime/continuationFreezeThaw.cpp line 797: > 795: // an overflow access to one of the locals stored in that frame. For these > 796: // preemption case we don't need to read these words anyways so we avoid it. > 797: adjust -= _preempt ? frame::metadata_words_at_bottom : 0; Suggestion: if (_preempt) { adjust = 0; } ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26660#pullrequestreview-3099411836 PR Review Comment: https://git.openjdk.org/jdk/pull/26660#discussion_r2261898883 PR Review Comment: https://git.openjdk.org/jdk/pull/26660#discussion_r2261903871 PR Review Comment: https://git.openjdk.org/jdk/pull/26660#discussion_r2261905677 PR Review Comment: https://git.openjdk.org/jdk/pull/26660#discussion_r2261908614 From dholmes at openjdk.org Fri Aug 8 04:57:16 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 8 Aug 2025 04:57:16 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 19:55:35 GMT, Francesco Andreuzzi wrote: > For example, when the threshold is set to 50 I get this compilation error: How can we possibly get a compilation error when everything should build with PCH disabled? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3166572069 From kbarrett at openjdk.org Fri Aug 8 06:36:19 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 8 Aug 2025 06:36:19 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 00:43:27 GMT, Francesco Andreuzzi wrote: >> In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. >> >> These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. >> >> >> linux-x64 GCC >> master real 81.39 user 3352.15 sys 287.49 >> JDK-8365053 real 81.94 user 3030.24 sys 295.82 >> >> linux-x64 Clang >> master real 43.44 user 2082.93 sys 130.70 >> JDK-8365053 real 38.44 user 1723.80 sys 117.68 >> >> linux-aarch64 GCC >> master real 1188.08 user 2015.22 sys 175.53 >> JDK-8365053 real 1019.85 user 1667.45 sys 171.86 >> >> linux-aarch64 clang >> master real 981.77 user 1645.05 sys 118.60 >> JDK-8365053 real 791.96 user 1262.92 sys 101.50 > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > script. update Changes requested by kbarrett (Reviewer). src/hotspot/share/precompiled/precompiled.hpp line 29: > 27: > 28: // These header files are included in at least 130 C++ files, as of > 29: // measurements made in August 2025. I don't think there's anything particularly special about the number 130. Another thing to consider when a header file has a high include count is whether it's being overincluded. We've had lots of those, and some folks occasionally try to poke at that problem. 
Some of the removals here look like they might be a result of such efforts. Still another thing to consider is the cost of inclusion. Some files may just be a lot more expensive to process and benefit more for being precompiled. File size can be an indicator, but there are others. Unfortunately, I don't know of a good way to measure this. src/hotspot/share/precompiled/precompiled.hpp line 70: > 68: #include "utilities/ticks.hpp" > 69: > 70: #ifdef TARGET_COMPILER_visCPP All of the reported testing was on Linux. These were included specifically because measurements said their inclusion here was beneficial. And my recollection from previous discussions is that Visual Studio may be the compiler where precompiled headers are most beneficial. src/utils/PrecompiledHeaders/PrecompiledHeaders.java line 70: > 68: // Count inclusion times for each header > 69: Map occurrences = new HashMap<>(); > 70: try (Stream paths = Files.walk(hotspotPath)) { I think walking the source tree is the wrong approach to gathering the data about include counts. I think better is to do a build and then look at the files /hotspot/variant-server/libjvm/objs/*.d. From that one can build a completely accurate count of the transitive inclusions. ------------- PR Review: https://git.openjdk.org/jdk/pull/26681#pullrequestreview-3099600154 PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2262050884 PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2262055386 PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2262041141 From kbarrett at openjdk.org Fri Aug 8 06:42:13 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 8 Aug 2025 06:42:13 GMT Subject: RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v5] In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 02:54:22 GMT, Kim Barrett wrote: >> Thanks for implementing nice code for PPC64! I appreciate it! 
The shared code and the other platforms look fine, too. >> Maybe atomic bitwise operations could be used, but I'm happy with your current solution. > >> Thanks @TheRealMDoerr . I didn't even consider atomic bitwise operations, but that's a good idea. I'm not in a hurry to push this, so if you could provide an atomic bitwise patch for ppc64, I would be happy to include it. In the mean time, I'm still investigating the ZGC regression. If I can figure it out, I might want to include a fix for ZGC in this PR as well. > > Not a review, just a drive-by comment. > We've had Atomic bitops for a while now. > Atomic::fetch_then_{and,or,xor}(ptr, bits [, order]) > Atomic::{and,or,xor}_then_fetch(ptr, bits [, order]) > They haven't been optimized for most (any?) platforms, being based on cmpxchg. > (See all the "Specialize atomic bitset functions for ..." related to > https://bugs.openjdk.org/browse/JDK-8293117.) > Thanks @kimbarrett. I see that the Atomic::fetch_then_XXX() implementation is very similar to what I came up with. The operation I'm doing is sometimes setting a single bit, so fetch_then_or() could be used, but sometimes the operation is setting the other 31 bits, so a new fetch_then_set_with_mask() would need to be added. Oh, yes, I see. This code is setting a bitfield. Yeah, that's not one of the logical atomic primitives, and seems unlikely to be added unless more use-cases can be found. 
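A minimal sketch of the "set a bitfield atomically" operation discussed above, written with standard C++ `std::atomic` rather than HotSpot's `Atomic` API. The helper name `fetch_then_set_with_mask` is taken from the thread, where it is only mentioned as something that would have to be added, so the signature and body here are assumptions. It uses a compare-and-exchange loop, the same general approach the thread says the existing bit operations are based on.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: atomically replaces the bits of `word` selected by
// `mask` with the corresponding bits of `bits`, leaving all other bits
// untouched. Returns the value `word` held before the update.
inline uint32_t fetch_then_set_with_mask(std::atomic<uint32_t>& word,
                                         uint32_t bits,
                                         uint32_t mask) {
    uint32_t old_value = word.load(std::memory_order_relaxed);
    uint32_t new_value;
    do {
        // Keep the bits outside the mask, take the bits inside it from `bits`.
        new_value = (old_value & ~mask) | (bits & mask);
        // On failure compare_exchange_weak reloads old_value, so the loop
        // recomputes new_value from the latest contents.
    } while (!word.compare_exchange_weak(old_value, new_value,
                                         std::memory_order_seq_cst,
                                         std::memory_order_relaxed));
    return old_value;
}
```

For the single-bit case mentioned in the thread, `std::atomic<uint32_t>::fetch_or` already covers it; the loop above is only needed when a multi-bit field has to be overwritten while its neighbours are preserved.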
------------- PR Comment: https://git.openjdk.org/jdk/pull/26399#issuecomment-3166743405 From fgao at openjdk.org Fri Aug 8 08:11:14 2025 From: fgao at openjdk.org (Fei Gao) Date: Fri, 8 Aug 2025 08:11:14 GMT Subject: RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD In-Reply-To: References: <8G8Djp2bD_zzy3Jm5R3oMoTSrWrbhbCMiIHC5Kl9ufY=.eaed94c1-416a-4c95-a36e-8ba8450d7db4@github.com> Message-ID: On Thu, 7 Aug 2025 16:49:31 GMT, Andrew Haley wrote: > > I've done some modelling using llvm-mca and it looks like `adrp; add` is a win on recent Apple processors as well as on Arm processors, so go ahead with making this the default. Thanks for testing that - really great to hear! I'll update the patch with a constraint for AOT cache shortly. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26653#issuecomment-3166943966 From duke at openjdk.org Fri Aug 8 08:24:15 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 08:24:15 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: <4T5YaeiJd_xlg_vaHnee8S4jgAoXuZYvq8bCigtLIJk=.0838e9fd-3b8d-47f7-9e99-320917fcfa0e@github.com> References: <4T5YaeiJd_xlg_vaHnee8S4jgAoXuZYvq8bCigtLIJk=.0838e9fd-3b8d-47f7-9e99-320917fcfa0e@github.com> Message-ID: On Fri, 8 Aug 2025 03:07:58 GMT, Quan Anh Mai wrote: >> Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: >> >> script. update > > src/utils/PrecompiledHeaders/PrecompiledHeaders.java line 73: > >> 71: paths.filter(Files::isRegularFile) >> 72: .filter(path -> { >> 73: String name = path.getFileName().toString(); > > IIUC this means that we count the number of cpp or hpp files that directly include a file, but should we count the number of cpp file that directly or transitively include a file instead? That would be a much more sensitive number indeed. I'll look into the solution proposed below by @kimbarrett.
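A rough sketch of the idea of deriving transitive include counts from the build's `.d` dependency files instead of scanning `#include` lines. It assumes the files use the usual Makefile dependency syntax emitted by GCC and Clang (`target: dep dep \` with backslash continuations), takes the file contents as strings rather than reading the build directory, and ignores paths containing escaped spaces. The function name `count_header_users` is hypothetical, and this is not the script under review.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: given the text of each .d file (one per object file),
// returns for every ".hpp" dependency the number of object files that depend
// on it, directly or transitively.
std::map<std::string, int> count_header_users(
        const std::vector<std::string>& dep_files) {
    std::map<std::string, int> counts;
    for (const std::string& text : dep_files) {
        std::set<std::string> seen;  // count each header once per object file
        std::istringstream in(text);
        std::string token;
        while (in >> token) {
            if (token == "\\") continue;        // line continuation marker
            if (token.back() == ':') continue;  // make target, not a dependency
            bool is_header = token.size() > 4 &&
                token.compare(token.size() - 4, 4, ".hpp") == 0;
            if (is_header) seen.insert(token);
        }
        for (const std::string& header : seen) ++counts[header];
    }
    return counts;
}
```

Headers whose count exceeds the chosen threshold would then be candidates for `precompiled.hpp`, still subject to the manual curation mentioned earlier in the thread.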
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2262278283 From shade at openjdk.org Fri Aug 8 08:40:11 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 8 Aug 2025 08:40:11 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: <25VDLj8CrrdVmydx-vou5S_-vxT_uqh5c0OVPubW7XM=.d515ad77-1b2c-4d26-ab78-554e3fe7b940@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> <25VDLj8CrrdVmydx-vou5S_-vxT_uqh5c0O VPubW7XM=.d515ad77-1b2c-4d26-ab78-554e3fe7b940@github.com> Message-ID: <_ESoV2tghLGuqSsmR8jYrdpoCpNpipdxtq878TT3vIc=.c21e74c2-2044-4b92-96b6-72e873302091@github.com> On Thu, 7 Aug 2025 23:37:44 GMT, David Holmes wrote: >> Though as I said elsewhere, although the spec does not require it, I would be very surprised if signal raising did not, in practice, "happen-before" signal delivery. > >> acquire/release does not achieve that it only says what happens if you see the new value of field > > Note I am describing the semantic model for acquire/release in a general sense - as we describe in orderAccess.hpp. An actual implementation may ensure memory operations actually complete before the `release` in a similar way to a `fence` (e.g. X86 `mfence`). > > A "fence" may only ensure ordering in its abstract description but that suffices, as the store to the field must happen before any of the stores related to raising the signal, which must happen before the signal can actually be delivered, which happens before we load the field. Hence by transitivity we are guaranteed to see the fields written value, by using the fence. 
As I said before, getting too deep into theory and how this case might be special is counter-productive here. Honestly, I do _not_ understand the aversion to the idea that a field that is accessed by several threads should be nominally wrapped with `Atomic`. (Also note that some fields in `VMError` already do this.) Whatever the current situation is, that might allow for a low-level naked fence, the situation can change. I strongly believe we should be erring on the side of caution and future-proofness, unless there are other concerns on the table, like performance. There are no other concerns here, AFAICS. If you want a `fence` after the store for whatever reason, that's fine, there is `Atomic::release_store_fence` that gives it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2262314155 From aph at openjdk.org Fri Aug 8 08:57:10 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 8 Aug 2025 08:57:10 GMT Subject: RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD In-Reply-To: References: <8G8Djp2bD_zzy3Jm5R3oMoTSrWrbhbCMiIHC5Kl9ufY=.eaed94c1-416a-4c95-a36e-8ba8450d7db4@github.com> Message-ID: On Fri, 8 Aug 2025 08:08:36 GMT, Fei Gao wrote: > > I've done some modelling using llvm-mca and it looks like `adrp; add` is a win on recent Apple processors as well as on Arm processors, so go ahead with making this the default. > > Thanks for testing that - really great to hear! I'll update the patch with a constraint for AOT cache shortly. Correction: I'm afraid that the llvm-mca results are nonsense. It says that this sequence movk w0, #0x1234, lsl 16 movk w1, #0x1234, lsl 16 movk w2, #0x1234, lsl 16 movk w3, #0x1234, lsl 16 movk w4, #0x1234, lsl 16 movk w5, #0x1234, lsl 16 movk w6, #0x1234, lsl 16 movk w7, #0x1234, lsl 16 takes 2 clock cycles on Apple M1, but Dougall Johnson measured real hardware executing this at 1 clock cycle. I'm not going to believe any more without numbers we can trust.
------------- PR Comment: https://git.openjdk.org/jdk/pull/26653#issuecomment-3167078839 From adinn at openjdk.org Fri Aug 8 09:12:15 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 8 Aug 2025 09:12:15 GMT Subject: RFR: 8364269: Simplify code cache API by storing adapter entry offsets in blob [v3] In-Reply-To: References: Message-ID: <3Ev99DymxsVlMZlKiLHWYChb2Lg74ryHHHGdQtfuK98=.4407433b-5f84-45e3-a7f2-34094ea1280b@github.com> On Thu, 7 Aug 2025 13:57:50 GMT, Andrew Dinn wrote: >> The AOT stub blob save and restore API handles Adapter blobs differently to other single-stub blobs by passing the associated entries in a separate auxiliary array. Storing the entry offsets in the blob, as happens with other generated blobs, simplifies the current save/restore API and implementation. It also makes it easier to define and implement an API extension supporting save and restore of multi-stub (StubGen) blobs. > > Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: > > propagate BufferBlob subclass sizes up constructor chain Thanks all for the reviews. Integrating now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26532#issuecomment-3167124598 From adinn at openjdk.org Fri Aug 8 09:15:22 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 8 Aug 2025 09:15:22 GMT Subject: Integrated: 8364269: Simplify code cache API by storing adapter entry offsets in blob In-Reply-To: References: Message-ID: On Tue, 29 Jul 2025 12:15:48 GMT, Andrew Dinn wrote: > The AOT stub blob save and restore API handles Adapter blobs differently to other single-stub blobs by passing the associated entries in a separate auxiliary array. Storing the entry offsets in the blob, as happens with other generated blobs, simplifies the current save/restore API and implementation. It also makes it easier to define and implement an API extension supporting save and restore of multi-stub (StubGen) blobs. 
This pull request has now been integrated. Changeset: 241808e1 Author: Andrew Dinn URL: https://git.openjdk.org/jdk/commit/241808e13fb032b0ec192e0b7ff94891a653ac94 Stats: 102 lines in 5 files changed: 30 ins; 42 del; 30 mod 8364269: Simplify code cache API by storing adapter entry offsets in blob Reviewed-by: kvn, shade, asmehra ------------- PR: https://git.openjdk.org/jdk/pull/26532 From duke at openjdk.org Fri Aug 8 10:30:12 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 10:30:12 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 04:54:32 GMT, David Holmes wrote: > > For example, when the threshold is set to 50 I get this compilation error: > > How can we possibly get a compilation error when everything should build with PCH disabled? Possibly some includes are inside namespaces or classes? I would have to check on this though. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3167372745 From mhaessig at openjdk.org Fri Aug 8 10:51:42 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Fri, 8 Aug 2025 10:51:42 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v5] In-Reply-To: References: Message-ID: <6gq4iIBw4RIqqPvmAf2MHnKrmYHwOdWdH1fz1bFaCGA=.57906956-460f-4a1d-9e3e-fbf91a7974e2@github.com> > This PR adds `-XX:CompileTaskTimeout` on Linux to limit the amount of time a compilation task can run. The goal of this is initially to be able to find and investigate long-running compilations. > > The timeout is implemented using a POSIX timer that sends a `SIGALRM` to the compiler thread the compile task is running on. Each compiler thread registers a signal handler that triggers an assert upon receiving `SIGALRM`.
This is currently only implemented for Linux, because it relies on `SIGEV_THREAD_ID` to get the signal delivered to the same thread that timed out.
>
> Since `SIGALRM` is now used, the test `runtime/signal/TestSigalrm.java` now requires `vm.flagless` so it will not interfere with the compiler thread signal handlers.
>
> Testing:
> - [x] Github Actions
> - [x] tier1, tier2 on all platforms
> - [ ] tier3, tier4 and Oracle internal testing on Linux fastdebug
> - [ ] tier1 through tier4 with `-XX:CompileTaskTimeout=60000` (one minute timeout) to see what fails (`compiler/codegen/TestAntiDependenciesHighMemUsage2.java`, `compiler/loopopts/TestMaxLoopOptsCountReached.java`, and `compiler/c2/TestScalarReplacementMaxLiveNodes.java` fail)

Manuel Hässig has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision:

 - Merge branch 'master' into JDK-8308094-timeout
 - Rename _timer
 - remove _timeout_armed
 - ASSERT
 - Merge branch 'master' into JDK-8308094-timeout
 - No acquire release semantics
 - Factor Linux specific timeout functionality out of share/
 - Move timeout disarm above if
 - Merge branch 'master' into JDK-8308094-timeout
 - Fix SIGALRM test
 - ...
and 1 more: https://git.openjdk.org/jdk/compare/80c8bd84...8bb5eb7a ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26023/files - new: https://git.openjdk.org/jdk/pull/26023/files/212afb4d..8bb5eb7a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26023&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26023&range=03-04 Stats: 4769 lines in 139 files changed: 3372 ins; 855 del; 542 mod Patch: https://git.openjdk.org/jdk/pull/26023.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26023/head:pull/26023 PR: https://git.openjdk.org/jdk/pull/26023 From duke at openjdk.org Fri Aug 8 11:15:46 2025 From: duke at openjdk.org (Ruben) Date: Fri, 8 Aug 2025 11:15:46 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 Message-ID: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. 
------------- Commit messages: - 8365047: Remove exception handler stub code in C2 Changes: https://git.openjdk.org/jdk/pull/26678/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26678&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8365047 Stats: 204 lines in 16 files changed: 19 ins; 180 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/26678.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26678/head:pull/26678 PR: https://git.openjdk.org/jdk/pull/26678 From duke at openjdk.org Fri Aug 8 11:27:53 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 11:27:53 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v4] In-Reply-To: References: Message-ID: <9F_mpCqxuTVGhOfdZn2M32xj7XEfOHlHg-m7yQB6c-A=.4ef5971f-fac0-442f-ac1e-ca3f6fefe922@github.com> > In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. > > These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. 
>
>
> linux-x64 GCC
> master       real   81.39  user 3352.15  sys 287.49
> JDK-8365053  real   81.94  user 3030.24  sys 295.82
>
> linux-x64 Clang
> master       real   43.44  user 2082.93  sys 130.70
> JDK-8365053  real   38.44  user 1723.80  sys 117.68
>
> linux-aarch64 GCC
> master       real 1188.08  user 2015.22  sys 175.53
> JDK-8365053  real 1019.85  user 1667.45  sys 171.86
>
> linux-aarch64 clang
> master       real  981.77  user 1645.05  sys 118.60
> JDK-8365053  real  791.96  user 1262.92  sys 101.50

Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: use .d files ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26681/files - new: https://git.openjdk.org/jdk/pull/26681/files/d6cd068b..5672a1b1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=02-03 Stats: 55 lines in 2 files changed: 16 ins; 18 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/26681.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26681/head:pull/26681 PR: https://git.openjdk.org/jdk/pull/26681 From duke at openjdk.org Fri Aug 8 11:36:04 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 11:36:04 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v5] In-Reply-To: References: Message-ID: > In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. > > These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds.
>
>
> linux-x64 GCC
> master       real   81.39  user 3352.15  sys 287.49
> JDK-8365053  real   81.94  user 3030.24  sys 295.82
>
> linux-x64 Clang
> master       real   43.44  user 2082.93  sys 130.70
> JDK-8365053  real   38.44  user 1723.80  sys 117.68
>
> linux-aarch64 GCC
> master       real 1188.08  user 2015.22  sys 175.53
> JDK-8365053  real 1019.85  user 1667.45  sys 171.86
>
> linux-aarch64 clang
> master       real  981.77  user 1645.05  sys 118.60
> JDK-8365053  real  791.96  user 1262.92  sys 101.50

Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: - nl - refresh. not needed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26681/files - new: https://git.openjdk.org/jdk/pull/26681/files/5672a1b1..5c8179d8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=03-04 Stats: 416 lines in 2 files changed: 408 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/26681.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26681/head:pull/26681 PR: https://git.openjdk.org/jdk/pull/26681 From duke at openjdk.org Fri Aug 8 11:39:12 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 11:39:12 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: References: Message-ID: <4hk98YWMw1Mf2CTBe2Yb0ZvGXDqoDl2MqMwKvfXbzPM=.09065989-dd8f-4bdc-a2f9-83b13d658173@github.com> On Fri, 8 Aug 2025 06:23:43 GMT, Kim Barrett wrote:

>> Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   script. update
>
> src/utils/PrecompiledHeaders/PrecompiledHeaders.java line 70:
>
>> 68: // Count inclusion times for each header
>> 69: Map occurrences = new HashMap<>();
>> 70: try (Stream paths = Files.walk(hotspotPath)) {
>
> I think walking the source tree is the wrong approach to gathering the data
> about include counts.
I think better is to do a build and then look at the > files /hotspot/variant-server/libjvm/objs/*.d. From that one can build > a completely accurate count of the transitive inclusions. I followed this hint, the latest version of the script checks the `.d` files. Two observations: - The magic number is now `2460` (more includes are taken into account, which makes sense) - There is much less wiggle room, 2461 includes nothing, 2459 includes too much - Runtime does not seem to be affected negatively or positively If anybody is interested, this file contains the inclusion count for each source file: [inclusions_count.txt](https://github.com/user-attachments/files/21682561/inclusions_count.txt) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2262707516 From aph at openjdk.org Fri Aug 8 11:40:13 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 8 Aug 2025 11:40:13 GMT Subject: RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD In-Reply-To: References: <8G8Djp2bD_zzy3Jm5R3oMoTSrWrbhbCMiIHC5Kl9ufY=.eaed94c1-416a-4c95-a36e-8ba8450d7db4@github.com> Message-ID: <3saG-GQybxPQeykQ-Dqu-HvcXpq5q4N51ay93V3kfPU=.f40f508e-1ec1-41a4-8afa-42180471b49c@github.com> On Fri, 8 Aug 2025 08:55:03 GMT, Andrew Haley wrote: > I'm not going to believe any more without numbers we can trust. Sorry, it's 6 movz/movk per cycle, which I just confirmed by measuring it myself. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26653#issuecomment-3167573763 From dholmes at openjdk.org Fri Aug 8 12:24:10 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 8 Aug 2025 12:24:10 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: <_ESoV2tghLGuqSsmR8jYrdpoCpNpipdxtq878TT3vIc=.c21e74c2-2044-4b92-96b6-72e873302091@github.com> References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> <25VDLj8CrrdVmydx-vou5S_-vxT_uqh5c0O VPubW7XM=.d515ad77-1b2c-4d26-ab78-554e3fe7b940@github.com> <_ESoV2tghLGuqSsmR8jYrdpoCpNpipdxtq878TT3vIc=.c21e74c2-2044-4b92-96b6-72e873302091@github.com> Message-ID: On Fri, 8 Aug 2025 08:37:13 GMT, Aleksey Shipilev wrote: >>> acquire/release does not achieve that it only says what happens if you see the new value of field >> >> Note I am describing the semantic model for acquire/release in a general sense - as we describe in orderAccess.hpp. An actual implementation may ensure memory operations actually complete before the `release` in a similar way to a `fence` (e.g. X86 `mfence`). >> >> A "fence" may only ensure ordering in its abstract description but that suffices, as the store to the field must happen before any of the stores related to raising the signal, which must happen before the signal can actually be delivered, which happens before we load the field. Hence by transitivity we are guaranteed to see the fields written value, by using the fence. > > As I said before, getting too deep into theory and how this case might be special is counter-productive here. 
>
> Honestly, I do _not_ understand the aversion to the idea that a field that is accessed by several threads should be nominally wrapped with `Atomic`. (Also note that some fields in `VMError` already do this.) Whatever the current situation is, that might allow for a low-level naked fence, the situation can change. I strongly believe we should be erring on the side of caution and future-proofness, unless there are other concerns on the table, like performance. There are no other concerns here, AFAICS. If you want a `fence` after the store for whatever reason, that's fine, there is `Atomic::release_store_fence` that gives it.

It is not the "atomic" that is the issue; it is the inappropriate and unnecessary use of acquire/release semantics. When people look at this code later they will wonder what it is that we are trying to coordinate - when the answer is "nothing". That just causes confusion. Synchronization and concurrency is hard enough without obfuscating things by sprinkling the wrong kind of synchronization constructs around "just to be safe". That isn't future-proofing things. If in the future we have different synchronization requirements then those requirements need to be understood and the correct code inserted. Maybe in the future we will need a mutex - who knows. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2262830006 From duke at openjdk.org Fri Aug 8 12:25:30 2025 From: duke at openjdk.org (Yuri Gaevsky) Date: Fri, 8 Aug 2025 12:25:30 GMT Subject: RFR: 8324124: RISC-V: implement _vectorizedMismatch intrinsic [v4] In-Reply-To: References: Message-ID:

> Hello All,
>
> Please review these changes to enable the `_vectorizedMismatch` intrinsic on the RISC-V platform with RVV instructions supported.
>
> Thank you,
> -Yuri Gaevsky
>
> **Correctness checks:**
> hotspot/jtreg/compiler/{intrinsic/c1/c2}/ under QEMU-8.1 with RVV v1.0.0 and -XX:TieredStopAtLevel=1/2/3/4.
Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: removed obsolete code. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17750/files - new: https://git.openjdk.org/jdk/pull/17750/files/0e1d7527..cfed55df Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17750&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17750&range=02-03 Stats: 5 lines in 1 file changed: 0 ins; 5 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17750.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17750/head:pull/17750 PR: https://git.openjdk.org/jdk/pull/17750 From kbarrett at openjdk.org Fri Aug 8 12:45:13 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 8 Aug 2025 12:45:13 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: <4hk98YWMw1Mf2CTBe2Yb0ZvGXDqoDl2MqMwKvfXbzPM=.09065989-dd8f-4bdc-a2f9-83b13d658173@github.com> References: <4hk98YWMw1Mf2CTBe2Yb0ZvGXDqoDl2MqMwKvfXbzPM=.09065989-dd8f-4bdc-a2f9-83b13d658173@github.com> Message-ID: <50Lh-a6pTujiDCyHwPIcC5zzgiu-zP-pAiQeyu4CJcs=.e74b0e74-b87f-45d4-b441-f2d3045fd101@github.com> On Fri, 8 Aug 2025 11:36:04 GMT, Francesco Andreuzzi wrote: >> src/utils/PrecompiledHeaders/PrecompiledHeaders.java line 70: >> >>> 68: // Count inclusion times for each header >>> 69: Map occurrences = new HashMap<>(); >>> 70: try (Stream paths = Files.walk(hotspotPath)) { >> >> I think walking the source tree is the wrong approach to gathering the data >> about include counts. I think better is to do a build and then look at the >> files /hotspot/variant-server/libjvm/objs/*.d. From that one can build >> a completely accurate count of the transitive inclusions. > > I followed this hint, the latest version of the script checks the `.d` files. 
>
> Two observations:
> - The magic number is now `2460` (more includes are taken into account, which makes sense)
> - There is much less wiggle room, 2461 includes nothing, 2459 includes too much
> - Runtime does not seem to be affected negatively or positively
>
> If anybody is interested, this file contains the inclusion count for each source file: [inclusions_count.txt](https://github.com/user-attachments/files/21682561/inclusions_count.txt)

Something seems wrong with these numbers.

- There are 538 headers with include counts of 2460.
- There are only 1121 .o.cmdline files.

So how do we get include counts that are greater (by more than a factor of 2) than the number of .d files?

Note that precompiled.hpp has an include count of 2456. There's 1 .d file that depends on everything - BUILD_LIBJVM.d. That probably ought to be excluded from counting, although I guess it should just add one to every file.

Also, precompiled.hpp has an include count of 2456, so not the maximum count, but close. It is itself weird that it wouldn't be the most included file. But it's hard to guess what might be going on with that until the surprisingly high include count (greater than the number of .d files) is understood.

That include count listing would also be more useful if sorted by count.
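As an illustration of the counting approach suggested above, here is a minimal sketch in Java. It is not the script from the PR: the class name `IncludeCountSketch`, the in-memory map of `.d` file contents, and the choice to count each header at most once per `.d` file are all assumptions made for this example. It skips `BUILD_LIBJVM.d` and returns the counts sorted in descending order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not the PrecompiledHeaders.java script.
final class IncludeCountSketch {

    // Returns the distinct headers listed as prerequisites in one .d file.
    static Set<String> headersIn(String dFileText) {
        Set<String> headers = new LinkedHashSet<>();
        // Join continuation lines, then split the rule on whitespace.
        for (String token : dFileText.replace("\\\n", " ").trim().split("\\s+")) {
            if (token.endsWith(":")) {
                continue; // make target such as "foo.o:", not a prerequisite
            }
            if (token.endsWith(".hpp") || token.endsWith(".h")) {
                headers.add(token);
            }
        }
        return headers;
    }

    // For every header, counts the number of .d files that mention it at least once.
    // Keys of dFiles are file names, values are the file contents.
    static Map<String, Integer> countPerFile(Map<String, String> dFiles) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, String> e : dFiles.entrySet()) {
            if (e.getKey().equals("BUILD_LIBJVM.d")) {
                continue; // aggregate file that depends on everything
            }
            for (String header : headersIn(e.getValue())) {
                counts.merge(header, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Orders entries by descending count, ties broken by header name.
    static List<Map.Entry<String, Integer>> sortedByCount(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>(counts.entrySet());
        out.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> dFiles = new HashMap<>();
        dFiles.put("a.d", "a.o: \\\n a.cpp \\\n code/codeCache.hpp utilities/waitBarrier.hpp\n");
        dFiles.put("b.d", "b.o: b.cpp code/codeCache.hpp \\\n code/codeCache.hpp\n");
        dFiles.put("BUILD_LIBJVM.d", "libjvm.so: code/codeCache.hpp code/codeCache.hpp\n");
        for (Map.Entry<String, Integer> e : sortedByCount(countPerFile(dFiles))) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

With per-file counting, no header can exceed the number of `.d` files scanned, which gives a quick sanity check on the totals. A real run would read the files from the build's `objs` directory instead of the in-memory map used here.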
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2262874254 From shade at openjdk.org Fri Aug 8 13:35:13 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 8 Aug 2025 13:35:13 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> <25VDLj8CrrdVmydx-vou5S_-vxT_uqh5c0O VPubW7XM=.d515ad77-1b2c-4d26-ab78-554e3fe7b940@github.com> <_ESoV2tghLGuqSsmR8jYrdpoCpNpipdxtq878TT3vIc=.c21e74c2-2044-4b92-96b6-72e873302091@github.com> Message-ID: On Fri, 8 Aug 2025 12:21:33 GMT, David Holmes wrote: >> As I said before, getting too deep into theory and how this case might be special is counter-productive here. >> >> Honestly, I do _not_ understand the aversion to the idea that a field that is accessed by several threads should be nominally wrapped with `Atomic`. (Also note that some fields in `VMError` already do this.) Whatever the current situation is, that might allow for a low-level naked fence, the situation can change. I strongly believe we should be erring on the side of caution and future-proofness, unless there are other concerns on the table, like performance. There is no other concerns here, AFAICS. If you want a `fence` after the store for whatever reason, that's fine, there is `Atomic::release_store_fence` that gives it. > > It is not the "atomic" that is the issue it is the inappropriate and unnecessary use of acquire/release semantics. When people look at this code later they will wonder what it is that we are trying to coordinate - when the answer is "nothing". That just causes confusion. 
Synchronization and concurrency is hard enough without obfuscating things by sprinkling the wrong kind of synchronization constructs around "just to be safe". That isn't future-proofing things. If in the future we have different synchronization requirements then those requirements need to be understood and the correct code inserted. Maybe in the future we will need a mutex - who knows.

You say "inappropriate and unnecessary", I say "conservative". I think this is an opinion stalemate, so I'll just recuse myself from this conversation, and let someone else break this tie.

There is no confusion to me: we pass something between the threads without clear additional synchronization in sight (i.e. mutexes) => we do `Atomic`-s. Deciding on the memory ordering for Atomics is: unless we see the need for performance, we default to conservative (acqrel at minimum for pointers, seqcst if we are paranoid and cannot guarantee it is one-sided transfer) ordering, to avoid future accidents.

If anything, a naked `fence()` raises many more questions for me in comparison with `Atomic` accesses. Mostly because fences are significantly more low-level than Atomics. When I look at this code from the perspective of a bystander, these are the questions that pop into my mind: Why is there only a fence on the write side? Shouldn't there be a fence on the reader side somewhere then? Maybe we are optimizing performance? Are we relying on some other (partial) synchronization? Are we stacking with some other fence? Are there arch-specific details about what fences actually do? Etc.
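For readers following the disagreement, the two styles under discussion can be contrasted in a small sketch. It is written in Java with `VarHandle` purely as an analogy; HotSpot's `Atomic` API is C++ and is not shown here. The class `PublishSketch` and its fields are hypothetical, and nothing below models signal delivery.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical analogy only; this is not HotSpot code.
final class PublishSketch {
    private int payload; // plain data written before publication
    private int flag;    // shared field, 0 = not published, 1 = published

    private static final VarHandle FLAG;
    static {
        try {
            FLAG = MethodHandles.lookup().findVarHandle(PublishSketch.class, "flag", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Style 1: the ordering is attached to the accesses themselves.
    void publishWithRelease(int value) {
        payload = value;
        FLAG.setRelease(this, 1); // earlier writes are ordered before this store
    }

    boolean isPublishedAcquire() {
        return (int) FLAG.getAcquire(this) == 1; // pairs with the release store
    }

    // Style 2: a plain store followed by a standalone full fence.
    void publishWithFence(int value) {
        payload = value;
        flag = 1;
        VarHandle.fullFence();
    }

    int payload() {
        return payload;
    }

    public static void main(String[] args) throws InterruptedException {
        PublishSketch s = new PublishSketch();
        Thread writer = new Thread(() -> s.publishWithRelease(42));
        writer.start();
        while (!s.isPublishedAcquire()) {
            Thread.onSpinWait(); // wait until the release store becomes visible
        }
        System.out.println(s.payload());
        writer.join();
    }
}
```

In style 1 the reader-side counterpart is visible in the code (`getAcquire`), which is the readability argument made above. In style 2 the reader has to know what other mechanism orders its load after the writer's fence, which is the point in dispute.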
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2262992787 From duke at openjdk.org Fri Aug 8 14:10:15 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 14:10:15 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: <50Lh-a6pTujiDCyHwPIcC5zzgiu-zP-pAiQeyu4CJcs=.e74b0e74-b87f-45d4-b441-f2d3045fd101@github.com> References: <4hk98YWMw1Mf2CTBe2Yb0ZvGXDqoDl2MqMwKvfXbzPM=.09065989-dd8f-4bdc-a2f9-83b13d658173@github.com> <50Lh-a6pTujiDCyHwPIcC5zzgiu-zP-pAiQeyu4CJcs=.e74b0e74-b87f-45d4-b441-f2d3045fd101@github.com> Message-ID: <5kr1ZhhyDjDuI87uSmB-Xh0rzIHq2tgnVTZXWNdHqGs=.e33307ca-794d-4419-a2b5-7ec2f6dd4c95@github.com> On Fri, 8 Aug 2025 12:42:40 GMT, Kim Barrett wrote: > Note that precompiled.hpp has an include count of 2456. There's 1 .d file that depends on everything - BUILD_LIBJVM.d. That probably ought to be excluded from counting, although I guess it should just add one to every file. Apparently `BUILD_LIBJVM.d` mentions all files multiple times: % grep "utilities/waitBarrier.hpp" build/clang/hotspot/variant-server/libjvm/objs/BUILD_LIBJVM.d | wc -l 1231 It's definitely reasonable to exclude it, I did it in the latest revision of the script. 
Now, the numbers seem more reasonable:

    % ls build/clang/hotspot/variant-server/libjvm/objs/*.d | grep -v BUILD | wc -l
    1232

    % grep "code/codeCache.hpp" build/clang/hotspot/variant-server/libjvm/objs/*.d | \
        grep -v BUILD | sort | uniq | wc -l
    1230

    % grep "code/codeCache.hpp" inclusions_count.txt
    code/codeCache.hpp=1230

Updated [inclusions_count.txt](https://github.com/user-attachments/files/21685761/inclusions_count.txt) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2263090433 From duke at openjdk.org Fri Aug 8 14:21:37 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 14:21:37 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v6] In-Reply-To: References: Message-ID: > In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. > > These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds.
>
>
> linux-x64 GCC
> master       real   81.39  user 3352.15  sys 287.49
> JDK-8365053  real   81.94  user 3030.24  sys 295.82
>
> linux-x64 Clang
> master       real   43.44  user 2082.93  sys 130.70
> JDK-8365053  real   38.44  user 1723.80  sys 117.68
>
> linux-aarch64 GCC
> master       real 1188.08  user 2015.22  sys 175.53
> JDK-8365053  real 1019.85  user 1667.45  sys 171.86
>
> linux-aarch64 clang
> master       real  981.77  user 1645.05  sys 118.60
> JDK-8365053  real  791.96  user 1262.92  sys 101.50

Francesco Andreuzzi has updated the pull request incrementally with five additional commits since the last revision: - refresh - refresh - exclude precompiled - nn - exclude BUILD, cpp, compiler specific ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26681/files - new: https://git.openjdk.org/jdk/pull/26681/files/5c8179d8..dba3a6aa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=04-05 Stats: 45 lines in 2 files changed: 28 ins; 10 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/26681.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26681/head:pull/26681 PR: https://git.openjdk.org/jdk/pull/26681 From duke at openjdk.org Fri Aug 8 14:36:00 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 14:36:00 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v7] In-Reply-To: References: Message-ID: > In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. > > These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds.
>
>
> linux-x64 GCC
> master       real   81.39  user 3352.15  sys 287.49
> JDK-8365053  real   81.94  user 3030.24  sys 295.82
>
> linux-x64 Clang
> master       real   43.44  user 2082.93  sys 130.70
> JDK-8365053  real   38.44  user 1723.80  sys 117.68
>
> linux-aarch64 GCC
> master       real 1188.08  user 2015.22  sys 175.53
> JDK-8365053  real 1019.85  user 1667.45  sys 171.86
>
> linux-aarch64 clang
> master       real  981.77  user 1645.05  sys 118.60
> JDK-8365053  real  791.96  user 1262.92  sys 101.50

Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: refresh ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26681/files - new: https://git.openjdk.org/jdk/pull/26681/files/dba3a6aa..3546dd18 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=05-06 Stats: 20 lines in 1 file changed: 0 ins; 20 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26681.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26681/head:pull/26681 PR: https://git.openjdk.org/jdk/pull/26681 From duke at openjdk.org Fri Aug 8 14:50:01 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 14:50:01 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v8] In-Reply-To: References: Message-ID: > In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. > > These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds.
>
>
> linux-x64 GCC
> master       real   81.39  user 3352.15  sys 287.49
> JDK-8365053  real   81.94  user 3030.24  sys 295.82
>
> linux-x64 Clang
> master       real   43.44  user 2082.93  sys 130.70
> JDK-8365053  real   38.44  user 1723.80  sys 117.68
>
> linux-aarch64 GCC
> master       real 1188.08  user 2015.22  sys 175.53
> JDK-8365053  real 1019.85  user 1667.45  sys 171.86
>
> linux-aarch64 clang
> master       real  981.77  user 1645.05  sys 118.60
> JDK-8365053  real  791.96  user 1262.92  sys 101.50

Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: - two times might be too much - ops ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26681/files - new: https://git.openjdk.org/jdk/pull/26681/files/3546dd18..71950aed Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=06-07 Stats: 6 lines in 2 files changed: 1 ins; 3 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/26681.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26681/head:pull/26681 PR: https://git.openjdk.org/jdk/pull/26681 From ayang at openjdk.org Fri Aug 8 14:57:20 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 8 Aug 2025 14:57:20 GMT Subject: RFR: 8346005: Parallel: Incorrect page size calculation with UseLargePages Message-ID: Refactor the heap-space and OS memory interface code to clearly separate two related but distinct concepts: `alignment` and `os-page-size`. These are now represented as two fields in `PSVirtualSpace`. The parallel heap consists of four spaces: old, eden, from, and to. The first belongs to the old generation, while the latter three belong to the young generation. The size of any space is always aligned to `alignment`, which also determines the unit for resizing. To keep the implementation simple while allowing flexible per-space commit and uncommit operations, each space must contain at least one OS page.
As a result, `alignment` is always greater than or equal to `os-page-size`. When using explicit large pages -- which require pre-allocating large pages before the VM starts -- the actual OS page size is not known until the heap has been reserved. The additional logic in `ParallelScavengeHeap::initialize` detects the OS page size in use and adjusts `alignment` if necessary. Test: tier1-8 ------------- Commit messages: - pgc-largepage Changes: https://git.openjdk.org/jdk/pull/26700/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26700&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8346005 Stats: 230 lines in 15 files changed: 82 ins; 91 del; 57 mod Patch: https://git.openjdk.org/jdk/pull/26700.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26700/head:pull/26700 PR: https://git.openjdk.org/jdk/pull/26700 From qamai at openjdk.org Fri Aug 8 15:05:14 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 8 Aug 2025 15:05:14 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: <5kr1ZhhyDjDuI87uSmB-Xh0rzIHq2tgnVTZXWNdHqGs=.e33307ca-794d-4419-a2b5-7ec2f6dd4c95@github.com> References: <4hk98YWMw1Mf2CTBe2Yb0ZvGXDqoDl2MqMwKvfXbzPM=.09065989-dd8f-4bdc-a2f9-83b13d658173@github.com> <50Lh-a6pTujiDCyHwPIcC5zzgiu-zP-pAiQeyu4CJcs=.e74b0e74-b87f-45d4-b441-f2d3045fd101@github.com> <5kr1ZhhyDjDuI87uSmB-Xh0rzIHq2tgnVTZXWNdHqGs=.e33307ca-794d-4419-a2b5-7ec2f6dd4c95@github.com> Message-ID: On Fri, 8 Aug 2025 14:08:02 GMT, Francesco Andreuzzi wrote: >> Something seems wrong with these numbers. >> >> - There are 538 headers with include counts of 2460. >> - There are only 1121 .o.cmdline files. >> >> So how do we get include counts that are greater (by more than a factor of 2) >> than the number of .d files? >> >> Note that precompiled.hpp has an include count of 2456. There's 1 .d file that >> depends on everything - BUILD_LIBJVM.d.
That probably ought to be excluded >> from counting, although I guess it should just add one to every file. >> >> Also, precompiled.hpp has an include count of 2456, so not the maximum count, >> but close. Which is itself weird that it wouldn't be the most included file. >> But it's hard to guess what might be going on with that until the surprising >> high include count > number of .d files is understood. >> >> That include count listing would also be more useful if sorted by count. > >> Note that precompiled.hpp has an include count of 2456. There's 1 .d file that > depends on everything - BUILD_LIBJVM.d. That probably ought to be excluded > from counting, although I guess it should just add one to every file. > > Apparently `BUILD_LIBJVM.d` mentions all files multiple times: > > % grep "utilities/waitBarrier.hpp" build/clang/hotspot/variant-server/libjvm/objs/BUILD_LIBJVM.d | wc -l > 1231 > > > It's definitely reasonable to exclude it, I did it in the latest revision of the script. > > Now, the numbers seem more reasonable: > > % ls build/clang/hotspot/variant-server/libjvm/objs/*.d | grep -v BUILD | wc -l > 1232 > > % grep "code/codeCache.hpp" build/clang/hotspot/variant-server/libjvm/objs/*.d | \ > grep -v BUILD | sort | uniq | wc -l > 1230 > > % grep "code/codeCache.hpp" inclusions_count.txt > code/codeCache.hpp=1230 > > > Updated [inclusions_count.txt](https://github.com/user-attachments/files/21685761/inclusions_count.txt) I assume this means that other `.d` files might include a header multiple times, too. Is it reasonable to count only the number of files containing a header instead of the number of appearances?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2263225465 From duke at openjdk.org Fri Aug 8 15:20:15 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 15:20:15 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: References: <4hk98YWMw1Mf2CTBe2Yb0ZvGXDqoDl2MqMwKvfXbzPM=.09065989-dd8f-4bdc-a2f9-83b13d658173@github.com> <50Lh-a6pTujiDCyHwPIcC5zzgiu-zP-pAiQeyu4CJcs=.e74b0e74-b87f-45d4-b441-f2d3045fd101@github.com> <5kr1ZhhyDjDuI87uSmB-Xh0rzIHq2tgnVTZXWNdHqGs=.e33307ca-794d-4419-a2b5-7ec2f6dd4c95@github.com> Message-ID: On Fri, 8 Aug 2025 15:02:56 GMT, Quan Anh Mai wrote: > I assume this means that other .d files might include a header multiple times, too. From the following two numbers, I deduce `code/codeCache.hpp` is (transitively) included in 1230 files, with each inclusion coming from a unique `.d` file. Am I missing something? % grep "code/codeCache.hpp" build/clang/hotspot/variant-server/libjvm/objs/*.d | grep -v BUILD | sort | wc -l 1230 % grep "code/codeCache.hpp" build/clang/hotspot/variant-server/libjvm/objs/*.d | grep -v BUILD | sort | uniq | wc -l 1230 I haven't checked every single file, but I'd expect each include to appear at most once in each `.d` file.
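For illustration only, the counting rule being discussed (count the number of `.d` files that mention a header, not the number of mentions, and leave `BUILD_LIBJVM.d` out) could be sketched roughly as below. This is not the script from the PR: the class name `InclusionCount`, the method `countFilesPerHeader`, the in-memory map standing in for the `.d` files, and the sample entries are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class InclusionCount {
    // Hypothetical helper: for each header, count how many dependency files
    // mention it at least once. The input maps a .d file name to the header
    // paths listed in it (an assumption standing in for reading real files).
    static Map<String, Integer> countFilesPerHeader(Map<String, List<String>> depFiles) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : depFiles.entrySet()) {
            // Skip aggregate files such as BUILD_LIBJVM.d, which list every header.
            if (entry.getKey().startsWith("BUILD_")) {
                continue;
            }
            // Deduplicate within one file so repeated mentions count once.
            Set<String> distinctHeaders = new HashSet<>(entry.getValue());
            for (String header : distinctHeaders) {
                counts.merge(header, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from a real build.
        Map<String, List<String>> deps = Map.of(
            "a.d", List.of("code/codeCache.hpp", "code/codeCache.hpp", "utilities/waitBarrier.hpp"),
            "b.d", List.of("code/codeCache.hpp"),
            "BUILD_LIBJVM.d", List.of("code/codeCache.hpp", "utilities/waitBarrier.hpp"));
        System.out.println(countFilesPerHeader(deps));
    }
}
```

With per-file deduplication the result is the same whether or not a `.d` file repeats a header, which is the property the two `grep` counts above are checking.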
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2263257970 From qamai at openjdk.org Fri Aug 8 15:27:13 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 8 Aug 2025 15:27:13 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: References: <4hk98YWMw1Mf2CTBe2Yb0ZvGXDqoDl2MqMwKvfXbzPM=.09065989-dd8f-4bdc-a2f9-83b13d658173@github.com> <50Lh-a6pTujiDCyHwPIcC5zzgiu-zP-pAiQeyu4CJcs=.e74b0e74-b87f-45d4-b441-f2d3045fd101@github.com> <5kr1ZhhyDjDuI87uSmB-Xh0rzIHq2tgnVTZXWNdHqGs=.e33307ca-794d-4419-a2b5-7ec2f6dd4c95@github.com> Message-ID: On Fri, 8 Aug 2025 15:17:06 GMT, Francesco Andreuzzi wrote: >> I assume this means that other `.d` files might include a header multiple times, too. Is it reasonable to count only the number of files containing a header instead of number of appearances? > >> I assume this means that other .d files might include a header multiple times, too. > > From the following two numbers, I deduce `code/codeCache.hpp` is (transitively) included in 1230 files, with each inclusion coming from a unique `.d` file. Am I missing something? > > > % grep "code/codeCache.hpp" build/clang/hotspot/variant-server/libjvm/objs/*.d | grep -v BUILD | sort | wc -l > 1230 > > % grep "code/codeCache.hpp" build/clang/hotspot/variant-server/libjvm/objs/*.d | grep -v BUILD | sort | uniq | wc -l > 1230 > > > I haven't checked every single file, but I'd expect each include to appear at most once in each `.d` file. Thanks for the clarification, I think that it is fine, then, although a comment explaining in the code would be great. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2263272948 From qamai at openjdk.org Fri Aug 8 15:35:15 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 8 Aug 2025 15:35:15 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: References: Message-ID: <7fGzfcVu-qJat5d53jOHN6hJmLKT26IKA_Q6Z9LySVc=.8225de02-89b8-4523-9aaf-9a402a2d7730@github.com> On Fri, 8 Aug 2025 06:30:11 GMT, Kim Barrett wrote: >> Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: >> >> script. update > > src/hotspot/share/precompiled/precompiled.hpp line 29: > >> 27: >> 28: // These header files are included in at least 130 C++ files, as of >> 29: // measurements made in August 2025. > > I don't think there's anything particularly special about the number 130. > > Another thing to consider when a header file has a high include count is > whether it's being overincluded. We've had lots of those, and some folks > occasionally try to poke at that problem. Some of the removals here look like > they might be a result of such efforts. > > Still another thing to consider is the cost of inclusion. Some files may just > be a lot more expensive to process and benefit more for being precompiled. > File size can be an indicator, but there are others. Unfortunately, I don't > know of a good way to measure this. You need to modify the comment here, too. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2263274537 From duke at openjdk.org Fri Aug 8 15:35:17 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 15:35:17 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v8] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 15:28:41 GMT, Quan Anh Mai wrote: >> Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: >> >> - two times might be too much >> - ops > > src/utils/PrecompiledHeaders/PrecompiledHeaders.java line 85: > >> 83: try { >> 84: // The first line contains the object name >> 85: return Files.lines(file).skip(1); > > Or maybe `return Files.lines(file).skip(1).distinct()`? `distinct()` has some overhead, do we need it? Do we expect duplicate headers in a file? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2263292750 From duke at openjdk.org Fri Aug 8 15:35:16 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 15:35:16 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: <7fGzfcVu-qJat5d53jOHN6hJmLKT26IKA_Q6Z9LySVc=.8225de02-89b8-4523-9aaf-9a402a2d7730@github.com> References: <7fGzfcVu-qJat5d53jOHN6hJmLKT26IKA_Q6Z9LySVc=.8225de02-89b8-4523-9aaf-9a402a2d7730@github.com> Message-ID: On Fri, 8 Aug 2025 15:25:15 GMT, Quan Anh Mai wrote: >> src/hotspot/share/precompiled/precompiled.hpp line 29: >> >>> 27: >>> 28: // These header files are included in at least 130 C++ files, as of >>> 29: // measurements made in August 2025. >> >> I don't think there's anything particularly special about the number 130. >> >> Another thing to consider when a header file has a high include count is >> whether it's being overincluded. We've had lots of those, and some folks >> occasionally try to poke at that problem. 
Some of the removals here look like >> they might be a result of such efforts. >> >> Still another thing to consider is the cost of inclusion. Some files may just >> be a lot more expensive to process and benefit more for being precompiled. >> File size can be an indicator, but there are others. Unfortunately, I don't >> know of a good way to measure this. > > You need to modify the comment here, too. Right, I'll wait just a bit so the discussion on how to approach the problem stabilizes :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2263289968 From qamai at openjdk.org Fri Aug 8 15:35:13 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 8 Aug 2025 15:35:13 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v8] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 14:50:01 GMT, Francesco Andreuzzi wrote: >> In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. >> >> These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. 
>> >> >> linux-x64 GCC >> master real 81.39 user 3352.15 sys 287.49 >> JDK-8365053 real 81.94 user 3030.24 sys 295.82 >> >> linux-x64 Clang >> master real 43.44 user 2082.93 sys 130.70 >> JDK-8365053 real 38.44 user 1723.80 sys 117.68 >> >> linux-aarch64 GCC >> master real 1188.08 user 2015.22 sys 175.53 >> JDK-8365053 real 1019.85 user 1667.45 sys 171.86 >> >> linux-aarch64 clang >> master real 981.77 user 1645.05 sys 118.60 >> JDK-8365053 real 791.96 user 1262.92 sys 101.50 > > Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: > > - two times might be too much > - ops Marked as reviewed by qamai (Committer). src/utils/PrecompiledHeaders/PrecompiledHeaders.java line 85: > 83: try { > 84: // The first line contains the object name > 85: return Files.lines(file).skip(1); Or maybe `return Files.lines(file).skip(1).distinct()`? ------------- PR Review: https://git.openjdk.org/jdk/pull/26681#pullrequestreview-3101235921 PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2263282801 From qamai at openjdk.org Fri Aug 8 16:31:15 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 8 Aug 2025 16:31:15 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v8] In-Reply-To: References: Message-ID: <57LBOn04BwPwiJdf_UUqgud4h9qhj38ga2bT56akoIc=.6dcf1641-1d82-46a6-b157-1fb7a33b2342@github.com> On Fri, 8 Aug 2025 15:32:32 GMT, Francesco Andreuzzi wrote: >> src/utils/PrecompiledHeaders/PrecompiledHeaders.java line 85: >> >>> 83: try { >>> 84: // The first line contains the object name >>> 85: return Files.lines(file).skip(1); >> >> Or maybe `return Files.lines(file).skip(1).distinct()`? > > `distinct()` has some overhead, do we need it? Do we expect duplicate headers in a file? It's not like stream is the most performant method to do anything anyway. 
I think the clarification is worth the additional overhead, especially when this file is not run very frequently. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2263469891 From erikj at openjdk.org Fri Aug 8 17:34:13 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Fri, 8 Aug 2025 17:34:13 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v8] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 14:50:01 GMT, Francesco Andreuzzi wrote: >> In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. >> >> These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. >> >> >> linux-x64 GCC >> master real 81.39 user 3352.15 sys 287.49 >> JDK-8365053 real 81.94 user 3030.24 sys 295.82 >> >> linux-x64 Clang >> master real 43.44 user 2082.93 sys 130.70 >> JDK-8365053 real 38.44 user 1723.80 sys 117.68 >> >> linux-aarch64 GCC >> master real 1188.08 user 2015.22 sys 175.53 >> JDK-8365053 real 1019.85 user 1667.45 sys 171.86 >> >> linux-aarch64 clang >> master real 981.77 user 1645.05 sys 118.60 >> JDK-8365053 real 791.96 user 1262.92 sys 101.50 > > Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: > > - two times might be too much > - ops The latest version fails for me. I'm guessing because we exclude shenandoah by default for OracleJDK builds so having those headers in precompiled.hpp messes things up. I think we need to be careful with headers that are part of optional features. 
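To make the concern about optional features concrete, here is a rough sketch of a selection step that applies a minimum include count and also skips headers under directories belonging to optional features. It is an assumption-laden example, not the logic in `PrecompiledHeaders.java`: the class `HeaderSelection`, the method `select`, the prefix list, the threshold value, and all counts except the 1230 quoted earlier in the thread are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HeaderSelection {
    // Hypothetical list of source directories that belong to optional JVM
    // features and may be excluded from some builds.
    static final List<String> OPTIONAL_PREFIXES = List.of("gc/shenandoah/");

    // Returns the headers (sorted by path) whose include count reaches
    // minCount and that do not live under an optional-feature directory.
    static List<String> select(Map<String, Integer> counts, int minCount) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : new TreeMap<>(counts).entrySet()) {
            String header = entry.getKey();
            boolean optional = OPTIONAL_PREFIXES.stream().anyMatch(p -> header.startsWith(p));
            if (entry.getValue() >= minCount && !optional) {
                selected.add(header);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Illustrative counts only; the threshold is an arbitrary choice here.
        Map<String, Integer> counts = Map.of(
            "code/codeCache.hpp", 1230,
            "gc/shenandoah/shenandoahHeap.hpp", 900,
            "utilities/waitBarrier.hpp", 12);
        System.out.println(select(counts, 400));
    }
}
```

A prefix list is the simplest possible filter; a real solution would more likely key off the build's feature configuration, which this sketch does not attempt to model.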
------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3168810638 From dlong at openjdk.org Fri Aug 8 18:48:13 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 8 Aug 2025 18:48:13 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations [v5] In-Reply-To: <6gq4iIBw4RIqqPvmAf2MHnKrmYHwOdWdH1fz1bFaCGA=.57906956-460f-4a1d-9e3e-fbf91a7974e2@github.com> References: <6gq4iIBw4RIqqPvmAf2MHnKrmYHwOdWdH1fz1bFaCGA=.57906956-460f-4a1d-9e3e-fbf91a7974e2@github.com> Message-ID: On Fri, 8 Aug 2025 10:51:42 GMT, Manuel Hässig wrote: >> This PR adds `-XX:CompileTaskTimeout` on Linux to limit the amount of time a compilation task can run. The goal of this is initially to be able to find and investigate long-running compilations. >> >> The timeout is implemented using a POSIX timer that sends a `SIGALRM` to the compiler thread the compile task is running on. Each compiler thread registers a signal handler that triggers an assert upon receiving `SIGALRM`. This is currently only implemented for Linux, because it relies on `SIGEV_THREAD_ID` to get the signal delivered to the same thread that timed out. >> >> Since `SIGALRM` is now used, the test `runtime/signal/TestSigalrm.java` now requires `vm.flagless` so it will not interfere with the compiler thread signal handlers. >> >> Testing: >> - [x] Github Actions >> - [x] tier1, tier2 on all platforms >> - [x] tier3, tier4 and Oracle internal testing on Linux fastdebug >> - [x] tier1 through tier4 with `-XX:CompileTaskTimeout=60000` (one minute timeout) to see what fails (`compiler/codegen/TestAntiDependenciesHighMemUsage2.java`, `compiler/loopopts/TestMaxLoopOptsCountReached.java`, and `compiler/c2/TestScalarReplacementMaxLiveNodes.java` fail) > > Manuel Hässig has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains 11 additional commits since the last revision: > > - Merge branch 'master' into JDK-8308094-timeout > - Rename _timer > - remove _timeout_armed > - ASSERT > - Merge branch 'master' into JDK-8308094-timeout > - No acquire release semantics > - Factor Linux specific timeout functionality out of share/ > - Move timeout disarm above if > - Merge branch 'master' into JDK-8308094-timeout > - Fix SIGALRM test > - ... and 1 more: https://git.openjdk.org/jdk/compare/2eac5347...8bb5eb7a Marked as reviewed by dlong (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26023#pullrequestreview-3101863176 From liach at openjdk.org Fri Aug 8 18:54:13 2025 From: liach at openjdk.org (Chen Liang) Date: Fri, 8 Aug 2025 18:54:13 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v4] In-Reply-To: <7f8HeMWpFC1Y1JWDmdi5lMU0kgT9z-yQFLDOXuSUucU=.40edcd18-9311-41a1-b007-81bf4a11bd7d@github.com> References: <7f8HeMWpFC1Y1JWDmdi5lMU0kgT9z-yQFLDOXuSUucU=.40edcd18-9311-41a1-b007-81bf4a11bd7d@github.com> Message-ID: On Tue, 5 Aug 2025 14:11:06 GMT, Roger Riggs wrote: >> Chen Liang has updated the pull request incrementally with one additional commit since the last revision: >> >> Update NPE per roger review > > src/java.base/share/classes/java/lang/NullPointerException.java line 142: > >> 140: } >> 141: >> 142: // Methods below must be called in object monitor > > This kind of warning should be on every method declaration, if separated from the method doc, it can be overlooked by future maintainers. Done. 
> src/java.base/share/classes/java/lang/NullPointerException.java line 146: > >> 144: private void ensureMessageComputed() { >> 145: if ((extendedMessageState & (MESSAGE_COMPUTED | CONSTRUCTOR_FINISHED)) == CONSTRUCTOR_FINISHED) { >> 146: int stackOffset = (extendedMessageState & STACK_OFFSET_MASK) >> STACK_OFFSET_SHIFT; > > This would be more readable if private utility methods were added that encapsulated the shift and masking, both for extraction and construction. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2263783470 PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2263782482 From liach at openjdk.org Fri Aug 8 18:54:14 2025 From: liach at openjdk.org (Chen Liang) Date: Fri, 8 Aug 2025 18:54:14 GMT Subject: RFR: 8364588: Export the NPE backtracking functionality to general null-checking APIs [v4] In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 10:14:07 GMT, Johan Sjölen wrote: >> Chen Liang has updated the pull request incrementally with one additional commit since the last revision: >> >> Update NPE per roger review > > src/java.base/share/classes/java/lang/NullPointerException.java line 159: > >> 157: /// configurations to trace how a particular argument, which turns out to >> 158: /// be `null`, was evaluated. >> 159: /// 2. `searchSlot < 0`, stack offset is 0 (a call to the nullary constructor) > > Can `searchSlot == -1` here? I think the code handles all < 0 cases so using that is fine. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26600#discussion_r2263781125 From dcubed at openjdk.org Fri Aug 8 18:58:12 2025 From: dcubed at openjdk.org (Daniel D.
Daugherty) Date: Fri, 8 Aug 2025 18:58:12 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: References: <4hk98YWMw1Mf2CTBe2Yb0ZvGXDqoDl2MqMwKvfXbzPM=.09065989-dd8f-4bdc-a2f9-83b13d658173@github.com> <50Lh-a6pTujiDCyHwPIcC5zzgiu-zP-pAiQeyu4CJcs=.e74b0e74-b87f-45d4-b441-f2d3045fd101@github.com> <5kr1ZhhyDjDuI87uSmB-Xh0rzIHq2tgnVTZXWNdHqGs=.e33307ca-794d-4419-a2b5-7ec2f6dd4c95@github.com> Message-ID: On Fri, 8 Aug 2025 15:24:46 GMT, Quan Anh Mai wrote: >>> I assume this means that other .d files might include a header multiple times, too. >> >> From the following two numbers, I deduce `code/codeCache.hpp` is (transitively) included in 1230 files, with each inclusion coming from a unique `.d` file. Am I missing something? >> >> >> % grep "code/codeCache.hpp" build/clang/hotspot/variant-server/libjvm/objs/*.d | grep -v BUILD | sort | wc -l >> 1230 >> >> % grep "code/codeCache.hpp" build/clang/hotspot/variant-server/libjvm/objs/*.d | grep -v BUILD | sort | uniq | wc -l >> 1230 >> >> >> I haven't checked every single file, but I'd expect each include to appear at most once in each `.d` file. > > Thanks for the clarification, I think that it is fine, then, although a comment explaining in the code would be great. I suspect that the multiple includes reported in the .d files are accurate. The .d file appears to report the number of times we `#include` a file, however, because of the include-guard constructs we use, the content is pulled in only once.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2263788850 From pchilanomate at openjdk.org Fri Aug 8 19:12:19 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 8 Aug 2025 19:12:19 GMT Subject: RFR: 8359222: [asan] jvmti/vthread/ToggleNotifyJvmtiTest/ToggleNotifyJvmtiTest triggers stack-buffer-overflow error [v2] In-Reply-To: References: Message-ID: <2QBp85PQqypHeK4V4VbA1lZH9W3vRj_0wPfnX-Kec5c=.80f47cfc-2bd0-4c2b-babe-d0e30ea6a6b1@github.com> > Please review the following small change. On linux-aarch64, ASan reports an overflow error on a read that occurs when frames are copied from the stack to the heap -- specifically during preemption on monitorenter and Object.wait when coming from compiled code. This read is benign and comes from reading either all (freeze fast path case) or part (freeze slow path case) of the `frame::metadata_words_at_bottom` stored in the callee frame (return pc+fp) for the top frame. In these preemption cases the callee frame is the frame of the VM native method being called (e.g. `Runtime1::monitorenter`). Now, the Arm 64-bit ABI doesn't specify a location where the frame record (returnpc+fp) has to be stored within a stack frame and GCC currently chooses to save it at the top of the frame (lowest address). So this causes ASan to treat the memory access in the callee as an overflow access to one of the locals stored in that frame. > > Accessing the callee to read `frame::metadata_words_at_bottom` is only necessary when the top frame is compiled. In that case, the callee is `Continuation.doYield` and those words contain the return pc and fp of the top frame we are freezing, where possibly fp might contain an oop. For the preemption cases mentioned above though, the return pc used for the top frame is always the one from the anchor, not the stack. Also, there is never a callee saved value in fp that needs to be preserved, the fp is always constructed again when thawing the frame.
But because of sharing the code path with the compiled frame case we also read these words. I've added the full details on where these memory accesses occur in the freeze code, both slow and fast paths, in the bug comments. > > The fix is to adjust these shared paths to avoid the read in the preemption case. I was able to reproduce the issue and have verified it is now fixed. I also ran the patch through mach5 tiers1-7. > > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: address David's comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26660/files - new: https://git.openjdk.org/jdk/pull/26660/files/1e098c54..a000a06b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26660&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26660&range=00-01 Stats: 6 lines in 3 files changed: 2 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/26660.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26660/head:pull/26660 PR: https://git.openjdk.org/jdk/pull/26660 From pchilanomate at openjdk.org Fri Aug 8 19:19:12 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 8 Aug 2025 19:19:12 GMT Subject: RFR: 8359222: [asan] jvmti/vthread/ToggleNotifyJvmtiTest/ToggleNotifyJvmtiTest triggers stack-buffer-overflow error [v2] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 04:36:42 GMT, David Holmes wrote: > Seems reasonable - though I dislike that we have to add ASAN and CPU specific code this way. > Initially I added it to avoid the extra branch in this fast path except when using ASan builds. I checked and the compiler is smart enough to avoid that and just generates an extra load of `_preempt` and arithmetic instructions that use a register instead of a constant. So maybe we could make it unconditional. But I thought the macros also make it clear this is only needed for this type of build.
> src/hotspot/cpu/aarch64/continuationFreezeThaw_aarch64.inline.hpp line 106: > >> 104: // For a compiled frame we need to re-read fp out of the frame because it may be an >> 105: // oop and we might have had a safepoint in finalize_freeze, after constructing f. >> 106: // For stub/native frames the value is not used while freezed, and will be constructed > > Suggestion: > > // For stub/native frames the value is not used while frozen, and will be constructed Fixed. > src/hotspot/cpu/x86/continuationFreezeThaw_x86.inline.hpp line 103: > >> 101: // For a compiled frame we need to re-read fp out of the frame because it may be an >> 102: // oop and we might have had a safepoint in finalize_freeze, after constructing f. >> 103: // For stub/native frames the value is not used while freezed, and will be constructed > > Suggestion: > > // For stub/native frames the value is not used while frozen, and will be constructed Fixed. > src/hotspot/share/runtime/continuationFreezeThaw.cpp line 796: > >> 794: // frame (lowest address). ASan treats this memory access in the callee as >> 795: // an overflow access to one of the locals stored in that frame. For these >> 796: // preemption case we don't need to read these words anyways so we avoid it. > > Suggestion: > > // preemption cases we don't need to read these words anyways so we avoid it. Fixed. > src/hotspot/share/runtime/continuationFreezeThaw.cpp line 797: > >> 795: // an overflow access to one of the locals stored in that frame. For these >> 796: // preemption case we don't need to read these words anyways so we avoid it. >> 797: adjust -= _preempt ? frame::metadata_words_at_bottom : 0; > > Suggestion: > > if (_preempt) { > adjust = 0; > } Done. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26660#issuecomment-3169053521 PR Review Comment: https://git.openjdk.org/jdk/pull/26660#discussion_r2263820051 PR Review Comment: https://git.openjdk.org/jdk/pull/26660#discussion_r2263818906 PR Review Comment: https://git.openjdk.org/jdk/pull/26660#discussion_r2263819043 PR Review Comment: https://git.openjdk.org/jdk/pull/26660#discussion_r2263819286 From pchilanomate at openjdk.org Fri Aug 8 19:19:13 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 8 Aug 2025 19:19:13 GMT Subject: RFR: 8359222: [asan] jvmti/vthread/ToggleNotifyJvmtiTest/ToggleNotifyJvmtiTest triggers stack-buffer-overflow error [v2] In-Reply-To: <2QBp85PQqypHeK4V4VbA1lZH9W3vRj_0wPfnX-Kec5c=.80f47cfc-2bd0-4c2b-babe-d0e30ea6a6b1@github.com> References: <2QBp85PQqypHeK4V4VbA1lZH9W3vRj_0wPfnX-Kec5c=.80f47cfc-2bd0-4c2b-babe-d0e30ea6a6b1@github.com> Message-ID: On Fri, 8 Aug 2025 19:12:19 GMT, Patricio Chilano Mateo wrote: >> Please review the following small change. On linux-aarch64, ASan reports an overflow error on a read that occurs when frames are copied from the stack to the heap -- specifically during preemption on monitorenter and Object.wait when coming from compiled code. This read is benign and comes from reading either all (freeze fast path case) or part (freeze slow path case) of the `frame::metadata_words_at_bottom` stored in the callee frame (return pc+fp) for the top frame. In these preemption cases the callee frame is the frame of the VM native method being called (e.g. `Runtime1::monitorenter`). Now, the Arm 64-bit ABI doesn't specify a location where the frame record (returnpc+fp) has to be stored within a stack frame and GCC currently chooses to save it at the top of the frame (lowest address). So this causes ASan to treat the memory access in the callee as an overflow access to one of the locals stored in that frame.
>> >> Accessing the callee to read `frame::metadata_words_at_bottom` is only necessary when the top frame is compiled. In that case, the callee is `Continuation.doYield` and those words contain the return pc and fp of the top frame we are freezing, where possibly fp might contain an oop. For the preemption cases mentioned above though, the return pc used for the top frame is always the one from the anchor, not the stack. Also, there is never a callee saved value in fp that needs to be preserved, the fp is always constructed again when thawing the frame. But because of sharing the code path with the compiled frame case we also read these words. I've added the full details on where these memory accesses occur in the freeze code, both slow and fast paths, in the bug comments. >> >> The fix is to adjust these shared paths to avoid the read in the preemption case. I was able to reproduce the issue and have verified it is now fixed. I also ran the patch through mach5 tiers1-7. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > address David's comments Thanks for the review David! ------------- PR Comment: https://git.openjdk.org/jdk/pull/26660#issuecomment-3169060162 From dlong at openjdk.org Fri Aug 8 20:15:15 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 8 Aug 2025 20:15:15 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 In-Reply-To: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Thu, 7 Aug 2025 15:49:20 GMT, Ruben wrote: > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob.
> > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. This looks good. How much testing have you done? Maybe we can get rid of CodeOffsets::DeoptMH next. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26678#issuecomment-3169173960 From duke at openjdk.org Fri Aug 8 21:43:23 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 21:43:23 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v9] In-Reply-To: References: Message-ID: > In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. > > These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. 
> > > linux-x64 GCC > master real 81.39 user 3352.15 sys 287.49 > JDK-8365053 real 81.94 user 3030.24 sys 295.82 > > linux-x64 Clang > master real 43.44 user 2082.93 sys 130.70 > JDK-8365053 real 38.44 user 1723.80 sys 117.68 > > linux-aarch64 GCC > master real 1188.08 user 2015.22 sys 175.53 > JDK-8365053 real 1019.85 user 1667.45 sys 171.86 > > linux-aarch64 clang > master real 981.77 user 1645.05 sys 118.60 > JDK-8365053 real 791.96 user 1262.92 sys 101.50 Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: distinct ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26681/files - new: https://git.openjdk.org/jdk/pull/26681/files/71950aed..3116b39c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=07-08 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26681.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26681/head:pull/26681 PR: https://git.openjdk.org/jdk/pull/26681 From duke at openjdk.org Fri Aug 8 21:43:24 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 21:43:24 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v8] In-Reply-To: <57LBOn04BwPwiJdf_UUqgud4h9qhj38ga2bT56akoIc=.6dcf1641-1d82-46a6-b157-1fb7a33b2342@github.com> References: <57LBOn04BwPwiJdf_UUqgud4h9qhj38ga2bT56akoIc=.6dcf1641-1d82-46a6-b157-1fb7a33b2342@github.com> Message-ID: On Fri, 8 Aug 2025 16:28:15 GMT, Quan Anh Mai wrote: >> `distinct()` has some overhead, do we need it? Do we expect duplicate headers in a file? > > It's not like stream is the most performant method to do anything anyway. I think the clarification is worth the additional overhead, especially when this file is not run very frequently. 
3116b39c5dc812c1bfb0994bda3a82c9f66870af ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2264028292 From duke at openjdk.org Fri Aug 8 22:11:32 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 22:11:32 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: Message-ID: > In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. > > These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. > > > linux-x64 GCC > master real 81.39 user 3352.15 sys 287.49 > JDK-8365053 real 81.94 user 3030.24 sys 295.82 > > linux-x64 Clang > master real 43.44 user 2082.93 sys 130.70 > JDK-8365053 real 38.44 user 1723.80 sys 117.68 > > linux-aarch64 GCC > master real 1188.08 user 2015.22 sys 175.53 > JDK-8365053 real 1019.85 user 1667.45 sys 171.86 > > linux-aarch64 clang > master real 981.77 user 1645.05 sys 118.60 > JDK-8365053 real 791.96 user 1262.92 sys 101.50 Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: - magic number: 400 - inline ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26681/files - new: https://git.openjdk.org/jdk/pull/26681/files/3116b39c..be25d341 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26681&range=08-09 Stats: 188 lines in 1 file changed: 0 ins; 176 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/26681.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26681/head:pull/26681 PR: https://git.openjdk.org/jdk/pull/26681 From 
duke at openjdk.org Fri Aug 8 22:17:12 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Fri, 8 Aug 2025 22:17:12 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v8] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 17:31:41 GMT, Erik Joelsson wrote: > The latest version fails for me. I'm guessing because we exclude shenandoah by default for OracleJDK builds so having those headers in precompiled.hpp messes things up. I think we need to be careful with headers that are part of optional features. The last commit (be25d3413b432b56e9789eae55920f1862008911) should fix the problem. I made a build with the following configuration: ./configure \ --with-toolchain-type=clang \ --disable-jvm-feature-shenandoahgc \ --disable-jvm-feature-zero \ --disable-jvm-feature-zgc \ --disable-jvm-feature-g1gc \ --disable-jvm-feature-parallelgc \ --disable-jvm-feature-epsilongc and then I ran the script on this build to extract the [inclusions count](https://github.com/user-attachments/files/21692915/inclusions_count.txt). This could be included in the README file attached to this PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3169427612 From sangheki at openjdk.org Fri Aug 8 22:36:11 2025 From: sangheki at openjdk.org (Sangheon Kim) Date: Fri, 8 Aug 2025 22:36:11 GMT Subject: RFR: 8364767: G1: Remove use of CollectedHeap::_soft_ref_policy In-Reply-To: References: Message-ID: <9HFVLrbzrIH4ffJz2q7YFwn7qGp008aA0b74sRnyMTM=.fd78fb39-dd8b-478e-be91-610e519870cd@github.com> On Tue, 5 Aug 2025 18:18:34 GMT, Albert Mingkun Yang wrote: > Use gc-cause checking in `VM_G1CollectFull` to decide soft-ref policy. > > Test: tier1-3 LGTM ------------- Marked as reviewed by sangheki (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26648#pullrequestreview-3102265382 From dlong at openjdk.org Sat Aug 9 00:39:16 2025 From: dlong at openjdk.org (Dean Long) Date: Sat, 9 Aug 2025 00:39:16 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 In-Reply-To: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Thu, 7 Aug 2025 15:49:20 GMT, Ruben wrote: > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. > > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. Test results from Oracle are good. You need one more review. ------------- Marked as reviewed by dlong (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26678#pullrequestreview-3102350248 From dholmes at openjdk.org Sat Aug 9 02:00:19 2025 From: dholmes at openjdk.org (David Holmes) Date: Sat, 9 Aug 2025 02:00:19 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v2] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> <1hxKXbEwRFeYpwizPVqgc7A9RgisMENdhCkF6j3GpQg=.a41c0c87-3e8c-4404-a56d-5d10910f8e8a@github.com> <4kEHCplZQoA4U4JCTKpGCG8zVv3PU_8oi9YrjmKySjA=.ed88ff2d-1cdc-4332-a5d9-d6f48b89ed61@github.com> <4P-UnGAY8t-sXYcveukioiJ1rgwEhmCtJYSL7ofNevQ=.1ef67992-329a-4f7c-a0e4-d79ef6d62d8f@github.com> <25VDLj8CrrdVmydx-vou5S_-vxT_uqh5c0O VPubW7XM=.d515ad77-1b2c-4d26-ab78-554e3fe7b940@github.com> <_ESoV2tghLGuqSsmR8jYrdpoCpNpipdxtq878TT3vIc=.c21e74c2-2044-4b92-96b6-72e873302091@github.com> Message-ID: <3T7EA9zLJpHjYD6Y2bwjQu82BYrT_AU7ez9bYCAwkio=.6a7fb6e2-caed-4bd1-b271-6882ac71cd60@github.com> On Fri, 8 Aug 2025 13:32:50 GMT, Aleksey Shipilev wrote: >> It is not the "atomic" that is the issue; it is the inappropriate and unnecessary use of acquire/release semantics. When people look at this code later they will wonder what it is that we are trying to coordinate - when the answer is "nothing". That just causes confusion. Synchronization and concurrency is hard enough without obfuscating things by sprinkling the wrong kind of synchronization constructs around "just to be safe". That isn't future-proofing things. If in the future we have different synchronization requirements then those requirements need to be understood and the correct code inserted. Maybe in the future we will need a mutex - who knows. > > You say "inappropriate and unnecessary", I say "conservative". I think this is an opinion stalemate, so I'll just recuse myself from this conversation, and let someone else break this tie.
> > There is no confusion to me: we pass something between the threads without clear additional synchronization in sight (i.e. mutexes) => we do `Atomic`-s. Deciding on the memory ordering for Atomics is: unless we see the need for performance, we default to conservative (acqrel at minimum for pointers, seqcst if we are paranoid and cannot guarantee it is one-sided transfer) ordering, to avoid future accidents. > > If anything, a naked `fence()` raises many more questions for me in comparison with `Atomic` accesses. Mostly because fences are significantly more low-level than Atomics. When I look at this code from the perspective of a bystander, these are the questions that pop into my mind: Why is there only a fence on the write side? Shouldn't there be a fence on the reader side somewhere then? Maybe we are optimizing performance? Are we relying on some other (partial) synchronization? Are we stacking with some other fence? Are there arch-specific details about what fences actually do? Etc. For completeness our own conventions say we should use `Atomic::load` and `Atomic::store` to highlight the lock-free access. But only a fence provides the guarantee of visibility. Acquire/release does not. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26656#discussion_r2264193909 From aph at openjdk.org Sat Aug 9 19:03:10 2025 From: aph at openjdk.org (Andrew Haley) Date: Sat, 9 Aug 2025 19:03:10 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() In-Reply-To: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Tue, 5 Aug 2025 10:30:13 GMT, Fei Gao wrote: > In the existing implementation, the static call stub typically emits a sequence like: > `isb; movk; movz; movz; movk; movz; movz; br`.
> > This patch reimplements it using a more compact and patch-friendly sequence: > > ldr x12, Label_data > ldr x8, Label_entry > br x8 > Label_data: > 0x00000000 > 0x00000000 > Label_entry: > 0x00000000 > 0x00000000 > > The new approach places the target addresses adjacent to the code and loads them dynamically. This allows us to update the call target by modifying only the data in memory, without changing any instructions. This avoids the need for I-cache flushes or issuing an `isb`[1], which are both relatively expensive operations. > > While emitting direct branches in static stubs for small code caches can save 2 instructions compared to the new implementation, modifying those branches still requires I-cache flushes or an `isb`. This patch unifies the code generation by emitting the same static stubs for both small and large code caches. > > A microbenchmark (StaticCallStub.java) demonstrates a performance uplift of approximately 43%. > > > Benchmark (length) Mode Cnt Master Patch Units > StaticCallStubFar.callCompiled 1000 avgt 5 39.346 22.474 us/op > StaticCallStubFar.callCompiled 10000 avgt 5 390.05 218.478 us/op > StaticCallStubFar.callCompiled 100000 avgt 5 3869.264 2174.001 us/op > StaticCallStubNear.callCompiled 1000 avgt 5 39.093 22.582 us/op > StaticCallStubNear.callCompiled 10000 avgt 5 387.319 217.398 us/op > StaticCallStubNear.callCompiled 100000 avgt 5 3855.825 2206.923 us/op > > > All tests in Tier1 to Tier3, under both release and debug builds, have passed. > > [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads Try this. It might be enough to rescue this PR. 
diff --git a/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.hpp b/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.hpp
index 12b941fc4f7..29853ed4a10 100644
--- a/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.hpp
@@ -68,8 +68,8 @@ friend class ArrayCopyStub;
 
   enum {
     // call stub: CompiledDirectCall::to_interp_stub_size() +
-    //            CompiledDirectCall::to_trampoline_stub_size()
-    _call_stub_size = 13 * NativeInstruction::instruction_size,
+    //            CompiledDirectCall::to_trampoline_stub_size() + alignment nop
+    _call_stub_size = 15 * NativeInstruction::instruction_size,
     _exception_handler_size = DEBUG_ONLY(1*K) NOT_DEBUG(175),
     _deopt_handler_size = 7 * NativeInstruction::instruction_size
   };
diff --git a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
index 4285524514b..63577a55809 100644
--- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
@@ -983,22 +983,29 @@ void MacroAssembler::emit_static_call_stub() {
   // Jump to the entry point of the c2i stub.
   const int stub_start_offset = offset();
   Label far_jump_metadata, far_jump_entry;
-  ldr(rmethod, far_jump_metadata);
+  {
+    Label retry; bind(retry);
+    adr(rmethod, far_jump_metadata);
+    ldar(rmethod, (rmethod));
+    cbz(rmethod, retry);
+  }
   ldr(rscratch1, far_jump_entry);
   br(rscratch1);
+  nop();
   bind(far_jump_metadata);
-  assert(offset() - stub_start_offset == NativeStaticCallStub::far_jump_metadata_offset,
+  assert(offset() - stub_start_offset == NativeStaticCallStub::metadata_offset,
          "should be");
+  assert(is_aligned(offset(), BytesPerWord), "offset is misaligned");
   emit_int64(0);
   bind(far_jump_entry);
-  assert(offset() - stub_start_offset == NativeStaticCallStub::far_jump_entrypoint_offset,
+  assert(offset() - stub_start_offset == NativeStaticCallStub::entrypoint_offset,
          "should be");
   emit_int64(0);
 }
 
 int MacroAssembler::static_call_stub_size() {
-  // ldr; ldr; br; zero; zero; zero; zero;
-  return 7 * NativeInstruction::instruction_size;
+  // adr; ldar; cbz; ldr; br; nop; zero; zero; zero; zero;
+  return 10 * NativeInstruction::instruction_size;
 }
 
 void MacroAssembler::c2bool(Register x) {
diff --git a/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp b/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp
index 12a6eb6f7f0..5d33d94f1ae 100644
--- a/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp
@@ -393,22 +393,20 @@ void NativeCall::trampoline_jump(CodeBuffer &cbuf, address dest, JVMCI_TRAPS) {
 #ifndef PRODUCT
 // Mirror the logic in CompiledDirectCall::verify_mt_safe().
 void NativeStaticCallStub::verify_static_stub(const methodHandle& callee, address entry) {
-  intptr_t metadata = intptr_at(far_jump_metadata_offset);
-  address entrypoint = ptr_at(far_jump_entrypoint_offset);
+  intptr_t metadata = intptr_at(metadata_offset);
+  address entrypoint = ptr_at(entrypoint_offset);
   CompiledDirectCall::verify_mt_safe_helper(callee, entry, metadata, entrypoint);
 }
 #endif
 
 void NativeStaticCallStub::set_metadata_and_destination(intptr_t callee, address entry) {
-  set_intptr_at(far_jump_metadata_offset, callee);
-  set_ptr_at(far_jump_entrypoint_offset, entry);
-  OrderAccess::release();
+  set_ptr_at(entrypoint_offset, entry);
+  Atomic::release_store((intptr_t*)addr_at(metadata_offset), callee);
 }
 
 void NativeStaticCallStub::verify_instruction_sequence() {
-  if (! (nativeInstruction_at(addr_at(0))->is_ldr_literal() &&
-         nativeInstruction_at(addr_at(NativeInstruction::instruction_size))->is_ldr_literal() &&
-         nativeInstruction_at(addr_at(NativeInstruction::instruction_size * 2))->is_blr())) {
+  if (! (nativeInstruction_at(addr_at(NativeInstruction::instruction_size * 3))->is_ldr_literal() &&
+         nativeInstruction_at(addr_at(NativeInstruction::instruction_size * 4))->is_blr())) {
     fatal("Not expected instructions in static call stub");
   }
 }
diff --git a/src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp b/src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp
index 17c87bf7c2b..897152f26f1 100644
--- a/src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp
@@ -461,8 +461,8 @@ class NativeStaticCallStub : public NativeInstruction {
 
 public:
   enum AArch64_specific_constants {
-    far_jump_metadata_offset = 3 * 4,
-    far_jump_entrypoint_offset = 5 * 4
+    metadata_offset = 6 * NativeInstruction::instruction_size,
+    entrypoint_offset = 6 * NativeInstruction::instruction_size + wordSize ,
   };
 
   void set_metadata_and_destination(intptr_t callee, address entry);
------------- PR Comment: https://git.openjdk.org/jdk/pull/26638#issuecomment-3172040556 From aph at openjdk.org Sat Aug 9 19:15:10 2025 From: aph at openjdk.org (Andrew Haley) Date: Sat, 9 Aug 2025 19:15:10 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() In-Reply-To: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Tue, 5 Aug 2025 10:30:13 GMT, Fei Gao wrote: > In the existing implementation, the static call stub typically emits a sequence like: > `isb; movk; movz; movz; movk; movz; movz; br`. > > This patch reimplements it using a more compact and patch-friendly sequence: > > ldr x12, Label_data > ldr x8, Label_entry > br x8 > Label_data: > 0x00000000 > 0x00000000 > Label_entry: > 0x00000000 > 0x00000000 > > The new approach places the target addresses adjacent to the code and loads them dynamically.
This allows us to update the call target by modifying only the data in memory, without changing any instructions. This avoids the need for I-cache flushes or issuing an `isb`[1], which are both relatively expensive operations. > > While emitting direct branches in static stubs for small code caches can save 2 instructions compared to the new implementation, modifying those branches still requires I-cache flushes or an `isb`. This patch unifies the code generation by emitting the same static stubs for both small and large code caches. > > A microbenchmark (StaticCallStub.java) demonstrates a performance uplift of approximately 43%. > > > Benchmark (length) Mode Cnt Master Patch Units > StaticCallStubFar.callCompiled 1000 avgt 5 39.346 22.474 us/op > StaticCallStubFar.callCompiled 10000 avgt 5 390.05 218.478 us/op > StaticCallStubFar.callCompiled 100000 avgt 5 3869.264 2174.001 us/op > StaticCallStubNear.callCompiled 1000 avgt 5 39.093 22.582 us/op > StaticCallStubNear.callCompiled 10000 avgt 5 387.319 217.398 us/op > StaticCallStubNear.callCompiled 100000 avgt 5 3855.825 2206.923 us/op > > > All tests in Tier1 to Tier3, under both release and debug builds, have passed. > > [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads There may still be a race in `set_to_clean()`, which doesn't zero out the {method, entry} fields in the stub. I'd fix that, to be safe. It'd be tricky to test. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26638#issuecomment-3172045863 From lmesnik at openjdk.org Sat Aug 9 20:36:51 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Sat, 9 Aug 2025 20:36:51 GMT Subject: RFR: 8365192: post_meth_exit should be in vm state when calling get_jvmti_thread_state Message-ID: The method get_jvmti_thread_state() should be called only while the thread is in vm state. post_method_exit does some preparation before switching to vm state.
This causes issues if the thread needs to initialize the jvmti thread state. The fix was found using the jvmti stress agent and thus no additional regression test is required. ------------- Commit messages: - 8365192: post_meth_exit should be in vm state when calling get_jvmti_thread_state Changes: https://git.openjdk.org/jdk/pull/26713/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26713&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8365192 Stats: 7 lines in 1 file changed: 5 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26713.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26713/head:pull/26713 PR: https://git.openjdk.org/jdk/pull/26713 From aph at openjdk.org Sun Aug 10 07:27:09 2025 From: aph at openjdk.org (Andrew Haley) Date: Sun, 10 Aug 2025 07:27:09 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() In-Reply-To: References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Sat, 9 Aug 2025 19:12:14 GMT, Andrew Haley wrote: > There may still be a race in `set_to_clean()`, which doesn't zero out the {method, entry} fields in the stub. I'd fix that, to be safe. It'd be tricky to test. Ah no, that would create another race when unloading. This stuff is hard... ------------- PR Comment: https://git.openjdk.org/jdk/pull/26638#issuecomment-3172432994 From kbarrett at openjdk.org Mon Aug 11 01:23:18 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 11 Aug 2025 01:23:18 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: References: <4hk98YWMw1Mf2CTBe2Yb0ZvGXDqoDl2MqMwKvfXbzPM=.09065989-dd8f-4bdc-a2f9-83b13d658173@github.com> <50Lh-a6pTujiDCyHwPIcC5zzgiu-zP-pAiQeyu4CJcs=.e74b0e74-b87f-45d4-b441-f2d3045fd101@github.com> <5kr1ZhhyDjDuI87uSmB-Xh0rzIHq2tgnVTZXWNdHqGs=.e33307ca-794d-4419-a2b5-7ec2f6dd4c95@github.com> Message-ID: On Fri, 8 Aug 2025 18:55:07 GMT, Daniel D.
Daugherty wrote: >> Thanks for the clarification, I think that it is fine, then, although a comment explaining in the code would be great. > > I suspect that the multiple includes reported in the .d files are accurate. The .d file appears > to report the number of times we `#include` a file, however, because of the include-guard > constructs we use, the content is pulled in only once. The number of times a header is (directly or indirectly) included by a given source file doesn't affect the number of times that header appears in the .d file associated with that source file. The compiler uniquifies that set (though links might confuse things, but that's not an issue here). The .d files consist of a set of targets, and for each target, the set of dependencies. For the .d files, there's one target, .o, with its dependencies (once each). For BUILD_LIBJVM.d, there's a target for each .o file. And for each of those targets, its dependencies (once each). Essentially BUILD_LIBJVM.d includes a concatenation of all the .d files. And since there are duplicates among the dependencies of different targets, there are many(!) duplicates in this file. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2265529851 From kbarrett at openjdk.org Mon Aug 11 01:49:20 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 11 Aug 2025 01:49:20 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 22:11:32 GMT, Francesco Andreuzzi wrote: >> In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. >> >> These are the results I observed.
Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. >> >> >> linux-x64 GCC >> master real 81.39 user 3352.15 sys 287.49 >> JDK-8365053 real 81.94 user 3030.24 sys 295.82 >> >> linux-x64 Clang >> master real 43.44 user 2082.93 sys 130.70 >> JDK-8365053 real 38.44 user 1723.80 sys 117.68 >> >> linux-aarch64 GCC >> master real 1188.08 user 2015.22 sys 175.53 >> JDK-8365053 real 1019.85 user 1667.45 sys 171.86 >> >> linux-aarch64 clang >> master real 981.77 user 1645.05 sys 118.60 >> JDK-8365053 real 791.96 user 1262.92 sys 101.50 > > Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: > > - magic number: 400 > - inline > > The latest version fails for me. I'm guessing because we exclude shenandoah by default for OracleJDK builds so having those headers in precompiled.hpp messes things up. I think we need to be careful with headers that are part of optional features. > > The last commit ([be25d34](https://github.com/openjdk/jdk/commit/be25d3413b432b56e9789eae55920f1862008911)) should fix the problem. I made a build with the following configuration: > > ``` > ./configure \ > --with-toolchain-type=clang \ > --disable-jvm-feature-shenandoahgc \ > --disable-jvm-feature-zero \ > --disable-jvm-feature-zgc \ > --disable-jvm-feature-g1gc \ > --disable-jvm-feature-parallelgc \ > --disable-jvm-feature-epsilongc > ``` > > and then I ran the script on this build to extract the [inclusions count](https://github.com/user-attachments/files/21692915/inclusions_count.txt). This could be included in the README file attached to this PR. Other optional features include cds, c1, jfr, and jvmci, so files from those directories also ought to be excluded. 
The (current) list of optional features is here: https://github.com/openjdk/jdk/blame/022e29a77533aacabd56820d00ecffa9646a8362/make/autoconf/jvm-features.m4#L47-L49 cds compiler1 compiler2 dtrace epsilongc g1gc jfr jni-check \ jvmci jvmti link-time-opt management minimal opt-size parallelgc \ serialgc services shenandoahgc vm-structs zero zgc \ Not all of these have their own directories. Even where they do, the feature name and directory name aren't always the same. (The compiler1 feature and the c1 directory being an obvious example.) I didn't spot any files for other optional features in the latest list, other than the ones called out in the first sentence of this comment. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3173072518 From xgong at openjdk.org Mon Aug 11 03:10:17 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Mon, 11 Aug 2025 03:10:17 GMT Subject: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v2] In-Reply-To: References: Message-ID: <1Vs8Ud-yh7FtFJN9sddNXDVM6Mc0ue9oi_oa0w5pRzU=.022172f3-1622-4d05-888b-c7afc66a5254@github.com> On Fri, 25 Jul 2025 20:09:40 GMT, Jatin Bhateja wrote: >> Patch optimizes Vector. slice operation with constant index using x86 ALIGNR instruction. >> It also adds a new hybrid call generator to facilitate lazy intrinsification or else perform procedural inlining to prevent call overhead and boxing penalties in case the fallback implementation expects to operate over vectors. The existing vector API-based slice implementation is now the fallback code that gets inlined in case intrinsification fails. >> >> Idea here is to add infrastructure support to enable intrinsification of fast path for selected vector APIs, else enable inlining of fall-back implementation if it's based on vector APIs. 
Existing call generators like PredictedCallGenerator, used to handle bi-morphic inlining, already make use of multiple call generators to handle hit/miss scenarios for a particular receiver type. The newly added hybrid call generator is lazy and called during incremental inlining optimization. It also relieves the inline expander to handle slow paths, which can easily be implemented library side (Java). >> >> Vector API jtreg tests pass at AVX level 2, remaining validation in progress. >> >> Performance numbers: >> >> >> System : 13th Gen Intel(R) Core(TM) i3-1315U >> >> Baseline: >> Benchmark (size) Mode Cnt Score Error Units >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex1 1024 thrpt 2 9444.444 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex2 1024 thrpt 2 10009.319 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithVariableIndex 1024 thrpt 2 9081.926 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex1 1024 thrpt 2 6085.825 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex2 1024 thrpt 2 6505.378 ops/ms >> VectorSliceBenchmark.intVectorSliceWithVariableIndex 1024 thrpt 2 6204.489 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex1 1024 thrpt 2 1651.334 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex2 1024 thrpt 2 1642.784 ops/ms >> VectorSliceBenchmark.longVectorSliceWithVariableIndex 1024 thrpt 2 1474.808 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex1 1024 thrpt 2 10399.394 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex2 1024 thrpt 2 10502.894 ops/ms >> VectorSliceB... > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Updating predicate checks Thanks for your work @jatin-bhateja! This PR also helps on AArch64, where we plan to do the same intrinsification on our side.
src/hotspot/share/opto/vectorIntrinsics.cpp line 1667: > 1665: bool LibraryCallKit::inline_vector_slice() { > 1666: const TypeInt* origin = gvn().type(argument(0))->isa_int(); > 1667: const TypeInstPtr* vector_klass = gvn().type(argument(1))->isa_instptr(); Code style: Suggestion: const TypeInt* origin = gvn().type(argument(0))->isa_int(); const TypeInstPtr* vector_klass = gvn().type(argument(1))->isa_instptr(); src/hotspot/share/opto/vectorIntrinsics.cpp line 1700: > 1698: > 1699: if (!arch_supports_vector(Op_VectorSlice, num_elem, elem_bt, VecMaskNotUsed)) { > 1700: log_if_needed(" ** not supported: arity=2 op=slice vlen=%d etype=%s ismask=useload/none", `ismask=useload/none` is not necessary here? src/hotspot/share/opto/vectorIntrinsics.cpp line 1714: > 1712: } > 1713: > 1714: Node* origin_node = gvn().intcon(origin->get_con() * type2aelembytes(elem_bt)); Q1: Would it be possible to pass `origin->get_con()` directly to `VectorSliceNode`, in case some architectures need it as is? Otherwise, we'd better add a comment saying that the origin passed to `VectorSliceNode` is adjusted to bytes. Q2: If `origin` is not a constant, and there is an architecture that supports a variable index, will the code crash here? Can we limit `origin` to a constant for this intrinsification in this PR? We can consider extending it to a variable if any architecture has such a requirement. WDYT? src/hotspot/share/opto/vectornode.hpp line 1719: > 1717: class VectorSliceNode : public VectorNode { > 1718: public: > 1719: VectorSliceNode(Node* vec1, Node* vec2, Node* origin, const TypeVect* vt) Do we have specific values for `origin`, like zero or vlen? If so, maybe a simple Identity should be added as well.
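For the Identity question and the byte scaling in Q1, a scalar model may help. The sketch below is only an illustration in standalone C++, not C2 code: `slice_model` and `origin_in_bytes` are hypothetical names, and the lane semantics are assumed to follow the Vector API's `slice(origin, v2)` (lanes origin..VLENGTH-1 of the first vector, then lanes 0..origin-1 of the second). Under that assumption, origin == 0 returns the first input and origin == vlen returns the second, which are the two cases an Identity transform could fold.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar model of slice(origin, v1, v2). Illustration only, not HotSpot code.
// Result lane i is taken from the concatenation [v1 | v2] at index origin + i.
template <typename T, std::size_t N>
std::array<T, N> slice_model(std::size_t origin,
                             const std::array<T, N>& v1,
                             const std::array<T, N>& v2) {
  assert(origin <= N);  // origin is assumed to be in [0, vlen]
  std::array<T, N> result{};
  for (std::size_t i = 0; i < N; i++) {
    std::size_t src = origin + i;
    result[i] = (src < N) ? v1[src] : v2[src - N];
  }
  return result;
}

// Hypothetical helper mirroring origin->get_con() * type2aelembytes(elem_bt):
// converts a lane origin into a byte offset for a byte-granular instruction.
template <typename T>
constexpr std::size_t origin_in_bytes(std::size_t origin) {
  return origin * sizeof(T);
}
```

With four int lanes, an origin of 1 takes three lanes from the first input and one from the second, and the byte offset handed to a byte-granular align instruction would be `origin_in_bytes<int>(1)`.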
------------- PR Review: https://git.openjdk.org/jdk/pull/24104#pullrequestreview-3103877319 PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2265568519 PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2265573060 PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2265579342 PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2265580768 From xgong at openjdk.org Mon Aug 11 03:14:12 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Mon, 11 Aug 2025 03:14:12 GMT Subject: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v2] In-Reply-To: References: Message-ID: On Fri, 25 Jul 2025 20:09:40 GMT, Jatin Bhateja wrote: >> Patch optimizes Vector. slice operation with constant index using x86 ALIGNR instruction. >> It also adds a new hybrid call generator to facilitate lazy intrinsification or else perform procedural inlining to prevent call overhead and boxing penalties in case the fallback implementation expects to operate over vectors. The existing vector API-based slice implementation is now the fallback code that gets inlined in case intrinsification fails. >> >> Idea here is to add infrastructure support to enable intrinsification of fast path for selected vector APIs, else enable inlining of fall-back implementation if it's based on vector APIs. Existing call generators like PredictedCallGenerator, used to handle bi-morphic inlining, already make use of multiple call generators to handle hit/miss scenarios for a particular receiver type. The newly added hybrid call generator is lazy and called during incremental inlining optimization. It also relieves the inline expander to handle slow paths, which can easily be implemented library side (Java). >> >> Vector API jtreg tests pass at AVX level 2, remaining validation in progress. 
>> >> Performance numbers: >> >> >> System : 13th Gen Intel(R) Core(TM) i3-1315U >> >> Baseline: >> Benchmark (size) Mode Cnt Score Error Units >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex1 1024 thrpt 2 9444.444 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithConstantIndex2 1024 thrpt 2 10009.319 ops/ms >> VectorSliceBenchmark.byteVectorSliceWithVariableIndex 1024 thrpt 2 9081.926 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex1 1024 thrpt 2 6085.825 ops/ms >> VectorSliceBenchmark.intVectorSliceWithConstantIndex2 1024 thrpt 2 6505.378 ops/ms >> VectorSliceBenchmark.intVectorSliceWithVariableIndex 1024 thrpt 2 6204.489 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex1 1024 thrpt 2 1651.334 ops/ms >> VectorSliceBenchmark.longVectorSliceWithConstantIndex2 1024 thrpt 2 1642.784 ops/ms >> VectorSliceBenchmark.longVectorSliceWithVariableIndex 1024 thrpt 2 1474.808 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex1 1024 thrpt 2 10399.394 ops/ms >> VectorSliceBenchmark.shortVectorSliceWithConstantIndex2 1024 thrpt 2 10502.894 ops/ms >> VectorSliceB... > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Updating predicate checks test/micro/org/openjdk/bench/jdk/incubator/vector/VectorSliceBenchmark.java line 36: > 34: @State(Scope.Thread) > 35: @Fork(jvmArgs = {"--add-modules=jdk.incubator.vector"}) > 36: public class VectorSliceBenchmark { I remember that there are micro benchmarks for slice/unslice under `test/micro/org/openjdk/bench/jdk/incubator/vector/operation` on panama-vector. Can we reuse those JMHs to check the benchmark improvement?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24104#discussion_r2265592754 From dholmes at openjdk.org Mon Aug 11 06:24:11 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 11 Aug 2025 06:24:11 GMT Subject: RFR: 8359222: [asan] jvmti/vthread/ToggleNotifyJvmtiTest/ToggleNotifyJvmtiTest triggers stack-buffer-overflow error [v2] In-Reply-To: <2QBp85PQqypHeK4V4VbA1lZH9W3vRj_0wPfnX-Kec5c=.80f47cfc-2bd0-4c2b-babe-d0e30ea6a6b1@github.com> References: <2QBp85PQqypHeK4V4VbA1lZH9W3vRj_0wPfnX-Kec5c=.80f47cfc-2bd0-4c2b-babe-d0e30ea6a6b1@github.com> Message-ID: On Fri, 8 Aug 2025 19:12:19 GMT, Patricio Chilano Mateo wrote: >> Please review the following small change. On linux-aarch64, ASan reports an overflow error on a read that occurs when frames are copied from the stack to the heap, specifically during preemption on monitorenter and Object.wait when coming from compiled code. This read is benign and comes from reading either all (freeze fast path case) or part (freeze slow path case) of the `frame::metadata_words_at_bottom` stored in the callee frame (return pc+fp) for the top frame. In these preemption cases the callee frame is the frame of the VM native method being called (e.g. `Runtime1::monitorenter`). Now, the Arm 64-bit ABI doesn't specify a location where the frame record (return pc+fp) has to be stored within a stack frame and GCC currently chooses to save it at the top of the frame (lowest address). So this causes ASan to treat the memory access in the callee as an overflow access to one of the locals stored in that frame. >> >> Accessing the callee to read `frame::metadata_words_at_bottom` is only necessary when the top frame is compiled. In that case, the callee is `Continuation.doYield` and those words contain the return pc and fp of the top frame we are freezing, where possibly fp might contain an oop.
For the preemption cases mentioned above though, the return pc used for the top frame is always the one from the anchor, not the stack. Also, there is never a callee saved value in fp that needs to be preserved, the fp is always constructed again when thawing the frame. But because of sharing the code path with the compiled frame case we also read these words. I've added the full details on where these memory accesses occur in the freeze code, both slow and fast paths, in the bug comments. >> >> The fix is to adjust these shared paths to avoid the read in the preemption case. I was able to reproduce the issue and have verified it is now fixed. I also ran the patch through mach5 tiers1-7. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > address David's comments Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26660#pullrequestreview-3104168492 From dholmes at openjdk.org Mon Aug 11 06:29:13 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 11 Aug 2025 06:29:13 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: Message-ID: <-jY8BYznUPdJuRtoV-JSSDwa96qtMRN3BqOlms5Mkuk=.83b456e1-074b-462f-b621-9a2ea57495ad@github.com> On Mon, 11 Aug 2025 01:46:43 GMT, Kim Barrett wrote: > Other optional features include cds, c1, jfr, and jvmci, so files from those directories also ought to be excluded. Can't these just be conditionalized with an INCLUDE_xxx check instead of being excluded?
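As a rough illustration of the `INCLUDE_xxx` idea raised above, the sketch below shows the general shape of feature-guarded sections in a header. It is not taken from the PR: the macro values are set locally as assumptions, the `*_headers_precompiled` constants stand in for real feature-specific `#include` lines, and `precompiled_feature_groups()` is a hypothetical helper that only exists to make the effect observable.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumption: stand-ins for HotSpot-style feature macros that are defined
// as 0 or 1. In a real build they come from the build configuration; the
// values below are chosen only for this example.
#ifndef INCLUDE_CDS
#define INCLUDE_CDS 1
#endif
#ifndef INCLUDE_JFR
#define INCLUDE_JFR 0
#endif

// Headers needed by every configuration would be included unconditionally.

#if INCLUDE_CDS
// A real precompiled header would place CDS-only #include lines here.
static const bool cds_headers_precompiled = true;
#else
static const bool cds_headers_precompiled = false;
#endif

#if INCLUDE_JFR
// A real precompiled header would place JFR-only #include lines here.
static const bool jfr_headers_precompiled = true;
#else
static const bool jfr_headers_precompiled = false;
#endif

// Hypothetical helper: lists the optional groups that were compiled in.
std::vector<std::string> precompiled_feature_groups() {
  std::vector<std::string> groups;
  if (cds_headers_precompiled) groups.push_back("cds");
  if (jfr_headers_precompiled) groups.push_back("jfr");
  assert(groups.size() <= 2);  // only two optional groups exist in this sketch
  return groups;
}
```

With the assumed values, only the "cds" group is reported; a configuration that disables a feature simply skips the guarded section instead of failing on a header that is not valid for that build.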
------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3173410334 From jbhateja at openjdk.org Mon Aug 11 06:39:57 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 11 Aug 2025 06:39:57 GMT Subject: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v3] In-Reply-To: References: Message-ID: > Patch optimizes the Vector.slice operation with constant index using the x86 ALIGNR instruction. > It also adds a new hybrid call generator to facilitate lazy intrinsification or else perform procedural inlining to prevent call overhead and boxing penalties in case the fallback implementation expects to operate over vectors. The existing vector API-based slice implementation is now the fallback code that gets inlined in case intrinsification fails. > > The idea here is to add infrastructure support to enable intrinsification of the fast path for selected vector APIs, else enable inlining of the fall-back implementation if it's based on vector APIs. Existing call generators like PredictedCallGenerator, used to handle bi-morphic inlining, already make use of multiple call generators to handle hit/miss scenarios for a particular receiver type. The newly added hybrid call generator is lazy and called during incremental inlining optimization. It also relieves the inline expander of handling slow paths, which can easily be implemented library side (Java). > > Vector API jtreg tests pass at AVX level 2, remaining validation in progress.
> > Performance numbers: > > > System : 13th Gen Intel(R) Core(TM) i3-1315U > > Baseline: > Benchmark (size) Mode Cnt Score Error Units > VectorSliceBenchmark.byteVectorSliceWithConstantIndex1 1024 thrpt 2 9444.444 ops/ms > VectorSliceBenchmark.byteVectorSliceWithConstantIndex2 1024 thrpt 2 10009.319 ops/ms > VectorSliceBenchmark.byteVectorSliceWithVariableIndex 1024 thrpt 2 9081.926 ops/ms > VectorSliceBenchmark.intVectorSliceWithConstantIndex1 1024 thrpt 2 6085.825 ops/ms > VectorSliceBenchmark.intVectorSliceWithConstantIndex2 1024 thrpt 2 6505.378 ops/ms > VectorSliceBenchmark.intVectorSliceWithVariableIndex 1024 thrpt 2 6204.489 ops/ms > VectorSliceBenchmark.longVectorSliceWithConstantIndex1 1024 thrpt 2 1651.334 ops/ms > VectorSliceBenchmark.longVectorSliceWithConstantIndex2 1024 thrpt 2 1642.784 ops/ms > VectorSliceBenchmark.longVectorSliceWithVariableIndex 1024 thrpt 2 1474.808 ops/ms > VectorSliceBenchmark.shortVectorSliceWithConstantIndex1 1024 thrpt 2 10399.394 ops/ms > VectorSliceBenchmark.shortVectorSliceWithConstantIndex2 1024 thrpt 2 10502.894 ops/ms > VectorSliceBenchmark.shortVectorSliceWithVariableIndex 1024 ... Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains seven commits: - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8303762 - Updating predicate checks - Fixes for failing regressions - Optimizing AVX2 backend and some re-factoring - new benchmark - Merge branch 'master' of https://github.com/openjdk/jdk into JDK-8303762 - 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction ------------- Changes: https://git.openjdk.org/jdk/pull/24104/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24104&range=02 Stats: 747 lines in 32 files changed: 664 ins; 0 del; 83 mod Patch: https://git.openjdk.org/jdk/pull/24104.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24104/head:pull/24104 PR: https://git.openjdk.org/jdk/pull/24104 From dfenacci at openjdk.org Mon Aug 11 07:31:12 2025 From: dfenacci at openjdk.org (Damon Fenacci) Date: Mon, 11 Aug 2025 07:31:12 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 In-Reply-To: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Thu, 7 Aug 2025 15:49:20 GMT, Ruben wrote: > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. > > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. Thanks for "cleaning" this @ruben-arm. Did you run some testing (on the touched platforms)? I doubt that there is anything perceivable but, out of curiosity, did you notice any performance change? 
src/hotspot/cpu/s390/s390.ad line 1652: > 1650: public: > 1651: > 1652: static int emit_exception_handler(C2_MacroAssembler *masm); Is this declaration still needed? src/hotspot/share/code/nmethod.cpp line 1486: > 1484: } > 1485: > 1486: assert(offsets->value(CodeOffsets::Deopt ) != -1, "must be set"); It might be good to leave this line where it was at the beginning of the block, just to avoid one more diff. ------------- PR Review: https://git.openjdk.org/jdk/pull/26678#pullrequestreview-3104246623 PR Review Comment: https://git.openjdk.org/jdk/pull/26678#discussion_r2265844527 PR Review Comment: https://git.openjdk.org/jdk/pull/26678#discussion_r2265878970 From ihse at openjdk.org Mon Aug 11 08:02:20 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 11 Aug 2025 08:02:20 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: Message-ID: <-U8KN5uNIUhrTn2MCESloqat2y6ZFvgxK-bsmqIDFFI=.ab50a73b-80bd-49ec-aac9-b633cc92b915@github.com> On Fri, 8 Aug 2025 22:11:32 GMT, Francesco Andreuzzi wrote: >> In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. >> >> These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. 
>> >> >> linux-x64 GCC >> master real 81.39 user 3352.15 sys 287.49 >> JDK-8365053 real 81.94 user 3030.24 sys 295.82 >> >> linux-x64 Clang >> master real 43.44 user 2082.93 sys 130.70 >> JDK-8365053 real 38.44 user 1723.80 sys 117.68 >> >> linux-aarch64 GCC >> master real 1188.08 user 2015.22 sys 175.53 >> JDK-8365053 real 1019.85 user 1667.45 sys 171.86 >> >> linux-aarch64 clang >> master real 981.77 user 1645.05 sys 118.60 >> JDK-8365053 real 791.96 user 1262.92 sys 101.50 > > Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: > > - magic number: 400 > - inline There is a handy shortcut for creating a build with all optional features disabled, `--with-jvm-variants=custom`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3173648190 From kbarrett at openjdk.org Mon Aug 11 08:02:20 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 11 Aug 2025 08:02:20 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: <-jY8BYznUPdJuRtoV-JSSDwa96qtMRN3BqOlms5Mkuk=.83b456e1-074b-462f-b621-9a2ea57495ad@github.com> References: <-jY8BYznUPdJuRtoV-JSSDwa96qtMRN3BqOlms5Mkuk=.83b456e1-074b-462f-b621-9a2ea57495ad@github.com> Message-ID: On Mon, 11 Aug 2025 06:26:04 GMT, David Holmes wrote: > > Other optional features include cds, c1, jfr, and jvmci, so files from those directories also ought to be excluded. > > Can't these just be conditionalized with an INCLUDE_xxx check instead of being excluded? Yeah, that might be better than excluding altogether. But still need to figure out which files to (conditionally) exclude. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3173654009 From ihse at openjdk.org Mon Aug 11 08:02:21 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 11 Aug 2025 08:02:21 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 06:32:57 GMT, Kim Barrett wrote: >> Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: >> >> - magic number: 400 >> - inline > > src/hotspot/share/precompiled/precompiled.hpp line 70: > >> 68: #include "utilities/ticks.hpp" >> 69: >> 70: #ifdef TARGET_COMPILER_visCPP > > All of the reported testing was on Linux. These were included specifically because measurements > said their inclusion here was beneficial. And my recollection from previous discussions is that > Visual Studio may be the compiler where precompiled headers are most beneficial. Indeed, Visual Studio is the place where PCH is most needed. I see Erik says he tested on Windows with no difference. While he concluded that this means no regression, I see it as a missed opportunity. Giving the Windows platform a bit of extra love can probably increase compilation speed where it is needed the most. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2265948776 From ihse at openjdk.org Mon Aug 11 08:09:15 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 11 Aug 2025 08:09:15 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency In-Reply-To: References: Message-ID: On Thu, 7 Aug 2025 19:55:35 GMT, Francesco Andreuzzi wrote: > Also, maybe check in the generation script as well? I think a subfolder here would be fine: https://github.com/openjdk/jdk/tree/master/src/utils. @shipilev I'm not sure this is a proper place. 
The other tools in `src/utils` are stand-alone utilities that could be useful to end users, like hsdis and IdealGraphVisualizer. In contrast, tools that are useful for developers are put either into `bin` or `make/scripts`. Normally, I think `bin` is what would make most sense, and the latter is mostly a legacy result of people using `make` as a `misc` back in the bad old hg-forest days, but in this particular case I think it might be the best placement. This is after all clearly about optimizing the build speed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3173680429 From ihse at openjdk.org Mon Aug 11 08:13:12 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 11 Aug 2025 08:13:12 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: <-jY8BYznUPdJuRtoV-JSSDwa96qtMRN3BqOlms5Mkuk=.83b456e1-074b-462f-b621-9a2ea57495ad@github.com> Message-ID: On Mon, 11 Aug 2025 07:59:02 GMT, Kim Barrett wrote: > But still need to figure out which files to (conditionally) exclude. Maybe first make a baseline build with no optional features, and analyze the result with this tool. And then enable each individual feature, re-build and re-analyze, and see how it differs. If a file then makes it to the PCH list, it is useful when enabling that feature, and it is potentially only correct to include with that feature enabled, so it could be added with an include guard. That will require a bit more manual work (unless this process too can be automated), but it will probably give the best results. 
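The per-feature procedure described above can be sketched as a set difference between two candidate lists. This is only an illustration under stated assumptions: `HeaderSet`, `feature_only_headers` and `guarded_includes` are hypothetical names, and the candidate sets are assumed to come from running the frequency analysis once on a baseline build and once with a single feature enabled.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical type: header paths that the frequency analysis selected as
// PCH candidates for one build configuration.
using HeaderSet = std::set<std::string>;

// Headers that are candidates with the feature enabled but not in the
// baseline build; these are the ones that would need an include guard.
HeaderSet feature_only_headers(const HeaderSet& baseline,
                               const HeaderSet& with_feature) {
  HeaderSet result;
  for (const std::string& header : with_feature) {
    if (baseline.count(header) == 0) {
      result.insert(header);
    }
  }
  return result;
}

// Formats the guarded block that could be pasted into a precompiled header.
// Returns no lines at all when the feature adds no candidates.
std::vector<std::string> guarded_includes(const std::string& guard_macro,
                                          const HeaderSet& headers) {
  std::vector<std::string> lines;
  if (headers.empty()) {
    return lines;
  }
  lines.push_back("#if " + guard_macro);
  for (const std::string& header : headers) {
    lines.push_back("#include \"" + header + "\"");
  }
  lines.push_back("#endif // " + guard_macro);
  return lines;
}
```

Repeating this for each optional feature would yield one guarded block per feature, at the cost of one extra build and analysis run per feature.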
------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3173692668 From duke at openjdk.org Mon Aug 11 08:20:15 2025 From: duke at openjdk.org (Ruben) Date: Mon, 11 Aug 2025 08:20:15 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 In-Reply-To: References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Mon, 11 Aug 2025 06:57:52 GMT, Damon Fenacci wrote: >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > src/hotspot/cpu/s390/s390.ad line 1652: > >> 1650: public: >> 1651: >> 1652: static int emit_exception_handler(C2_MacroAssembler *masm); > > Is this declaration still needed? No, it should be removed as well - I will update the patch. > src/hotspot/share/code/nmethod.cpp line 1486: > >> 1484: } >> 1485: >> 1486: assert(offsets->value(CodeOffsets::Deopt ) != -1, "must be set"); > > It might be good to leave this line where it was at the beginning of the block, just to avoid one more diff. Sure, I will update this. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26678#discussion_r2265989794 PR Review Comment: https://git.openjdk.org/jdk/pull/26678#discussion_r2265989898 From ihse at openjdk.org Mon Aug 11 08:21:22 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 11 Aug 2025 08:21:22 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v3] In-Reply-To: References: <7fGzfcVu-qJat5d53jOHN6hJmLKT26IKA_Q6Z9LySVc=.8225de02-89b8-4523-9aaf-9a402a2d7730@github.com> Message-ID: On Fri, 8 Aug 2025 15:31:18 GMT, Francesco Andreuzzi wrote: >> You need to modify the comment here, too. > > Right, I'll wait just a bit so the discussion on how to approach the problem stabilizes :) This number is based on my original experimentation. If I tried to lower the bar by lowering the number, the PCH list grew too much and it made for worse performance. Conversely, if I raised the number, fewer files were included, which made the PCH quicker to process but less helpful. That number was the optimum I found. It seems from https://bugs.openjdk.org/browse/JDK-8365053 that this is still around the optimum. However, rather than just mentioning the number in the comment, the rationale could be specified, like `the optimum number of includes`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26681#discussion_r2265991618 From duke at openjdk.org Mon Aug 11 08:41:12 2025 From: duke at openjdk.org (Ruben) Date: Mon, 11 Aug 2025 08:41:12 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 In-Reply-To: References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: <0JLBGmyrRt-y1IHz5j-UYkiXutQ1FgrOqeO0gQTFhcs=.fb1ff8ce-67d9-4963-9adb-758e17b66f98@github.com> On Fri, 8 Aug 2025 20:12:08 GMT, Dean Long wrote: >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob.
This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > This looks good. How much testing have you done? > > Maybe we can get rid of CodeOffsets::DeoptMH next. Thank you for the reviews, @dean-long, @dafedafe, > How much testing have you done? > Did you run some testing (on the touched platforms)? I've run tier1-tier3 tests, however only on AArch64 and x86-64. I can run more tests on these platforms if that might be useful. > I doubt that there is anything perceivable but, out of curiosity, did you notice any performance change? I've not measured performance impact of the patch separately. It is a part of a bigger effort to reduce memory footprint of compiled code. The cumulative effect from this and similar patches is expected to be a noticeable performance improvement, however I wouldn't expect a significant observable effect from this patch only. The decrease for C2-compiled code on AArch64 should be 4-12 bytes (depends on size of code cache) per nmethod, however the nmethod's alignment requirement might hide some of these improvements. I'm also exploring possibility to reduce footprint of the deoptimization handler stub codes. > Maybe we can get rid of CodeOffsets::DeoptMH next. I will look into this. I have a proof-of-concept patch reducing each deoptimization handler stub code to 1 instruction each on AArch64 - that instruction traps and the rest of the deoptimization handler logic continues in signal handler. However, I would agree it would be preferable to remove the stub code completely if possible. 
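The remark above that alignment may hide part of a 4-12 byte saving can be made concrete with a little arithmetic. The sketch below is not HotSpot code: `align_up` and `aligned_saving` are hypothetical helpers, and the 16-byte alignment used in the usage note is an assumption picked for illustration, not the actual nmethod alignment.

```cpp
#include <cassert>
#include <cstddef>

// Rounds size up to the next multiple of alignment.
// Assumption: alignment is a power of two.
inline std::size_t align_up(std::size_t size, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (size + alignment - 1) & ~(alignment - 1);
}

// Bytes actually saved once both the old and the new size are rounded up
// to the alignment, when `removed` bytes are dropped from `size` bytes.
inline std::size_t aligned_saving(std::size_t size, std::size_t removed,
                                  std::size_t alignment) {
  assert(removed <= size);
  return align_up(size, alignment) - align_up(size - removed, alignment);
}
```

Under the assumed 16-byte alignment, removing 8 bytes from a 1000-byte blob frees a whole 16-byte unit (1008 down to 992), while removing 8 bytes from a 1008-byte blob frees nothing, because 1000 still rounds up to 1008. The observable saving therefore depends on where each size falls relative to an alignment boundary.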
------------- PR Comment: https://git.openjdk.org/jdk/pull/26678#issuecomment-3173773194 From duke at openjdk.org Mon Aug 11 08:59:16 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Mon, 11 Aug 2025 08:59:16 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: <-jY8BYznUPdJuRtoV-JSSDwa96qtMRN3BqOlms5Mkuk=.83b456e1-074b-462f-b621-9a2ea57495ad@github.com> Message-ID: <0DhdCTNjwi_HGzvp9ybEEikGSHvqSvWTBP4PIIbdzEI=.7bbfa37f-be55-4925-9cc9-40850ba1f54c@github.com> On Mon, 11 Aug 2025 08:10:09 GMT, Magnus Ihse Bursie wrote: > > But still need to figure out which files to (conditionally) exclude. > > Maybe first make a baseline build with no optional features, and analyze the result with this tool. And then enable each individual feature, re-build and re-analyze, and see how it differs. If a file then makes it to the PCH list, it is useful when enabling that feature, and it is potentially only correct to include with that feature enabled, so it could be added with an include guard. > > That will require a bit more manual work (unless this process too can be automated), but it will probably give the best results. Hi @magicus, thanks for the hint. I'll give it a shot and post the results here! ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3173824367 From ihse at openjdk.org Mon Aug 11 09:39:21 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 11 Aug 2025 09:39:21 GMT Subject: RFR: 8361950: Update to use jtreg 8 In-Reply-To: References: Message-ID: On Fri, 11 Jul 2025 09:05:40 GMT, Christian Stein wrote: > Please review the change to update to using jtreg 8. > > The primary change is to the `jib-profiles.js` file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the requiredVersion has been updated in the various `TEST.ROOT` files. Marked as reviewed by ihse (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/26261#pullrequestreview-3104868381 From ayang at openjdk.org Mon Aug 11 09:41:15 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 11 Aug 2025 09:41:15 GMT Subject: RFR: 8364767: G1: Remove use of CollectedHeap::_soft_ref_policy In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 18:18:34 GMT, Albert Mingkun Yang wrote: > Use gc-cause checking in `VM_G1CollectFull` to decide soft-ref policy. > > Test: tier1-3 Thanks for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26648#issuecomment-3173964461 From ayang at openjdk.org Mon Aug 11 09:45:26 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 11 Aug 2025 09:45:26 GMT Subject: Integrated: 8364767: G1: Remove use of CollectedHeap::_soft_ref_policy In-Reply-To: References: Message-ID: On Tue, 5 Aug 2025 18:18:34 GMT, Albert Mingkun Yang wrote: > Use gc-cause checking in `VM_G1CollectFull` to decide soft-ref policy. > > Test: tier1-3 This pull request has now been integrated. Changeset: 0c39228e Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/0c39228ec1c8c6eadafb54567c94ad5f19f27f7a Stats: 42 lines in 6 files changed: 4 ins; 35 del; 3 mod 8364767: G1: Remove use of CollectedHeap::_soft_ref_policy Reviewed-by: tschatzl, sangheki ------------- PR: https://git.openjdk.org/jdk/pull/26648 From ayang at openjdk.org Mon Aug 11 10:57:10 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 11 Aug 2025 10:57:10 GMT Subject: RFR: 8352067: Remove the NMT treap and replace its uses with the utilities red-black tree [v2] In-Reply-To: References: Message-ID: <_64fHkIUSnCgZRdwphFKm6LqfUaSv_NKzd0-ivH3nEw=.4fc83148-7758-4ef7-969c-816aa0750a92@github.com> On Wed, 6 Aug 2025 11:07:42 GMT, Casper Norrbin wrote: >> Hi everyone, >> >> The utilities red-black tree and the NMT treap serve similar functions. 
Given the red-black tree's versatility and stricter time complexity, the treap can be removed in favour of it. >> >> I made some modifications to the red-black tree to make it compatible with previous treap usages: >> - Updated the `visit_in_order` and `visit_range_in_order` functions to require the supplied callback to return a bool, which allows us to stop traversing early. >> - Improved const-correctness by ensuring that invoking these functions on a const reference provides const pointers to nodes, while non-const references provide mutable pointers. Previously the two functions behaved differently. >> >> Changes to NMT include: >> - Modified components to align with the updated const-correctness of the red-black tree functions >> - Renamed structures and variables to remove "treap" from their names to reflect the new tree >> >> The treap was also used in one place in C2. I changed this to use the red-black tree and its cursor interface, which I felt was most fitting for the use case. >> >> Testing: >> - Oracle tiers 1-3 > > Casper Norrbin has updated the pull request incrementally with one additional commit since the last revision: > > feedback fixes Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26655#pullrequestreview-3105279452 From adinn at openjdk.org Mon Aug 11 11:08:11 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 11 Aug 2025 11:08:11 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 In-Reply-To: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Thu, 7 Aug 2025 15:49:20 GMT, Ruben wrote: > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. 
> > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. Looks ok to me. ------------- Marked as reviewed by adinn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26678#pullrequestreview-3105309380 From redestad at openjdk.org Mon Aug 11 11:14:20 2025 From: redestad at openjdk.org (Claes Redestad) Date: Mon, 11 Aug 2025 11:14:20 GMT Subject: RFR: 8361842: Move input validation checks to Java for java.lang.StringCoding intrinsics [v13] In-Reply-To: References: <-o9qy_Jsa4tvizzy8WgngfPuz9tx8Cwcztzl9AXiB70=.9ddb14d6-ee47-448d-a5a5-9c9ac4db3e0c@github.com> Message-ID: <7vrt6sWpiQUWUchiSTMUByRbe8q1Tcd_VcfkJP4MaB0=.3ee3b5f9-7188-446b-bd9a-12233fda0488@github.com> On Fri, 25 Jul 2025 07:40:46 GMT, Volkan Yazici wrote: >> Validate input in `java.lang.StringCoding` intrinsic Java wrappers, improve their documentation, enhance the checks in the associated IR or assembly code, and adapt them to cause VM crash on invalid input. >> >> ## Implementation notes >> >> The goal of the associated umbrella issue [JDK-8156534](https://bugs.openjdk.org/browse/JDK-8156534) is to, for `java.lang.String*` classes, >> >> 1. Move `@IntrinsicCandidate`-annotated `public` methods[1] (in Java code) to `private` ones, and wrap them with a `public` ["front door" method](https://github.com/openjdk/jdk/pull/24982#discussion_r2087493446) >> 2. Since we moved the `@IntrinsicCandidate` annotation to a new method, intrinsic mappings (i.e., associated `do_intrinsic()` calls in `vmIntrinsics.hpp`) need to be updated too >> 3. Add necessary input validation (range, null, etc.) checks to the newly created public front door method >> 4. Place all input validation checks in the intrinsic code (add if missing!)
behind a `VerifyIntrinsicChecks` VM flag >> >> Following preliminary work needs to be carried out as well: >> >> 1. Add a new `VerifyIntrinsicChecks` VM flag >> 2. Update `generate_string_range_check` to produce a `HaltNode`. That is, crash the VM if `VerifyIntrinsicChecks` is set and a Java wrapper fails to spot an invalid input. >> >> 1 `@IntrinsicCandidate`-annotated constructors are not subject to this change, since they are a special case. >> >> ## Functional and performance tests >> >> - `tier1` (which includes `test/hotspot/jtreg/compiler/intrinsics/string`) passes on several platforms. Further tiers will be executed after integrating reviewer feedback. >> >> - Performance impact is still actively monitored using `test/micro/org/openjdk/bench/java/lang/String{En,De}code.java`, among other tests. If you have suggestions on benchmarks, please share in the comments. >> >> ## Verification of the VM crash >> >> I've tested the VM crash scenario as follows: >> >> 1. Created the following test program: >> >> public class StrIntri { >> public static void main(String[] args) { >> Exception lastException = null; >> for (int i = 0; i < 1_000_000; i++) { >> try { >> jdk.internal.access.SharedSecrets.getJavaLangAccess().countPositives(new byte[]{1,2,3}, 2, 5); >> } catch (Exception exception) { >> lastException = exception; >> } >> } >> if (lastException != null) { >> lastException.printStackTrace... > > Volkan Yazici has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 31 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into strIntrinCheck > - Replace `requireNonNull` with implicit null checks to reduce bytecode size > - Add `@bug` tags > - Improve wording of `@param len` > - Make source array bound checks lenient too > - Cap destination array bounds > - Fix bit shifting > - Remove superseded `@throws` Javadoc > - Merge remote-tracking branch 'upstream/master' into strIntrinCheck > - Make `StringCoding` encoding intrinsics lenient > - ... and 21 more: https://git.openjdk.org/jdk/compare/9a1823bd...c322f0e0 src/hotspot/share/opto/library_call.cpp line 964: > 962: > 963: if (bailout->req() > 1) { > 964: bailout = _gvn.transform(bailout)->as_Region(); Should this be done only for the case where it's needed, i.e., in the `if (halt)` branch? Not sure it has any side-effect but it seems prudent not to change the default code more than necessary. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25998#discussion_r2266374676 From cstein at openjdk.org Mon Aug 11 11:35:12 2025 From: cstein at openjdk.org (Christian Stein) Date: Mon, 11 Aug 2025 11:35:12 GMT Subject: RFR: 8361950: Update to use jtreg 8 [v2] In-Reply-To: References: Message-ID: > Please review the change to update to using jtreg 8. > > The primary change is to the `jib-profiles.js` file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the requiredVersion has been updated in the various `TEST.ROOT` files. 
Christian Stein has updated the pull request incrementally with one additional commit since the last revision: Update to use correct library directories See also: https://openjdk.org/jtreg/faq.html#how-do-i-find-the-path-for-the-testng-or-junit-jar-files ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26261/files - new: https://git.openjdk.org/jdk/pull/26261/files/bf7b033b..b5112c32 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26261&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26261&range=00-01 Stats: 30 lines in 2 files changed: 8 ins; 19 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/26261.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26261/head:pull/26261 PR: https://git.openjdk.org/jdk/pull/26261 From cstein at openjdk.org Mon Aug 11 11:35:13 2025 From: cstein at openjdk.org (Christian Stein) Date: Mon, 11 Aug 2025 11:35:13 GMT Subject: RFR: 8361950: Update to use jtreg 8 In-Reply-To: References: Message-ID: <45gc5UsWc5d6aePWF_7SSX0ERJTTvklTXBukRdjqhCc=.1cf0046e-b697-4fce-9751-e8eade48a49e@github.com> On Fri, 11 Jul 2025 09:05:40 GMT, Christian Stein wrote: > Please review the change to update to using jtreg 8. > > The primary change is to the `jib-profiles.js` file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the requiredVersion has been updated in the various `TEST.ROOT` files. I updated both tests to be agnostic about the location to where `jtreg` puts library classes into. 
See also: https://openjdk.org/jtreg/faq.html#how-do-i-find-the-path-for-the-testng-or-junit-jar-files ------------- PR Comment: https://git.openjdk.org/jdk/pull/26261#issuecomment-3174373256 From ayang at openjdk.org Mon Aug 11 12:01:12 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 11 Aug 2025 12:01:12 GMT Subject: RFR: 8364819: Post-integration cleanups for JDK-8359820 [v5] In-Reply-To: References: <52HWTyql-JmgGdaGWSLOifQ681Z-U6wNvUp1Nxk07Rs=.eabdb8c2-47f0-46f6-b0e1-e3b67dfc40f1@github.com> Message-ID: On Thu, 7 Aug 2025 07:34:50 GMT, Anton Artemov wrote: >> Hi, please consider the following changes: >> >> This is a cleanup after https://github.com/openjdk/jdk/pull/26309 >> >> 1) `_handshake_timed_out_thread `and `_safepoint_timed_out_thread` are now `Thread*` and not `intptr_t`, no conversions `p2i <-> i2p` needed. >> >> 2) Added a missed brace in the error message. >> >> Trivial change. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8364819: Addressed reviewer's comments > I should have picked this up previously but volatile generally serves no purpose for inter-thread consistency. True, but I thought we denote a variable `volatile` if it can be accessed by multiple threads, mostly for documentation purpose. In this case, it should be e.g. `Thread* volatile VMError::_handshake_timed_out_thread`. > (I note that pthread_kill is not specified as a memory synchronizing operation, but I strongly suspect it has to have its own internal memory barriers that we would be piggy-backing on.) How about making the ordering explicit by using `OrderAccess::acquire/release` in reader/writer side? 
Something like:

Writer side:

    VMError::set_handshake_timed_out_thread(target);
    OrderAccess::release();
    if (os::signal_thread(target, SIGILL, "cannot be handshaked")) {

Reader side:

    if (_siginfo != nullptr && os::signal_sent_by_kill(_siginfo)) {
      OrderAccess::acquire();
      if (get_handshake_timed_out_thread() == _thread) {

> Here we have a very simple situation: > Thread 1: set field; signal target thread > Target thread: receive signal; read field

I think for this to work, we need ordering between set-field and sending-signal, and between reading-signal and read-field. Therefore, the acquire/release "operation" should be performed on the "signal" variable. (Of course, such a "signal" variable is not directly accessible; hence the suggested `OrderAccess` above.)

I have to say it's indeed a bit odd that "setters" (e.g. `set_handshake_timed_out_thread`) have `fence()` there for no obvious reason.

------------- PR Comment: https://git.openjdk.org/jdk/pull/26656#issuecomment-3174462135 From cnorrbin at openjdk.org Mon Aug 11 12:26:18 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Mon, 11 Aug 2025 12:26:18 GMT Subject: RFR: 8352067: Remove the NMT treap and replace its uses with the utilities red-black tree [v2] In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 11:07:42 GMT, Casper Norrbin wrote: >> Hi everyone, >> >> The utilities red-black tree and the NMT treap serve similar functions. Given the red-black tree's versatility and stricter time complexity, the treap can be removed in favour of it. >> >> I made some modifications to the red-black tree to make it compatible with previous treap usages: >> - Updated the `visit_in_order` and `visit_range_in_order` functions to require the supplied callback to return a bool, which allows us to stop traversing early. >> - Improved const-correctness by ensuring that invoking these functions on a const reference provides const pointers to nodes, while non-const references provide mutable pointers.
Previously the two functions behaved differently. >> >> Changes to NMT include: >> - Modified components to align with the updated const-correctness of the red-black tree functions >> - Renamed structures and variables to remove "treap" from their names to reflect the new tree >> >> The treap was also used in one place in C2. I changed this to use the red-black tree and its cursor interface, which I felt was most fitting for the use case. >> >> Testing: >> - Oracle tiers 1-3 > > Casper Norrbin has updated the pull request incrementally with one additional commit since the last revision: > > feedback fixes Thank you for the reviews! Let's ship it :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/26655#issuecomment-3174530098 From cnorrbin at openjdk.org Mon Aug 11 12:26:19 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Mon, 11 Aug 2025 12:26:19 GMT Subject: Integrated: 8352067: Remove the NMT treap and replace its uses with the utilities red-black tree In-Reply-To: References: Message-ID: On Wed, 6 Aug 2025 09:29:05 GMT, Casper Norrbin wrote: > Hi everyone, > > The utilities red-black tree and the NMT treap serve similar functions. Given the red-black tree's versatility and stricter time complexity, the treap can be removed in favour of it. > > I made some modifications to the red-black tree to make it compatible with previous treap usages: > - Updated the `visit_in_order` and `visit_range_in_order` functions to require the supplied callback to return a bool, which allows us to stop traversing early. > - Improved const-correctness by ensuring that invoking these functions on a const reference provides const pointers to nodes, while non-const references provide mutable pointers. Previously the two functions behaved differently. 
> > Changes to NMT include: > - Modified components to align with the updated const-correctness of the red-black tree functions > - Renamed structures and variables to remove "treap" from their names to reflect the new tree > > The treap was also used in one place in C2. I changed this to use the red-black tree and its cursor interface, which I felt was most fitting for the use case. > > Testing: > - Oracle tiers 1-3 This pull request has now been integrated. Changeset: 0ad919c1 Author: Casper Norrbin URL: https://git.openjdk.org/jdk/commit/0ad919c1e54895b000b58f6a1b54d79f76970845 Stats: 1016 lines in 14 files changed: 124 ins; 816 del; 76 mod 8352067: Remove the NMT treap and replace its uses with the utilities red-black tree Reviewed-by: jsjolen, ayang ------------- PR: https://git.openjdk.org/jdk/pull/26655 From redestad at openjdk.org Mon Aug 11 12:46:19 2025 From: redestad at openjdk.org (Claes Redestad) Date: Mon, 11 Aug 2025 12:46:19 GMT Subject: RFR: 8361842: Move input validation checks to Java for java.lang.StringCoding intrinsics [v13] In-Reply-To: References: <-o9qy_Jsa4tvizzy8WgngfPuz9tx8Cwcztzl9AXiB70=.9ddb14d6-ee47-448d-a5a5-9c9ac4db3e0c@github.com> Message-ID: On Fri, 25 Jul 2025 07:40:46 GMT, Volkan Yazici wrote: >> Validate input in `java.lang.StringCoding` intrinsic Java wrappers, improve their documentation, enhance the checks in the associated IR or assembly code, and adapt them to cause VM crash on invalid input. >> >> ## Implementation notes >> >> The goal of the associated umbrella issue [JDK-8156534](https://bugs.openjdk.org/browse/JDK-8156534) is to, for `java.lang.String*` classes, >> >> 1. Move `@IntrinsicCandidate`-annotated `public` methods1 (in Java code) to `private` ones, and wrap them with a `public` ["front door" method](https://github.com/openjdk/jdk/pull/24982#discussion_r2087493446) >> 2. Since we moved the `@IntrinsicCandidate` annotation to a new method, intrinsic mappings ? 
i.e., associated `do_intrinsic()` calls in `vmIntrinsics.hpp` ? need to be updated too >> 3. Add necessary input validation (range, null, etc.) checks to the newly created public front door method >> 4. Place all input validation checks in the intrinsic code (add if missing!) behind a `VerifyIntrinsicChecks` VM flag >> >> Following preliminary work needs to be carried out as well: >> >> 1. Add a new `VerifyIntrinsicChecks` VM flag >> 2. Update `generate_string_range_check` to produce a `HaltNode`. That is, crash the VM if `VerifyIntrinsicChecks` is set and a Java wrapper fails to spot an invalid input. >> >> 1 `@IntrinsicCandidate`-annotated constructors are not subject to this change, since they are a special case. >> >> ## Functional and performance tests >> >> - `tier1` (which includes `test/hotspot/jtreg/compiler/intrinsics/string`) passes on several platforms. Further tiers will be executed after integrating reviewer feedback. >> >> - Performance impact is still actively monitored using `test/micro/org/openjdk/bench/java/lang/String{En,De}code.java`, among other tests. If you have suggestions on benchmarks, please share in the comments. >> >> ## Verification of the VM crash >> >> I've tested the VM crash scenario as follows: >> >> 1. Created the following test program: >> >> public class StrIntri { >> public static void main(String[] args) { >> Exception lastException = null; >> for (int i = 0; i < 1_000_000; i++) { >> try { >> jdk.internal.access.SharedSecrets.getJavaLangAccess().countPositives(new byte[]{1,2,3}, 2, 5); >> } catch (Exception exception) { >> lastException = exception; >> } >> } >> if (lastException != null) { >> lastException.printStackTrace... > > Volkan Yazici has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 31 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into strIntrinCheck > - Replace `requireNonNull` with implicit null checks to reduce bytecode size > - Add `@bug` tags > - Improve wording of `@param len` > - Make source array bound checks lenient too > - Cap destination array bounds > - Fix bit shifting > - Remove superseded `@throws` Javadoc > - Merge remote-tracking branch 'upstream/master' into strIntrinCheck > - Make `StringCoding` encoding intrinsics lenient > - ... and 21 more: https://git.openjdk.org/jdk/compare/502f1129...c322f0e0 src/hotspot/share/opto/library_call.cpp line 946: > 944: Node* count, > 945: bool char_count, > 946: bool halt) { Could we rename this to something more descriptive such as `halt_on_oob`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25998#discussion_r2266616962 From kbarrett at openjdk.org Mon Aug 11 13:14:03 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 11 Aug 2025 13:14:03 GMT Subject: RFR: 8365245: Move size reducing operations to GrowableArrayWithAllocator Message-ID: Please review this change to GrowableArray to move size reducing operations from GrowableArrayBase and GrowableArrayView to GrowableArrayWithAllocator. See JBS for rationale for this move. This is mostly just a simple code motion change, with the moved functions being updated to account for different name lookup rules now that they are in a class with a class template base class. For the most part this refactoring didn't require any client code changes. All but one use of one moved function was with a GrowableArrayWithAllocator-derived class, so not affected by moving the functions. The one exception was a test of ZArraySlice that was modified to no longer use GA::pop(). Testing: mach5 tier1. 
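A side note on the "different name lookup rules" mentioned above: in C++, unqualified names are not looked up in a base class that depends on a template parameter, so a member function moved into a class template derived from such a base typically has to reach inherited members through `this->` (or a using-declaration). The following is only a minimal sketch of that rule. `ViewSketch` and `WithAllocatorSketch` are hypothetical stand-ins, not the actual `GrowableArrayView` or `GrowableArrayWithAllocator` code.

```cpp
#include <cassert>

// Hypothetical stand-in for a view-like base class template.
template <typename E>
class ViewSketch {
protected:
  E*  _data;
  int _len;
public:
  ViewSketch(E* data, int len) : _data(data), _len(len) {}
  int length() const { return _len; }
  E& at(int i) { assert(0 <= i && i < _len); return _data[i]; }
};

// Hypothetical stand-in for the derived class template that now owns
// the size-reducing operations.
template <typename E>
class WithAllocatorSketch : public ViewSketch<E> {
public:
  WithAllocatorSketch(E* data, int len) : ViewSketch<E>(data, len) {}

  // Inside the base these bodies could use plain `_len` and `_data`.
  // Here the base is a dependent type, so unqualified lookup does not
  // search it; `this->` defers the lookup to instantiation time.
  E pop() {
    assert(this->_len > 0);
    return this->_data[--this->_len];
  }

  void trunc_to(int length) {
    assert(0 <= length && length <= this->_len);
    this->_len = length;
  }

  void clear() { this->_len = 0; }
};
```

Writing `_len` instead of `this->_len` in `pop()` would be rejected by a conforming compiler, which is presumably the kind of adjustment the moved functions needed.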
------------- Commit messages: - move pop to GAWA, fixing gtest - move remove_at to GAWA - move remove_if_existing to GAWA - move remove to GAWA - move remove_till/remove_range to GAWA - move delete_at to GAWA - move trunc_to to GAWA - move clear() to GAWA Changes: https://git.openjdk.org/jdk/pull/26726/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26726&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8365245 Stats: 134 lines in 2 files changed: 69 ins; 64 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26726.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26726/head:pull/26726 PR: https://git.openjdk.org/jdk/pull/26726 From kbarrett at openjdk.org Mon Aug 11 13:14:03 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 11 Aug 2025 13:14:03 GMT Subject: RFR: 8365245: Move size reducing operations to GrowableArrayWithAllocator In-Reply-To: References: Message-ID: <-i9UDvOkeQCBRT-__2W7gERYOXOtJ3H4njnIgSmCN2k=.faa7a8bf-45e9-4efe-90e8-1fa3d3fa37b8@github.com> On Mon, 11 Aug 2025 13:06:01 GMT, Kim Barrett wrote: > Please review this change to GrowableArray to move size reducing operations > from GrowableArrayBase and GrowableArrayView to GrowableArrayWithAllocator. > See JBS for rationale for this move. > > This is mostly just a simple code motion change, with the moved functions > being updated to account for different name lookup rules now that they are in > a class with a class template base class. For the most part this refactoring > didn't require any client code changes. All but one use of one moved function > was with a GrowableArrayWithAllocator-derived class, so not affected by moving > the functions. The one exception was a test of ZArraySlice that was modified > to no longer use GA::pop(). > > Testing: mach5 tier1. I didn't make any code changes to address some issues noticed while doing this refactoring. Some things I noticed and will file JBS issues for were: - GrowableArray::remove_range doesn't allow an empty range. 
Wondering why, as that's potentially inconvenient and contrary to usual practice. - GrowableArray::appendAll effectively does a bunch of push_backs, which can lead to multiple reallocations. It should expand capacity once up front, based on the size of the array being appended. - GrowableArray::remove_if_existing searches for the item twice, once to detect its presence (using `contains`), and again as part of `remove`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26726#issuecomment-3174744007 From vyazici at openjdk.org Mon Aug 11 13:47:41 2025 From: vyazici at openjdk.org (Volkan Yazici) Date: Mon, 11 Aug 2025 13:47:41 GMT Subject: RFR: 8361842: Move input validation checks to Java for java.lang.StringCoding intrinsics [v14] In-Reply-To: <-o9qy_Jsa4tvizzy8WgngfPuz9tx8Cwcztzl9AXiB70=.9ddb14d6-ee47-448d-a5a5-9c9ac4db3e0c@github.com> References: <-o9qy_Jsa4tvizzy8WgngfPuz9tx8Cwcztzl9AXiB70=.9ddb14d6-ee47-448d-a5a5-9c9ac4db3e0c@github.com> Message-ID: > Validate input in `java.lang.StringCoding` intrinsic Java wrappers, improve their documentation, enhance the checks in the associated IR or assembly code, and adapt them to cause VM crash on invalid input. > > ## Implementation notes > > The goal of the associated umbrella issue [JDK-8156534](https://bugs.openjdk.org/browse/JDK-8156534) is to, for `java.lang.String*` classes, > > 1. Move `@IntrinsicCandidate`-annotated `public` methods¹ (in Java code) to `private` ones, and wrap them with a `public` ["front door" method](https://github.com/openjdk/jdk/pull/24982#discussion_r2087493446) > 2. Since we moved the `@IntrinsicCandidate` annotation to a new method, intrinsic mappings (i.e., associated `do_intrinsic()` calls in `vmIntrinsics.hpp`) need to be updated too > 3. Add necessary input validation (range, null, etc.) checks to the newly created public front door method > 4. Place all input validation checks in the intrinsic code (add if missing!)
behind a `VerifyIntrinsicChecks` VM flag > > Following preliminary work needs to be carried out as well: > > 1. Add a new `VerifyIntrinsicChecks` VM flag > 2. Update `generate_string_range_check` to produce a `HaltNode`. That is, crash the VM if `VerifyIntrinsicChecks` is set and a Java wrapper fails to spot an invalid input. > > 1 `@IntrinsicCandidate`-annotated constructors are not subject to this change, since they are a special case. > > ## Functional and performance tests > > - `tier1` (which includes `test/hotspot/jtreg/compiler/intrinsics/string`) passes on several platforms. Further tiers will be executed after integrating reviewer feedback. > > - Performance impact is still actively monitored using `test/micro/org/openjdk/bench/java/lang/String{En,De}code.java`, among other tests. If you have suggestions on benchmarks, please share in the comments. > > ## Verification of the VM crash > > I've tested the VM crash scenario as follows: > > 1. Created the following test program: > > public class StrIntri { > public static void main(String[] args) { > Exception lastException = null; > for (int i = 0; i < 1_000_000; i++) { > try { > jdk.internal.access.SharedSecrets.getJavaLangAccess().countPositives(new byte[]{1,2,3}, 2, 5); > } catch (Exception exception) { > lastException = exception; > } > } > if (lastException != null) { > lastException.printStackTrace(); > } else { > System.out.println("completed"); > } > } > } > ... 
Volkan Yazici has updated the pull request incrementally with two additional commits since the last revision: - Rename `generate_string_range_check::halt` to `halt_on_oob` - Move `->asRegion()` after `if (halt)` ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25998/files - new: https://git.openjdk.org/jdk/pull/25998/files/c322f0e0..9b721bb9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25998&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25998&range=12-13 Stats: 5 lines in 2 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/25998.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25998/head:pull/25998 PR: https://git.openjdk.org/jdk/pull/25998 From vyazici at openjdk.org Mon Aug 11 13:47:44 2025 From: vyazici at openjdk.org (Volkan Yazici) Date: Mon, 11 Aug 2025 13:47:44 GMT Subject: RFR: 8361842: Move input validation checks to Java for java.lang.StringCoding intrinsics [v13] In-Reply-To: References: <-o9qy_Jsa4tvizzy8WgngfPuz9tx8Cwcztzl9AXiB70=.9ddb14d6-ee47-448d-a5a5-9c9ac4db3e0c@github.com> Message-ID: On Mon, 11 Aug 2025 12:43:38 GMT, Claes Redestad wrote: >> Volkan Yazici has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 31 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream/master' into strIntrinCheck >> - Replace `requireNonNull` with implicit null checks to reduce bytecode size >> - Add `@bug` tags >> - Improve wording of `@param len` >> - Make source array bound checks lenient too >> - Cap destination array bounds >> - Fix bit shifting >> - Remove superseded `@throws` Javadoc >> - Merge remote-tracking branch 'upstream/master' into strIntrinCheck >> - Make `StringCoding` encoding intrinsics lenient >> - ... 
and 21 more: https://git.openjdk.org/jdk/compare/357546c1...c322f0e0 > > src/hotspot/share/opto/library_call.cpp line 946: > >> 944: Node* count, >> 945: bool char_count, >> 946: bool halt) { > > Could we rename this to something more descriptive such as `halt_on_oob`? Changed as requested in 9b721bb9fee. > src/hotspot/share/opto/library_call.cpp line 964: > >> 962: >> 963: if (bailout->req() > 1) { >> 964: bailout = _gvn.transform(bailout)->as_Region(); > > Should this be done only for the case where it's needed, i.e., in the `if (halt)` branch? Not sure it has any side-effect but it seems prudent not to change the default code more than necessary. Changed as requested in 4d2a7a39c50. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25998#discussion_r2266824420 PR Review Comment: https://git.openjdk.org/jdk/pull/25998#discussion_r2266823319 From vyazici at openjdk.org Mon Aug 11 13:56:18 2025 From: vyazici at openjdk.org (Volkan Yazici) Date: Mon, 11 Aug 2025 13:56:18 GMT Subject: RFR: 8361842: Move input validation checks to Java for java.lang.StringCoding intrinsics [v13] In-Reply-To: References: <-o9qy_Jsa4tvizzy8WgngfPuz9tx8Cwcztzl9AXiB70=.9ddb14d6-ee47-448d-a5a5-9c9ac4db3e0c@github.com> Message-ID: On Wed, 6 Aug 2025 05:24:41 GMT, Shaojin Wen wrote: >> Volkan Yazici has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 31 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream/master' into strIntrinCheck >> - Replace `requireNonNull` with implicit null checks to reduce bytecode size >> - Add `@bug` tags >> - Improve wording of `@param len` >> - Make source array bound checks lenient too >> - Cap destination array bounds >> - Fix bit shifting >> - Remove superseded `@throws` Javadoc >> - Merge remote-tracking branch 'upstream/master' into strIntrinCheck >> - Make `StringCoding` encoding intrinsics lenient >> - ... and 21 more: https://git.openjdk.org/jdk/compare/565028e3...c322f0e0 > > src/java.base/share/classes/java/lang/StringCoding.java line 99: > >> 97: * {@linkplain Preconditions#checkFromIndexSize(int, int, int, BiFunction) out of bounds} >> 98: */ >> 99: static int countPositives(byte[] ba, int off, int len) { > > If we name countPositives with parameter checking as countPositivesSB, this PR will have fewer changes. I presume you mean we would not need to touch `vmIntrinsics.hpp` and such. I discussed this with @cl4es, and we decided to keep the _"`foo` for method, and `foo0` for intrinsic candidate"_ convention, since this matches the existing one. Unless more experienced maintainers tell me to do otherwise, I will stick to the current style. @wenshao, nevertheless, thanks so much for your kind review(s). Please keep them coming. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25998#discussion_r2266851757 From duke at openjdk.org Mon Aug 11 14:53:17 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Mon, 11 Aug 2025 14:53:17 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 22:11:32 GMT, Francesco Andreuzzi wrote: >> In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago.
I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency.
>>
>> These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds.
>>
>>
>> linux-x64 GCC
>> master real 81.39 user 3352.15 sys 287.49
>> JDK-8365053 real 81.94 user 3030.24 sys 295.82
>>
>> linux-x64 Clang
>> master real 43.44 user 2082.93 sys 130.70
>> JDK-8365053 real 38.44 user 1723.80 sys 117.68
>>
>> linux-aarch64 GCC
>> master real 1188.08 user 2015.22 sys 175.53
>> JDK-8365053 real 1019.85 user 1667.45 sys 171.86
>>
>> linux-aarch64 clang
>> master real 981.77 user 1645.05 sys 118.60
>> JDK-8365053 real 791.96 user 1262.92 sys 101.50
>
> Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision:
>
> - magic number: 400
> - inline

Some results:

| Features | Time1 (s) | Time2 (s) | Change (s) |
|---------------------------------|----------|----------|-------------|
| epsilongc | 490.77 | 489.50 | 1.27 |
| epsilongc cds | 546.27 | 537.91 | 8.36 |
| epsilongc compiler1 | 547.87 | 539.86 | 8.01 |
| epsilongc compiler2 | 785.18 | 735.13 | 50.05 |
| g1gc | 682.44 | 666.40 | 16.04 |
| epsilongc services | 510.22 | 504.68 | 5.54 |
| epsilongc services jfr | 675.49 | 636.87 | 38.62 |
| epsilongc jni-check | 499.53 | 493.72 | 5.81 |
| epsilongc compiler1 jvmci | 589.74 | 591.32 | -1.58 |
| epsilongc compiler2 jvmci | 852.46 | 806.35 | 46.11 |
| epsilongc services jvmti | 567.67 | 561.73 | 5.94 |
| epsilongc link-time-opt | 535.96 | 526.00 | 9.97 |
| epsilongc management | 508.63 | 504.53 | 4.10 |
| epsilongc opt-size | 507.03 | 498.58 | 8.45 |
| parallelgc | 525.78 | 517.38 | 8.40 |
| serialgc | 512.59 | 511.88 | 0.71 |
| shenandoahgc | 739.44 | 722.72 | 16.72 |
| zgc | 681.20 | 672.28 | 8.92 |

- `Time1` is the time (sys+user) it takes to complete `make -h hotspot` with the
precompiled headers in be25d3413b432b56e9789eae55920f1862008911 (>=400 includes in a `custom` build)
- `Time2`: same as `Time1`, with the precompiled headers having >=400 includes in a `custom` build with all features in the first column

So, the `Change` column measures how much we improve by taking new precompiled headers due to a specific feature. I'd say the following are worth keeping behind an include guard:
- `cds`
- `compiler1`
- `compiler2`
- `g1gc`
- `services`
- `jfr` ==> `services` (#26723)
- `jni-check`
- `link-time-opt`
- `opt-size`
- `parallelgc`
- `shenandoahgc`
- `zgc`

------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3175208900 From ihse at openjdk.org Mon Aug 11 15:52:21 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 11 Aug 2025 15:52:21 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: Message-ID: On Mon, 11 Aug 2025 14:49:54 GMT, Francesco Andreuzzi wrote: > * `link-time-opt` > > * `opt-size` I thought these should not really affect the set of files compiled (or included, which is what matters here)? ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3175481286 From jsjolen at openjdk.org Mon Aug 11 16:04:00 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 11 Aug 2025 16:04:00 GMT Subject: RFR: 8365264: Rename ResourceHashtable to HashTable Message-ID: Hi, This PR renames `ResourceHashtable` to `HashTable`. The "resource" part of the name isn't true anymore; it does many types of allocations. The RHT should be HT to signify that it's the default choice of hashtable in Hotspot. I didn't change any copyright years for this change.
Thanks ------------- Commit messages: - Rename file - Rename ResourceHashtable to HashTable Changes: https://git.openjdk.org/jdk/pull/26729/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26729&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8365264 Stats: 1841 lines in 67 files changed: 856 ins; 856 del; 129 mod Patch: https://git.openjdk.org/jdk/pull/26729.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26729/head:pull/26729 PR: https://git.openjdk.org/jdk/pull/26729 From duke at openjdk.org Mon Aug 11 16:27:15 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Mon, 11 Aug 2025 16:27:15 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: Message-ID: <7_2SHq1iU6SEIPWDiTaNhW6G0ChhcnsXtrVxMUgB0ts=.a407d743-9d86-4ab3-b905-3ea8cf8069f3@github.com> On Mon, 11 Aug 2025 15:49:55 GMT, Magnus Ihse Bursie wrote:

>> Some results:
>>
>> | Features | Time1 (s) | Time2 (s) | Change (s) |
>> |---------------------------------|----------|----------|-------------|
>> | epsilongc | 490.77 | 489.50 | 1.27 |
>> | epsilongc cds | 546.27 | 537.91 | 8.36 |
>> | epsilongc compiler1 | 547.87 | 539.86 | 8.01 |
>> | epsilongc compiler2 | 785.18 | 735.13 | 50.05 |
>> | g1gc | 682.44 | 666.40 | 16.04 |
>> | epsilongc services | 510.22 | 504.68 | 5.54 |
>> | epsilongc services jfr | 675.49 | 636.87 | 38.62 |
>> | epsilongc jni-check | 499.53 | 493.72 | 5.81 |
>> | epsilongc compiler1 jvmci | 589.74 | 591.32 | -1.58 |
>> | epsilongc compiler2 jvmci | 852.46 | 806.35 | 46.11 |
>> | epsilongc services jvmti | 567.67 | 561.73 | 5.94 |
>> | epsilongc link-time-opt | 535.96 | 526.00 | 9.97 |
>> | epsilongc management | 508.63 | 504.53 | 4.10 |
>> | epsilongc opt-size | 507.03 | 498.58 | 8.45 |
>> | parallelgc | 525.78 | 517.38 | 8.40 |
>> | serialgc | 512.59 | 511.88 | 0.71 |
>> | shenandoahgc | 739.44 | 722.72 | 16.72 |
>> | zgc | 681.20 | 672.28 | 8.92 |
>>
>> - `Time1` is the time
(sys+user) it takes to complete `make -h hotspot` with the precompiled headers in be25d3413b432b56e9789eae55920f1862008911 (>=400 includes in a `custom` build) >> - `Time2`: same as `Time1`, with the precompiled headers having >=400 includes in a `custom` build with all features in the first column >> >> So, the third column measures how much we improve by taking new precompiled headers due to a specific features. I'd say the following are worth keeping behind an include guard: >> - `cds` >> - `compiler1` >> - `compiler2` >> - `g1gc` >> - `services` >> - `jfr` ==> `services` (#26723) >> - `jni-check` >> - `link-time-opt` >> - `opt-size` >> - `parallelgc` >> - `shenandoahgc` >> - `zgc` > >> * `link-time-opt` >> >> * `opt-size` > > I thought these should not really affect the set of files compiled (or included, which is what matters here)? That's right @magicus, I reran the script and got only a `1.86s` improvement for `link-time-opt`. I checked the includes in `precompiled.hpp` and indeed nothing changes when the analyzer runs on a build with `link-time-opt`. So, exactly the same build, just different build times. That's most likely noise. I'd say the criteria we could use are: - Improvement > 10s (unlikely to be noise, but still possible) - the set of new headers added to `precompiled.hpp` is not empty ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3175758710 From aph at openjdk.org Mon Aug 11 17:06:22 2025 From: aph at openjdk.org (Andrew Haley) Date: Mon, 11 Aug 2025 17:06:22 GMT Subject: RFR: 8328306: AArch64: MacOS lazy JIT "write xor execute" switching Message-ID: <3IdZZGAKHVuMXfeM10Z-VSDNlJmcu5XFilLQfEKb9OY=.5213f5ca-2bca-41eb-b7ce-7621510552be@github.com> In MacOS/AArch64 HotSpot, we have to deal with the fact that a thread must be in one of two modes: it either may write to code cache memory or it may execute (and read) code or data in it. A system call `pthread_jit_write_protect_np(int enabled)` changes from one to the other.
Today, we change mode whenever making a transition from interpreter to VM. This means that we change mode a lot: experiments have shown that during `jshell` startup we change mode 4 million times. Other experiments have shown that we only needed to change mode 45 thousand times.

This "eager" mode switching is perhaps too eager, and we'd be better off switching lazily. While the system call that changes mode is very fast, mode switching still amounts to about 100ms of startup time. Switching eagerly also means that some native calls (e.g. to do arithmetic) are disproportionately expensive, given that they have no need of mode switching at all.

The approach in this PR is to defer transitioning from exec-but-don't-write mode (`WXExec`) to write-but-don't-exec mode (`WXWrite`) until we need to write. Instead of enabling `WXWrite` immediately, we switch to a mode called `WXArmedForWrite`. In this mode, when we need to write into code memory, we call `os_bsd_jit_exec_enabled(false)` to enable writing and then set the current mode to `WXWrite`.

We mark all sites that we know will write to code memory with `MACOS_AARCH64_ONLY(os::thread_wx_enable_write());` Judicious placement of these markers, such as when entering patching code, means that we have a fairly small number of these.

We also keep track (in thread-local storage) of the current state of `pthread_jit_write_protect_np` in order to avoid making the system call unnecessarily.

It is possible that we have missed some sites where we do need to make a transition from write-protected to write-enabled. While we haven't seen any in testing, we have a fallback path. An attempt to write into code memory triggers a `SIGILL` signal. A signal handler detects this, and if the current mode is `WXArmedForWrite`, it changes mode to write-enabled and returns.
In addition, the handler "heals" the VM entry point so that next time the same point is entered (and for the rest of the lifetime of the VM) it will immediately transition to `WXWrite`. One other possibility remains: we could omit all of the `wx_enable_write` markers and use healing instead. We've experimented with this. It works well enough, but is rather crude, and it's better to be able to see where transitions take place. ------------- Commit messages: - Cleanup - More - More - More - Remove debug code - More - More - More - Update src/hotspot/share/interpreter/interpreterRuntime.cpp - Update src/hotspot/share/runtime/javaThread.cpp - ... and 1 more: https://git.openjdk.org/jdk/compare/c239c0ab...b9709363 Changes: https://git.openjdk.org/jdk/pull/26562/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26562&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328306 Stats: 267 lines in 32 files changed: 209 ins; 19 del; 39 mod Patch: https://git.openjdk.org/jdk/pull/26562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26562/head:pull/26562 PR: https://git.openjdk.org/jdk/pull/26562 From aph at openjdk.org Mon Aug 11 17:06:23 2025 From: aph at openjdk.org (Andrew Haley) Date: Mon, 11 Aug 2025 17:06:23 GMT Subject: RFR: 8328306: AArch64: MacOS lazy JIT "write xor execute" switching In-Reply-To: <3IdZZGAKHVuMXfeM10Z-VSDNlJmcu5XFilLQfEKb9OY=.5213f5ca-2bca-41eb-b7ce-7621510552be@github.com> References: <3IdZZGAKHVuMXfeM10Z-VSDNlJmcu5XFilLQfEKb9OY=.5213f5ca-2bca-41eb-b7ce-7621510552be@github.com> Message-ID: On Wed, 30 Jul 2025 17:34:48 GMT, Andrew Haley wrote: > In MacOS/AArch64 HotSpot, we have to deal with the fact that a thread must be in one of two modes: it either may write to code cache memory or it may execute (and read) code or data in it. A system call `pthread_jit_write_protect_np(int enabled)` changes from one to the other. > > Today, we change mode whenever making a transition from interpreter to VM. 
This means that we change mode a lot: experiments have shown that during `jshell` startup we change mode 4 million times. Other experiments have shown that we only needed to change mode 45 thousand times. > > This "eager" mode switching is perhaps too eager, and we'd be better off switching lazily. While the system call that changes mode is very fast, mode switching still amounts to about 100ms of startup time. Switching eagerly also means that some native calls (e.g. to do arithmetic) are disproportionately expensive, given that they have no need of mode switching at all. > > The approach in this PR is to defer transitioning from exec-but-don't-write mode (`WXExec`) to write-but-don't-exec mode (`WXWrite`) until we need to write. Instead of enabling `WXWrite` immediately, we switch to a mode called `WXArmedForWrite`. In this mode, when we need to write into code memory, we call `os_bsd_jit_exec_enabled(false)` to enable writing and then set the current mode to `WXWrite`. > > We mark all sites that we know will write to code memory with > `MACOS_AARCH64_ONLY(os::thread_wx_enable_write());`. Judicious placement of these markers, such as when entering patching code, means that we have a fairly small number of these. > > We also keep track (in thread-local storage) of the current state of `pthread_jit_write_protect_np` in order to avoid making the system call unnecessarily. > > It is possible that we have missed some sites where we do need to make a transition from write-protected to write-enabled. While we haven't seen any in testing, we have a fallback path. An attempt to write into code memory triggers a `SIGILL` signal. A signal handler detects this, and if the current mode is `WXArmedForWrite`, it changes mode to write-enabled and returns. In addition, the handler "heals" the VM entry point so that next time the same point is entered (and for the rest of the lifetime of the VM) it will immediately transition to `WXWrite`.
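The lazy switching described above can be modelled as a small state machine. The following C++ is only a sketch under stated assumptions, not the HotSpot implementation: `ThreadWXState`, `VMEntryPoint`, `enter_vm`, `leave_vm` and `handle_write_fault` are hypothetical names, the real system call is replaced by a counter, and the behaviour on leaving the VM (always returning to exec mode) is assumed rather than taken from the PR.

```cpp
#include <cassert>

// Mode names follow the PR text; everything else here is hypothetical.
enum class WXMode { Exec, ArmedForWrite, Write };

// Hypothetical per-entry-point flag set by the fallback handler ("healing").
struct VMEntryPoint {
  bool healed = false;
};

struct ThreadWXState {
  WXMode mode = WXMode::Exec;
  bool write_enabled = false;  // cached protection state (thread-local in the PR)
  int syscalls = 0;            // counts simulated mode-changing system calls

  // Stand-in for the system call; skipped when the cached state already matches.
  void set_write_enabled(bool enable) {
    if (write_enabled == enable) return;
    write_enabled = enable;
    ++syscalls;
  }

  // Interpreter -> VM: arm only, unless this entry point was healed earlier.
  void enter_vm(const VMEntryPoint& entry) {
    if (entry.healed) {
      set_write_enabled(true);
      mode = WXMode::Write;
    } else {
      mode = WXMode::ArmedForWrite;
    }
  }

  // Marker placed at sites known to write to code memory.
  void enable_write() {
    if (mode == WXMode::ArmedForWrite) {
      set_write_enabled(true);
      mode = WXMode::Write;
    }
  }

  // Fallback for a missed site: returns true if the fault was handled.
  bool handle_write_fault(VMEntryPoint& entry) {
    if (mode != WXMode::ArmedForWrite) return false;
    entry.healed = true;  // next entry through this point switches immediately
    set_write_enabled(true);
    mode = WXMode::Write;
    return true;
  }

  // VM -> interpreter (assumed): code must be executable again.
  void leave_vm() {
    set_write_enabled(false);
    mode = WXMode::Exec;
  }
};
```

In this model, VM entries that never write cost no simulated system calls, while an entry that does write costs two (enable on the first write, disable on return), which is the saving the lazy scheme is after.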
> > One other possibility remains: we could omit all of the `wx_enable_write` markers and use healing instead. We've experimented with this. It works well enough, but is rather crude, and it's better to be able to ... > ### Progress > > * [ ] Change must be properly reviewed (1 review required, with at least 1 [Reviewer](https://openjdk.org/bylaws#reviewer)) > > * [x] Change must not contain extraneous whitespace > > * [x] Commit message must refer to an issue > > > ### Error > > The pull request body must not be empty. > ### Integration blocker > > Title mismatch between PR and JBS for issue [JDK-8328306](https://bugs.openjdk.org/browse/JDK-8328306) > ### Issue > > * [JDK-8328306](https://bugs.openjdk.org/browse/JDK-8328306): Allow VM to run with X memory execute by default (**Enhancement** - P4) Title mismatch between PR and JBS. > > > ### Reviewing > Using `git` > > Using Skara CLI tools > > Using diff file src/hotspot/share/interpreter/interpreterRuntime.cpp line 392: > 390: > 391: > 392: Suggestion: src/hotspot/share/runtime/javaThread.cpp line 526: > 524: > 525: _lock_stack(this), > 526: _om_cache(this) { Suggestion: _om_cache(this) { ------------- PR Comment: https://git.openjdk.org/jdk/pull/26562#issuecomment-3175557349 PR Review Comment: https://git.openjdk.org/jdk/pull/26562#discussion_r2243461476 PR Review Comment: https://git.openjdk.org/jdk/pull/26562#discussion_r2243453769 From jsjolen at openjdk.org Mon Aug 11 17:32:11 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 11 Aug 2025 17:32:11 GMT Subject: RFR: 8365264: Rename ResourceHashtable to HashTable In-Reply-To: References: Message-ID: On Mon, 11 Aug 2025 15:57:41 GMT, Johan Sjölen wrote: > Hi, > > This PR renames `ResourceHashtable` to `HashTable`. The "resource" part of the name isn't true anymore, it does many types of allocations. The RHT should be HT to signify that it's the default choice of hashtable in Hotspot. > > I didn't change any copyright years for this change. > > Thanks Aha, of course I forgot about re-sorting the headers. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26729#issuecomment-3176028919 From iris at openjdk.org Mon Aug 11 17:38:12 2025 From: iris at openjdk.org (Iris Clark) Date: Mon, 11 Aug 2025 17:38:12 GMT Subject: RFR: 8361950: Update to use jtreg 8 [v2] In-Reply-To: References: Message-ID: <3cVhuiuq1cW-XgONMnkCje3LI6yOMIxTmmiH5AStlTY=.e6a50bf3-0a7e-437b-bd66-3f6b59353b74@github.com> On Mon, 11 Aug 2025 11:35:12 GMT, Christian Stein wrote: >> Please review the change to update to using jtreg 8. >> >> The primary change is to the `jib-profiles.js` file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the requiredVersion has been updated in the various `TEST.ROOT` files.
> > Christian Stein has updated the pull request incrementally with one additional commit since the last revision: > > Update to use correct library directories > > See also: https://openjdk.org/jtreg/faq.html#how-do-i-find-the-path-for-the-testng-or-junit-jar-files Marked as reviewed by iris (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26261#pullrequestreview-3106980594 From jbhateja at openjdk.org Mon Aug 11 18:02:34 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 11 Aug 2025 18:02:34 GMT Subject: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v4] In-Reply-To: References: Message-ID: <4omqHrPtNFE0UWmulPymwsUHXRpd9EBhgJvOpRyXxJQ=.dacad6cd-5a3e-4671-9543-98f04e1b7e73@github.com> > The patch optimizes the Vector.slice operation with a constant index using the x86 ALIGNR instruction. > It also adds a new hybrid call generator to facilitate lazy intrinsification or else perform procedural inlining to prevent call overhead and boxing penalties in case the fallback implementation expects to operate over vectors. The existing vector API-based slice implementation is now the fallback code that gets inlined in case intrinsification fails. > > The idea here is to add infrastructure support to enable intrinsification of the fast path for selected vector APIs, or else enable inlining of the fall-back implementation if it's based on vector APIs. Existing call generators like PredictedCallGenerator, used to handle bi-morphic inlining, already make use of multiple call generators to handle hit/miss scenarios for a particular receiver type. The newly added hybrid call generator is lazy and called during incremental inlining optimization. It also relieves the inline expander of handling slow paths, which can easily be implemented library side (Java). > > Vector API jtreg tests pass at AVX level 2, remaining validation in progress.
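For readers unfamiliar with the operation being optimized, the lane semantics of a two-vector slice can be written as a scalar model: the result takes lanes from the first vector starting at `origin` and continues into the second vector. The C++ below is a minimal sketch of that semantics only; `slice_model` is a hypothetical name, and the code is neither the Java fallback nor the intrinsic from the PR.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar model of slicing N lanes out of the concatenation [a, b],
// starting at lane `origin` of `a` (0 <= origin <= N).
template <typename T, std::size_t N>
std::array<T, N> slice_model(const std::array<T, N>& a,
                             const std::array<T, N>& b,
                             std::size_t origin) {
  assert(origin <= N);
  std::array<T, N> result{};
  for (std::size_t i = 0; i < N; ++i) {
    std::size_t src = i + origin;
    // Lanes past the end of `a` are taken from the start of `b`.
    result[i] = (src < N) ? a[src] : b[src - N];
  }
  return result;
}
```

When `origin` is a compile-time constant, the selection of source lanes is fixed, which is what makes a single byte-alignment instruction a candidate for the fast path.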
>
> Performance numbers:
>
>
> System : 13th Gen Intel(R) Core(TM) i3-1315U
>
> Baseline:
> Benchmark                                                (size)   Mode  Cnt      Score   Error   Units
> VectorSliceBenchmark.byteVectorSliceWithConstantIndex1     1024  thrpt    2   9444.444          ops/ms
> VectorSliceBenchmark.byteVectorSliceWithConstantIndex2     1024  thrpt    2  10009.319          ops/ms
> VectorSliceBenchmark.byteVectorSliceWithVariableIndex      1024  thrpt    2   9081.926          ops/ms
> VectorSliceBenchmark.intVectorSliceWithConstantIndex1      1024  thrpt    2   6085.825          ops/ms
> VectorSliceBenchmark.intVectorSliceWithConstantIndex2      1024  thrpt    2   6505.378          ops/ms
> VectorSliceBenchmark.intVectorSliceWithVariableIndex       1024  thrpt    2   6204.489          ops/ms
> VectorSliceBenchmark.longVectorSliceWithConstantIndex1     1024  thrpt    2   1651.334          ops/ms
> VectorSliceBenchmark.longVectorSliceWithConstantIndex2     1024  thrpt    2   1642.784          ops/ms
> VectorSliceBenchmark.longVectorSliceWithVariableIndex      1024  thrpt    2   1474.808          ops/ms
> VectorSliceBenchmark.shortVectorSliceWithConstantIndex1    1024  thrpt    2  10399.394          ops/ms
> VectorSliceBenchmark.shortVectorSliceWithConstantIndex2    1024  thrpt    2  10502.894          ops/ms
> VectorSliceBenchmark.shortVectorSliceWithVariableIndex     1024 ...
Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review comments resolutions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24104/files - new: https://git.openjdk.org/jdk/pull/24104/files/e7c7374b..405de56f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24104&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24104&range=02-03 Stats: 389 lines in 9 files changed: 373 ins; 2 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/24104.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24104/head:pull/24104 PR: https://git.openjdk.org/jdk/pull/24104 From duke at openjdk.org Mon Aug 11 18:32:17 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Mon, 11 Aug 2025 18:32:17 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 22:11:32 GMT, Francesco Andreuzzi wrote: >> In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. >> >> These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. 
>> >> >> linux-x64 GCC >> master real 81.39 user 3352.15 sys 287.49 >> JDK-8365053 real 81.94 user 3030.24 sys 295.82 >> >> linux-x64 Clang >> master real 43.44 user 2082.93 sys 130.70 >> JDK-8365053 real 38.44 user 1723.80 sys 117.68 >> >> linux-aarch64 GCC >> master real 1188.08 user 2015.22 sys 175.53 >> JDK-8365053 real 1019.85 user 1667.45 sys 171.86 >> >> linux-aarch64 clang >> master real 981.77 user 1645.05 sys 118.60 >> JDK-8365053 real 791.96 user 1262.92 sys 101.50 > > Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: > > - magic number: 400 > - inline Just found a bug in the script, I'm rerunning all the combinations. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3176301836 From dlong at openjdk.org Mon Aug 11 20:52:13 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 11 Aug 2025 20:52:13 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 In-Reply-To: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Thu, 7 Aug 2025 15:49:20 GMT, Ruben wrote: > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. > > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. Re: CodeOffsets::DeoptMH, what I meant is that it is just an exact copy of the CodeOffsets::Deopt stub code. Apparently at some point in the JSR 292 evolution, we needed to distinguish between the two, but it looks to me that we no longer need to do that. 
So we should be able to get rid of DeoptMH and related code, like nmethod::is_deopt_mh_entry(). ------------- PR Comment: https://git.openjdk.org/jdk/pull/26678#issuecomment-3176849923 From duke at openjdk.org Mon Aug 11 21:37:14 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Mon, 11 Aug 2025 21:37:14 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 22:11:32 GMT, Francesco Andreuzzi wrote: >> In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. >> >> These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. >> >> >> linux-x64 GCC >> master real 81.39 user 3352.15 sys 287.49 >> JDK-8365053 real 81.94 user 3030.24 sys 295.82 >> >> linux-x64 Clang >> master real 43.44 user 2082.93 sys 130.70 >> JDK-8365053 real 38.44 user 1723.80 sys 117.68 >> >> linux-aarch64 GCC >> master real 1188.08 user 2015.22 sys 175.53 >> JDK-8365053 real 1019.85 user 1667.45 sys 171.86 >> >> linux-aarch64 clang >> master real 981.77 user 1645.05 sys 118.60 >> JDK-8365053 real 791.96 user 1262.92 sys 101.50 > > Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: > > - magic number: 400 > - inline

| Feature | Baseline `precompiled.hpp` (Mean ±) | Updated `precompiled.hpp` (Mean ±) |
|-----------------------|---------------------|----------------------|
| epsilongc | 505.78 ± 9.08 | 499.36 ± 3.21 |
| epsilongc cds | 523.93 ± 9.42 | 510.68 ± 0.99 |
| epsilongc compiler1 | 514.89 ± 2.28 | 518.67 ± 14.22 |
| epsilongc compiler2 | 758.34 ± 5.56 | 736.65 ± 32.95 |
| g1gc | 674.96 ± 6.37 | 693.39 ± 23.25 |
| epsilongc services jfr | 707.45 ± 2.08 | 670.55 ± 1.50 |
| epsilongc jni-check | 522.61 ± 1.82 | 517.46 ± 1.67 |
| epsilongc compiler1 jvmci | 604.92 ± 4.23 | 600.61 ± 8.79 |
| epsilongc compiler2 jvmci | 836.85 ± 7.40 | 792.07 ± 16.18 |
| epsilongc services jvmti | 549.48 ± 6.87 | 553.74 ± 7.67 |
| epsilongc management | 486.56 ± 5.68 | 480.58 ± 4.59 |
| epsilongc opt-size | 476.57 ± 1.87 | 476.71 ± 1.23 |
| parallelgc | 497.28 ± 3.90 | 493.68 ± 2.19 |
| serialgc | 477.84 ± 0.86 | 476.14 ± 1.32 |
| epsilongc services | 476.99 ± 2.44 | 478.74 ± 3.47 |
| shenandoahgc | 676.10 ± 5.35 | 623.43 ± 3.11 |
| zgc | 620.95 ± 1.84 | 621.86 ± 4.35 |

`±`: maximum and minimum measurements

------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3176968277 From ihse at openjdk.org Mon Aug 11 22:38:13 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 11 Aug 2025 22:38:13 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 22:11:32 GMT, Francesco Andreuzzi wrote: >> In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. >> >> These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds.
>> >> >> linux-x64 GCC >> master real 81.39 user 3352.15 sys 287.49 >> JDK-8365053 real 81.94 user 3030.24 sys 295.82 >> >> linux-x64 Clang >> master real 43.44 user 2082.93 sys 130.70 >> JDK-8365053 real 38.44 user 1723.80 sys 117.68 >> >> linux-aarch64 GCC >> master real 1188.08 user 2015.22 sys 175.53 >> JDK-8365053 real 1019.85 user 1667.45 sys 171.86 >> >> linux-aarch64 clang >> master real 981.77 user 1645.05 sys 118.60 >> JDK-8365053 real 791.96 user 1262.92 sys 101.50 > > Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: > > - magic number: 400 > - inline This looks like regressions for c1 and G1..? And/or a lot more variability in build time. Any idea what is causing that? ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3177090836 From duke at openjdk.org Tue Aug 12 00:54:14 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Tue, 12 Aug 2025 00:54:14 GMT Subject: RFR: 8365053: Refresh hotspot precompiled.hpp with headers based on current frequency [v10] In-Reply-To: References: Message-ID: On Fri, 8 Aug 2025 22:11:32 GMT, Francesco Andreuzzi wrote: >> In this PR I propose to refresh the included headers in hotspot `precompiled.hpp`. The current set of precompiled headers was refreshed in 2018, 7 years ago. I repeated the same operations and measurements after refreshing the set of precompiled headers according to the current usage frequency. >> >> These are the results I observed. Depending on the platform, the improvement is between 10 and 20% in terms of total work (user+sys). The results are in seconds. 
>> >> >> linux-x64 GCC >> master real 81.39 user 3352.15 sys 287.49 >> JDK-8365053 real 81.94 user 3030.24 sys 295.82 >> >> linux-x64 Clang >> master real 43.44 user 2082.93 sys 130.70 >> JDK-8365053 real 38.44 user 1723.80 sys 117.68 >> >> linux-aarch64 GCC >> master real 1188.08 user 2015.22 sys 175.53 >> JDK-8365053 real 1019.85 user 1667.45 sys 171.86 >> >> linux-aarch64 clang >> master real 981.77 user 1645.05 sys 118.60 >> JDK-8365053 real 791.96 user 1262.92 sys 101.50 > > Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: > > - magic number: 400 > - inline I'm starting to think the high variability is a quirk of my environment. I ran the builds in a virtualized machine and I'm probably experiencing erratic behavior under high load. I re-ran each combination fewer times (3, should still be enough to get an idea), and I'm observing more reasonable results:

| Feature | Baseline `precompiled.hpp` | Updated `precompiled.hpp` |
|----------------------------|-----------------|-----------------|
| epsilongc | 458.88 +/- 0.35 | 459.69 +/- 0.34 |
| epsilongc cds | 501.63 +/- 0.35 | 501.58 +/- 0.57 |
| epsilongc compiler1 | 505.66 +/- 0.61 | 505.06 +/- 0.51 |
| epsilongc compiler2 | 725.91 +/- 0.41 | 692.25 +/- 0.40 |
| g1gc | 630.73 +/- 0.02 | 624.24 +/- 0.35 |
| epsilongc services jfr | 625.54 +/- 1.45 | 592.37 +/- 1.00 |
| epsilongc jni-check | 461.19 +/- 0.14 | 461.79 +/- 0.89 |
| epsilongc compiler1 jvmci | 541.31 +/- 0.42 | 540.31 +/- 0.70 |
| epsilongc compiler2 jvmci | 758.13 +/- 0.71 | 724.80 +/- 0.21 |
| epsilongc services jvmti | 499.64 +/- 0.18 | 499.69 +/- 0.52 |
| epsilongc management | 450.82 +/- 0.62 | 451.38 +/- 0.59 |
| epsilongc opt-size | 446.44 +/- 0.56 | 446.16 +/- 0.12 |
| parallelgc | 467.23 +/- 0.19 | 466.89 +/- 0.41 |
| serialgc | 454.17 +/- 0.40 | 454.22 +/- 0.12 |
| epsilongc services | 453.82 +/- 0.21 | 453.59 +/- 0.51 |
| shenandoahgc | 658.17 +/- 0.50 | 647.44 +/- 0.40 |

[This](https://github.com/fandreuz/jdk-stuff/blob/master/script) is the script I used. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26681#issuecomment-3177333952 From sspitsyn at openjdk.org Tue Aug 12 03:54:12 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Aug 2025 03:54:12 GMT Subject: RFR: 8365192: post_meth_exit should be in vm state when calling get_jvmti_thread_state In-Reply-To: References: Message-ID: On Sat, 9 Aug 2025 20:30:14 GMT, Leonid Mesnik wrote: > The method > get_jvmti_thread_state() > should be called only while the thread is in vm state. > > The post_method_exit is doing some preparation before switching to vm state. This causes issues if thread is needed to initialize jvmti thread state. > > The fix was found using jvmti stress agent and thus no additional regression test is required. Thank you for catching and addressing this! How was the fix tested? It looks okay at a glance but may give surprises. ------------- PR Review: https://git.openjdk.org/jdk/pull/26713#pullrequestreview-3108454887