From duke at openjdk.org Thu May 1 02:05:01 2025 From: duke at openjdk.org (ExE Boss) Date: Thu, 1 May 2025 02:05:01 GMT Subject: RFR: 8354897: Support Soft/Weak Reference in AOT cache [v9] In-Reply-To: References: Message-ID: On Wed, 30 Apr 2025 17:45:04 GMT, Ioi Lam wrote: >> This PR contains 2 parts >> >> - Upstream of Soft/Weak Reference support authored by @macarte from [the Leyden repo](https://github.com/openjdk/leyden/commit/4ca75d156519596e23abc8a312496b7c2f0e0ca5) >> - New C++ class `AOTReferenceObjSupport` and new Java method `ReferencedKeyMap::prepareForAOTCache()` developed by @iklam on the advice of @fisk from the GC team. These control the lifecycles of reference objects during the assembly phase to simplify the implementation. >> >> One problem we faced in this PR is the handling of Reference objects that are waiting for clean up. Currently, the only cached Reference objects that require clean up are the `WeakReferenceKey`s used by `ReferencedKeyMap` (which is used by `MethodType::internTable`): >> >> - When the referent of a `WeakReferenceKey` K has been collected, the key will be placed on `Universe::reference_pending_list()`. It's linked to other pending references with the `Reference::discovered` field. At this point, K is still stored in the `ReferencedKeyMap`. >> - When heapShared.cpp discovered the `ReferencedKeyMap`, it will discover K, and it may also discover other pending references that are not intended for the AOT cache. As a result, we end up caching unnecessary objects. >> >> `ReferencedKeyMap::prepareForAOTCache()` avoids the above problem. It goes over all entries in the table: >> >> - If an entry has not yet been collected, we make sure it will never be collected. >> - If an entry has been collected, we remove it from the table >> >> Therefore, by the time heapShared.cpp starts scanning the `ReferencedKeyMap`, it will never see any keys that are on the pending list, so we will not see unintended objects. 
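The two rules listed above for `ReferencedKeyMap::prepareForAOTCache()` can be illustrated with a minimal Java sketch. This is not the JDK implementation: `WeakTableSketch`, `add`, `prepareForSnapshot`, `size` and `pinned` are hypothetical names, and a plain list of `WeakReference`s stands in for the real map. It only shows the idea of dropping entries whose referent is gone and strongly pinning the rest, so that a later scan never meets a cleared key.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a table keyed by weak references.
final class WeakTableSketch {
    private final List<WeakReference<Object>> entries = new ArrayList<>();
    // Strong references that keep surviving referents alive after preparation.
    private final List<Object> pinned = new ArrayList<>();

    WeakReference<Object> add(Object key) {
        WeakReference<Object> ref = new WeakReference<>(key);
        entries.add(ref);
        return ref;
    }

    // Walk every entry once: remove cleared entries, pin live ones.
    void prepareForSnapshot() {
        Iterator<WeakReference<Object>> it = entries.iterator();
        while (it.hasNext()) {
            Object referent = it.next().get();
            if (referent == null) {
                it.remove();          // referent already collected: drop the entry
            } else {
                pinned.add(referent); // still alive: make sure it stays alive
            }
        }
    }

    int size() {
        return entries.size();
    }

    List<Object> pinned() {
        return pinned;
    }
}
```

In the sketch, calling `WeakReference.clear()` on an entry simulates a collected referent deterministically, which is convenient for trying it out without depending on GC timing.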
>>
>> This implementation is the very first step of Reference support in the AOT cache, so we chose a simplified approach that makes no assumptions about when the pending reference list is processed. This is sufficient for the current set of reference objects in the AOT cache.
>>
>> In the future, we may relax the implementation to allow for other use cases.
>
> Ioi Lam has updated the pull request incrementally with two additional commits since the last revision:
>
>  - @AlanBateman comments
>  - @xmas92 comments

Changes requested by ExE-Boss at github.com (no known OpenJDK username).

src/java.base/share/classes/java/lang/ref/Reference.java line 314:

> 312:     }
> 313:
> 314:     private static void runtimeSetup() {

The comment was added in the wrong spot:
Suggestion:

    static {
        runtimeSetup();
    }

    // Called from JVM when loading an AOT cache
    private static void runtimeSetup() {

-------------

PR Review: https://git.openjdk.org/jdk/pull/24757#pullrequestreview-2809022545
PR Review Comment: https://git.openjdk.org/jdk/pull/24757#discussion_r2069766184

From iklam at openjdk.org Thu May 1 03:48:57 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Thu, 1 May 2025 03:48:57 GMT
Subject: RFR: 8354897: Support Soft/Weak Reference in AOT cache [v9]
In-Reply-To:
References:
Message-ID:

On Thu, 1 May 2025 02:01:36 GMT, ExE Boss wrote:

>> Ioi Lam has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - @AlanBateman comments
>>  - @xmas92 comments
>
> src/java.base/share/classes/java/lang/ref/Reference.java line 314:
>
>> 312:     }
>> 313:
>> 314:     private static void runtimeSetup() {
>
> The comment was added in the wrong spot:
> Suggestion:
>
>     static {
>         runtimeSetup();
>     }
>
>     // Called from JVM when loading an AOT cache
>     private static void runtimeSetup() {

Oops, I will fix it in a related PR, #24979.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24757#discussion_r2069812435

From liach at openjdk.org Thu May 1 04:45:46 2025
From:
liach at openjdk.org (Chen Liang) Date: Thu, 1 May 2025 04:45:46 GMT Subject: RFR: 8354890: AOT-initialize j.l.i.MethodHandleImpl and inner classes In-Reply-To: References: Message-ID: <_uZd7RSmdekF9WeP5qXOOl24Lf00VgithwXEtQcQ9dM=.70173f42-a29e-4a5a-a43f-933b94a8506f@github.com> On Wed, 30 Apr 2025 21:49:12 GMT, Chen Liang wrote: >> This is a general fix for all the "points to a static field that may hold a different value" failures related to `java/lang/invoke/MethodHandleImpl`. E.g., [JDK-8354840](https://bugs.openjdk.org/browse/JDK-8354840), [JDK-8353330](https://bugs.openjdk.org/browse/JDK-8353330). >> >> AOT-cached method handles quite often refer to the static fields in `MethodHandleImpl` or its inner classes. In the production run, if the value of these static field changes, we may have unexpected behavior related to identity of objects in these static fields. `CDSHeapVerifier` makes a very conservative check for such static fields, but sometimes gives false positives (as in the above two JBS issues) >> >> In this PR, we AOT-initialize `MethodHandleImpl` and its inner classes. This is a more authentic snapshot of the state of `java.lang.invoke` during the assembly phase. We also avoid the need to add and maintain entries in the `cdsHeapVerifier.cpp` table. >> >> I also added more code in `MethodHandleTest.java` to simulate potential usage patterns of `MethodHandle` by the Java core libraries. Hopefully this will reduce the likelihood for innocent core lib changes breaking the AOT assembly phase. 
> > src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java line 1533: > >> 1531: } >> 1532: >> 1533: private static void runtimeSetup() { > > Suggestion: > > > // Called from JVM when loading an AOT cache > private static void runtimeSetup() { Same problem in Reference, credit to @exe-boss ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24956#discussion_r2069844711 From shade at openjdk.org Thu May 1 06:15:45 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 1 May 2025 06:15:45 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2] In-Reply-To: References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> Message-ID: On Wed, 30 Apr 2025 20:58:59 GMT, Martin Doerr wrote: >> Thanks for this. I didn't know why Aleksey suggested it. I'll remove it. > > Maybe @shipilev meant `memory_order_release`? Anyway, I guess we don't need to optimize it. I saw no point in enforcing memory ordering mode here, as it looks like we only did `ThreadCritical` for mutual exclusion. Note that we do not have a matching acquire on list traversals, so seqcst/release on list additions would be incomplete. That only reinforces my original thinking: we are riding on memory ordering given by something else, I'd guess the initialization sequence itself. But I won't quibble, it is a very minor optimization. 
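As a rough illustration of the push-only list discussed in this thread, here is a minimal Java sketch. It is an analogue only, not HotSpot's C++ `LockFreeStack`: the class name `PushOnlyList` and its methods are hypothetical, and `AtomicReference` gives volatile (sequentially consistent) semantics on both the CAS and the read of the head, which is stronger than what the C++ code under review relies on. The point it shows is that when nodes are only ever added, a CAS loop on the head is enough for mutual exclusion between writers, and no pop-related hazards such as ABA arise.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical push-only, lock-free singly linked list.
final class PushOnlyList<T> {
    private static final class Node<T> {
        final T value;
        final Node<T> next; // immutable once the node is published
        Node(T value, Node<T> next) {
            this.value = value;
            this.next = next;
        }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    // Link a new node in front of the current head; retry if another thread won the race.
    void push(T value) {
        Node<T> oldHead;
        Node<T> node;
        do {
            oldHead = head.get();
            node = new Node<>(value, oldHead);
        } while (!head.compareAndSet(oldHead, node));
    }

    // Traverse from the most recently pushed element to the oldest.
    void forEach(Consumer<? super T> action) {
        for (Node<T> n = head.get(); n != null; n = n.next) {
            action.accept(n.value);
        }
    }
}
```

Because every `next` field is final and the head is read through the atomic, a traversing thread in this Java version always sees fully constructed nodes; whether the C++ list needs an explicit release/acquire pair for the same guarantee is exactly the question raised in the message above.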
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2069890054 From shade at openjdk.org Thu May 1 06:15:44 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 1 May 2025 06:15:44 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v4] In-Reply-To: <9fsnmTdCxPkiaroCOh8qx0y0yAydVvtvdj08Wt-oMT8=.a4896930-7fdd-426b-bd0e-bb48479ceba3@github.com> References: <9fsnmTdCxPkiaroCOh8qx0y0yAydVvtvdj08Wt-oMT8=.a4896930-7fdd-426b-bd0e-bb48479ceba3@github.com> Message-ID: On Wed, 30 Apr 2025 20:39:30 GMT, Coleen Phillimore wrote: >> Use LockFreeStack to link events on the eventLog queue. They are never popped so this requires no further synchronization. >> Tested by tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > remove memory order specification Looks fine, thanks. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24954#pullrequestreview-2809220284 From duke at openjdk.org Thu May 1 07:34:55 2025 From: duke at openjdk.org (simon) Date: Thu, 1 May 2025 07:34:55 GMT Subject: Integrated: 8354292: Remove unused PRAGMA_FORMAT_IGNORED In-Reply-To: References: Message-ID: On Wed, 30 Apr 2025 01:22:24 GMT, simon wrote: > The macro PRAGMA_FORMAT_IGNORED is defined in compilerWarnings_gcc.hpp, with default empty definition in compilerWarnings.hpp. It is unused and can be removed. This pull request has now been integrated. 
Changeset: b2184105
Author: Gustavo Simon
Committer: Aleksey Shipilev
URL: https://git.openjdk.org/jdk/commit/b2184105088a21d0c55fd3105e3433d4eac767da
Stats: 5 lines in 2 files changed: 0 ins; 5 del; 0 mod

8354292: Remove unused PRAGMA_FORMAT_IGNORED

Reviewed-by: mbaesken, kbarrett, shade

-------------

PR: https://git.openjdk.org/jdk/pull/24958

From shade at openjdk.org Thu May 1 07:34:54 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Thu, 1 May 2025 07:34:54 GMT
Subject: RFR: 8354292: Remove unused PRAGMA_FORMAT_IGNORED
In-Reply-To:
References:
Message-ID:

On Wed, 30 Apr 2025 01:22:24 GMT, simon wrote:

> The macro PRAGMA_FORMAT_IGNORED is defined in compilerWarnings_gcc.hpp, with default empty definition in compilerWarnings.hpp. It is unused and can be removed.

Looks fine.

-------------

Marked as reviewed by shade (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24958#pullrequestreview-2809283207

From mli at openjdk.org Thu May 1 08:52:45 2025
From: mli at openjdk.org (Hamlin Li)
Date: Thu, 1 May 2025 08:52:45 GMT
Subject: RFR: 8355698: JDK not supporting sleef could cause exception at runtime after JDK-8353786
In-Reply-To: <1hfhvGjxKFAYEtj1D_pIdgU659AE2oPWoQEyXl8sRgQ=.3aa62617-142a-49c9-82c4-0f761cb73aff@github.com>
References: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com> <3BMQiQtyRXIj-NFUoFPliNYV4r1nX3KpKgniMvtOMkc=.cdcd240e-ce85-46c7-9bfb-e5be5124aae9@github.com> <1hfhvGjxKFAYEtj1D_pIdgU659AE2oPWoQEyXl8sRgQ=.3aa62617-142a-49c9-82c4-0f761cb73aff@github.com>
Message-ID:

On Wed, 30 Apr 2025 19:41:52 GMT, Vladimir Ivanov wrote:

> Overall, it still looks like a JDK build issue to me. Hiding problems that occurred during the build is not good. If some toolchains can't successfully build the library, the library shouldn't be included in JDK.

No, in the riscv case (possibly also on arm?)
I don't think it's a build issue: the JDK vendor can choose to support it or not; it's just that the way of passing the information (whether it's supported or not) is different.

How about we consider this in another way: we could fix this issue first, as it fails regularly on riscv and x64 in some situations. Then, if you still think the existing behaviour should be changed and could be improved further, that can be done in another PR. After all, it already broke the existing JDK in some scenarios.
What do you think about it?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24914#issuecomment-2844401476

From zgu at openjdk.org Thu May 1 12:32:49 2025
From: zgu at openjdk.org (Zhengyu Gu)
Date: Thu, 1 May 2025 12:32:49 GMT
Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v4]
In-Reply-To: <9fsnmTdCxPkiaroCOh8qx0y0yAydVvtvdj08Wt-oMT8=.a4896930-7fdd-426b-bd0e-bb48479ceba3@github.com>
References: <9fsnmTdCxPkiaroCOh8qx0y0yAydVvtvdj08Wt-oMT8=.a4896930-7fdd-426b-bd0e-bb48479ceba3@github.com>
Message-ID:

On Wed, 30 Apr 2025 20:39:30 GMT, Coleen Phillimore wrote:

>> Use LockFreeStack to link events on the eventLog queue. They are never popped so this requires no further synchronization.
>> Tested by tier1-4.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
>   remove memory order specification

LGTM

-------------

Marked as reviewed by zgu (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24954#pullrequestreview-2809701348 From jwaters at openjdk.org Thu May 1 13:27:54 2025 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 1 May 2025 13:27:54 GMT Subject: RFR: 8345265: Minor improvements for LTO across all compilers [v2] In-Reply-To: References: Message-ID: <2c2RGsRfSG3FXZWT21NSGQrXCfJxgCN04jMwxQjlDjg=.632416df-1f45-48a1-ba4e-d3495f013ce0@github.com> On Tue, 17 Dec 2024 14:54:03 GMT, Julian Waters wrote: >> This is a general cleanup and improvement of LTO, as well as a quick fix to remove a workaround in the Makefiles that disabled LTO for g1ParScanThreadState.cpp due to the old poisoning mechanism causing trouble. The -Wno-attribute-warning change here can be removed once Kim's new poisoning solution is integrated. >> >> - -fno-omit-frame-pointer is added to gcc to stop the linker from emitting code without the frame pointer >> - -flto is set to $(JOBS) instead of auto to better match what the user requested >> - -Gy is passed to the Microsoft compiler. This does not fully fix LTO under Microsoft, but prevents warnings about -LTCG:INCREMENTAL at least > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-16 > - -fno-omit-frame-pointer in JvmFeatures.gmk > - Revert compilerWarnings_gcc.hpp > - General LTO fixes JvmFeatures.gmk > - Revert DISABLE_POISONING_STOPGAP compilerWarnings_gcc.hpp > - Merge branch 'openjdk:master' into patch-16 > - Revert os.cpp > - Fix memory leak in jvmciEnv.cpp > - Stopgap fix in os.cpp > - Declaration fix in compilerWarnings_gcc.hpp > - ... 
and 2 more: https://git.openjdk.org/jdk/compare/0b5a830e...9d05cb8e Keep open ------------- PR Comment: https://git.openjdk.org/jdk/pull/22464#issuecomment-2844843415 From gziemski at openjdk.org Thu May 1 14:12:30 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Thu, 1 May 2025 14:12:30 GMT Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v10] In-Reply-To: References: Message-ID: > Please review this addition of an internal benchmark, mostly of interest to those working with NMT. > > This benchmark allows us to record a pattern of memory allocation operations (i.e. `malloc`, `realloc` and `free`) as well as the virtual memory allocations (i.e. `VirtualMemoryTracker::add_reserved_region`, etc.) and record those into files. > > Later we can use that recording to _play back_ the pattern with different code or settings to compare the performance (i.e. memory usage as well as time). > > The goal of this benchmark is for anyone working on NMT to be able to measure and prove whether their improvement helps or regresses the performance. 
>
> ### To use it:
>
> To record the pattern of memory allocation calls:
>
> `NMTRecordMemoryAllocations=0x7FFFFFFF ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -jar build/macosx-aarch64-server-release/images/jdk/demo/jfc/J2Ddemo/J2Ddemo.jar`
>
> OR to record the pattern of virtual memory allocation calls:
>
> `NMTRecordVirtualMemoryAllocations=0x7FFFFFFF ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -jar build/macosx-aarch64-server-release/images/jdk/demo/jfc/J2Ddemo/J2Ddemo.jar`
>
> This will result in the file:
> - hs_nmt_pid22770_allocs_record.log (the chronological record of the desired operations)
> OR
> - hs_nmt_pid22770_virtual_allocs_record.log (the chronological record of the desired operations)
>
> And 2 additional files:
> - hs_nmt_pid22770_info_record.log (the record of default NMT memory overhead and the NMT state)
> - hs_nmt_pid22770_threads_record.log (the record of thread names that can be retrieved later when processing)
>
>
> Then, to actually run the benchmark:
>
> `NMTBenchmarkRecordedPID=22770 ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary`
>
> ### Usage:
>
> See the issue for more details and the design document.

Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: use permit_forbidden_function for realloc ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23786/files - new: https://git.openjdk.org/jdk/pull/23786/files/1bf32149..c98123fb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=08-09 Stats: 6 lines in 1 file changed: 2 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/23786.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23786/head:pull/23786 PR: https://git.openjdk.org/jdk/pull/23786 From shade at openjdk.org Thu May 1 16:59:56 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 1 May 2025 16:59:56 GMT Subject: RFR: 8356027: Print enhanced compilation timings Message-ID: In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: 1. Time spent before queuing: shows the compilation queue bottlenecks 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load 3. Time spent actually compiling: shows the per-method compilation costs We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). The difference from the output format we ship in Leyden: 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. 
This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. See the sample `-XX:+PrintCompilation` output in the comments. Additional testing: - [x] Linux x86_64 server fastdebug, `compiler` - [ ] Linux x86_64 server fastdebug, `all` ------------- Commit messages: - More touchups - Fix TypeProfileFinalMethod as well - Fix inline tree printing - Touchups - Fix Changes: https://git.openjdk.org/jdk/pull/24984/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356027 Stats: 89 lines in 8 files changed: 57 ins; 6 del; 26 mod Patch: https://git.openjdk.org/jdk/pull/24984.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24984/head:pull/24984 PR: https://git.openjdk.org/jdk/pull/24984 From shade at openjdk.org Thu May 1 16:59:57 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 1 May 2025 16:59:57 GMT Subject: RFR: 8356027: Print enhanced compilation timings In-Reply-To: References: Message-ID: <6925Oi48llGODmHNCO2x9kUKAoeM5lCxCXlX5IH6ly0=.9a2fd09b-21ca-48ad-989f-5451e32485b3@github.com> On Thu, 1 May 2025 12:35:38 GMT, Aleksey Shipilev wrote: > In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: > 1. Time spent before queuing: shows the compilation queue bottlenecks > 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load > 3. Time spent actually compiling: shows the per-method compilation costs > > We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). > > The difference from the output format we ship in Leyden: > 1. 
This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such.
> 2. The output is a raw number in microseconds. In the Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations.
>
> See the sample `-XX:+PrintCompilation` output in the comments.
>
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `compiler`
>  - [ ] Linux x86_64 server fastdebug, `all`

Sample log after:

    $ build/linux-x86_64-server-release/images/jdk/bin/java -XX:+PrintCompilation Hello.java
    ...
    344 0 1351 3 java.util.Arrays::hashCode (15 bytes) started
    345 0 1350 3 java.util.Objects::hash (5 bytes) started
    345 0 1352 3 jdk.internal.util.ArraysSupport::hashCode (42 bytes) started
    345 0 37 79 1349 3 java.lang.Byte:: (10 bytes)
    345 0 1353 3 java.lang.Byte::hashCode (8 bytes) started
    345 0 77 92 1351 3 java.util.Arrays::hashCode (15 bytes)
    345 0 1354 3 java.lang.Byte::hashCode (2 bytes) started
    345 0 212 66 1353 3 java.lang.Byte::hashCode (8 bytes)
    345 0 276 38 1354 3 java.lang.Byte::hashCode (2 bytes)
    345 0 139 167 1352 3 jdk.internal.util.ArraysSupport::hashCode (42 bytes)
    345 0 79 354 1348 3 java.lang.invoke.MemberName::hashCode (43 bytes)
    345 0 123 107 1350 3 java.util.Objects::hash (5 bytes)
    ...

This shows, for example, that C1 (tier3) compilations of these small methods are really quick, and the queueing delays are comparable to the actual compilation costs.

Caught a failure in `compiler/inlining/LateInlinePrinting.java` -- we apparently rely on `-XX:+PrintInlining` to print the task before compilation, and then print the inline tree after the compilation! With the new `started` message there is a test error in matching.
We also (reasonably) hold no `ttyLock` in between, so inline tree is not attributed well to the particular method if there are multiple compiler threads. This is a blessing in disguise: with this change, we finally can start printing the inline tree after successful compiles under the `ttyLock` now. This also does `-XX:+PrintInlining` output only when `-XX:+PrintCompilation` is supplied (otherwise, where are you inlining into?), which I think is reasonable. Another test needs adjustments for this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2844782421 PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2845231746 From iveresov at openjdk.org Thu May 1 16:58:39 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Thu, 1 May 2025 16:58:39 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v9] In-Reply-To: References: Message-ID: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. 
> > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: Port 8355915: [leyden] Crash in MDO clearing the unloaded array type ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24886/files - new: https://git.openjdk.org/jdk/pull/24886/files/b937681e..ee6bd11d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=07-08 Stats: 17 lines in 4 files changed: 6 ins; 4 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From duke at openjdk.org Thu May 1 17:40:01 2025 From: duke at openjdk.org (duke) Date: Thu, 1 May 2025 17:40:01 GMT Subject: Withdrawn: 8333151: Investigate if the Hotspot Arena chunk pools still make sense In-Reply-To: References: Message-ID: <7cVdVatABn8oJ2djMZdTCriAX_NYqGlK45vDpuIRQo4=.6c871e02-69e9-42ea-b218-691655fa43b6@github.com> On Wed, 31 Jul 2024 23:14:41 GMT, Afshin Zafari wrote: > Using `ChunkPool` or not is investigated in this PR based on time and memory consumption. > Based on the tests using ChunkPool shows no better speed nor memory footprint. > Memory usage is taken from RSS reports of Linux API. (GHA tests for non-linux platforms fail) > These improvements should be confirmed also in more related micro-benchmarks. > This PR is created only to receive feedbacks on the measurements and comparisons. > Anyway, since the purpose of the PR is investigation only, it won't be merged to the mainline. 
> Sample output:
>
> Total time, no pool, alloc: 190, free: 1
> Total time, no pool, alloc: 68, free: 1
> Total time, no pool, alloc: 56, free: 1
> Total time, no pool, alloc: 36, free: 1
> Total time, no pool, alloc: 37, free: 1
> Total time, with pool, alloc: 201, free: 12
> Total time, with pool, alloc: 190, free: 13
> Total time, with pool, alloc: 189, free: 13
> Total time, with pool, alloc: 190, free: 13
> Total time, with pool, alloc: 189, free: 13
>
> RSS(KB): no-pool= 14524, pool= 735464, diff=-720940
> RSS(KB): no-pool= 480, pool= 34840, diff=-34360
> RSS(KB): no-pool= 2560, pool= 22036, diff=-19476
> RSS(KB): no-pool= 128, pool= 21580, diff=-21452
> RSS(KB): no-pool= -28, pool= 7732, diff=-7760

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.org/jdk/pull/20411

From vlivanov at openjdk.org Thu May 1 18:29:50 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Thu, 1 May 2025 18:29:50 GMT
Subject: RFR: 8355698: JDK not supporting sleef could cause exception at runtime after JDK-8353786
In-Reply-To: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com>
References: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com>
Message-ID:

On Mon, 28 Apr 2025 10:34:49 GMT, Hamlin Li wrote:

> Hi,
> Can you help to review this patch?
>
> Before [JDK-8353786](https://bugs.openjdk.org/browse/JDK-8353786), when a released JDK that does not support sleef (for any reason, e.g. low gcc version, intrinsic not supported, rvv not supported, and so on) runs on a machine that supports vector operations (e.g. on riscv, one that supports rvv), it cannot call into sleef, but it does not fail either: it falls back to the Java scalar implementation.
> But after [JDK-8353786](https://bugs.openjdk.org/browse/JDK-8353786), an exception is thrown at runtime.
>
> This changes the behaviour of the existing JDK, and it should not throw an exception anyway.
>
> @iwanowww @RealFYang
>
> Thanks!

I want to understand the issue with missing entries in vector math native libraries first before making a decision on how to proceed.

> I don't think it's a build issue: the JDK vendor can choose to support it or not; it's just that the way of passing the information (whether it's supported or not) is different.

Sorry, I don't get it. How does it affect the contents of the native library?

JDK vendors do have an option to bundle the library or drop it from their distribution. But when SLEEF-based and SVML math libraries are built by JDK there's no distinction between entries being included in the library. If a vendor modifies make files or native library code, it's up to them to adjust JDK accordingly. Upstream JDK doesn't have to take such scenarios into account.

I looked through SVML and SLEEF-based vector math code and noticed there are some capability checks [1] [2] [3] guarding library code. It means that if some library entry is missing, then the whole library is empty. Can you confirm that is the case you see?

[1] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/linux/native/libjsvml/globals_vectorApiSupport_linux.S.inc#L35 [2] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/windows/native/libjsvml/globals_vectorApiSupport_windows.S.inc#L28 [3] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/unix/native/libsleef/lib/vector_math_rvv.c#L36 ------------- PR Comment: https://git.openjdk.org/jdk/pull/24914#issuecomment-2845433084 From iveresov at openjdk.org Thu May 1 19:34:34 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Thu, 1 May 2025 19:34:34 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v10] In-Reply-To: References: Message-ID: <4uRz9S2VvUHduPnG2Vnh3v-AbRtoB86mM1A9sJBLZ30=.840a3c9b-ada1-4ba4-b8d8-af4e94607556@github.com> > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. 
> > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: Fix semantics change from the previous commit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24886/files - new: https://git.openjdk.org/jdk/pull/24886/files/ee6bd11d..014b0ec5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=08-09 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From adinn at openjdk.org Thu May 1 18:54:50 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 1 May 2025 18:54:50 GMT Subject: RFR: 8354890: AOT-initialize j.l.i.MethodHandleImpl and inner classes In-Reply-To: References: Message-ID: <5vOdChaItphSz0dAvDqdniRjHRAAzeUBu2e7rxMkS54=.05079043-e02a-4853-891e-c7d34919af8d@github.com> On Tue, 29 Apr 2025 22:59:29 GMT, Ioi Lam wrote: > This is a general fix for all the "points to a static field that may hold a different value" failures related to `java/lang/invoke/MethodHandleImpl`. E.g., [JDK-8354840](https://bugs.openjdk.org/browse/JDK-8354840), [JDK-8353330](https://bugs.openjdk.org/browse/JDK-8353330). > > AOT-cached method handles quite often refer to the static fields in `MethodHandleImpl` or its inner classes. In the production run, if the value of these static field changes, we may have unexpected behavior related to identity of objects in these static fields. `CDSHeapVerifier` makes a very conservative check for such static fields, but sometimes gives false positives (as in the above two JBS issues) > > In this PR, we AOT-initialize `MethodHandleImpl` and its inner classes. This is a more authentic snapshot of the state of `java.lang.invoke` during the assembly phase. 
We also avoid the need to add and maintain entries in the `cdsHeapVerifier.cpp` table. > > I also added more code in `MethodHandleTest.java` to simulate potential usage patterns of `MethodHandle` by the Java core libraries. Hopefully this will reduce the likelihood for innocent core lib changes breaking the AOT assembly phase. @iklam We have seen this problem with Red Hat deployments in jdk24 as well as jdk25-ea. I'm saying that mostly for information. However, I do have to ask: If this is fixed for jdk25 is there any question of also fixing it in jdk24? I would be content to receive a no answer -- a similar issue with patch that could be backported from jdk26 -> jdk25 might be something to think about a bit more? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24956#issuecomment-2845492119 From shade at openjdk.org Thu May 1 19:19:32 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 1 May 2025 19:19:32 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v2] In-Reply-To: References: Message-ID: <40AnyQm_eXzMeoVC5lmbs1CaVYkMJwOdfsDxgx7S5t0=.5a73af04-4b5a-4a8d-a4b8-166cfd912977@github.com> > In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: > 1. Time spent before queuing: shows the compilation queue bottlenecks > 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load > 3. Time spent actually compiling: shows the per-method compilation costs > > We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). > > The difference from the output format we ship in Leyden: > 1. This output prints before/after the compilation to maintain old behavior partially. 
The "before" printout is now prepended with `started` to clearly mark it as such. > 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. > > See the sample `-XX:+PrintCompilation` output in the comments. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `compiler` > - [ ] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Test TestDuplicatedLateInliningOutput.java ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24984/files - new: https://git.openjdk.org/jdk/pull/24984/files/2c7e2154..1a3b5a31 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24984.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24984/head:pull/24984 PR: https://git.openjdk.org/jdk/pull/24984 From swen at openjdk.org Fri May 2 03:04:02 2025 From: swen at openjdk.org (Shaojin Wen) Date: Fri, 2 May 2025 03:04:02 GMT Subject: RFR: 8268829: Provide an optimized way to walk the stack with Class object only [v12] In-Reply-To: References: Message-ID: On Thu, 7 Sep 2023 19:27:14 GMT, Mandy Chung wrote: >> 8268829: Provide an optimized way to walk the stack with Class object only >> >> `StackWalker::walk` creates one `StackFrame` per frame and the current implementation >> allocates one `StackFrameInfo` and one `MemberName` objects per frame. Some frameworks >> like logging may only interest in the Class object but not the method name nor the BCI, >> for example, filters out its implementation classes to find the caller class. 
It's >> similar to `StackWalker::getCallerClass` but allows a predicate to filter out the element. >> >> This PR proposes to add `Option::DROP_METHOD_INFO` enum that requests to drop the method information. If no method information is needed, a `StackWalker` with `DROP_METHOD_INFO` >> can be used instead and such stack walker will save the overhead of extracting the method information >> and the memory used for the stack walking. >> >> New factory methods to take a parameter to specify the kind of stack walker to be created are defined. >> This provides a simple way for existing code, for example logging frameworks, to take advantage of >> this enhancement with the least change as it can keep the existing function for traversing >> `StackFrame`s. >> >> For example: to find the first caller filtering a known list of implementation class, >> existing code can create a stack walker instance with `DROP_METHOD_INFO` option: >> >> >> StackWalker walker = StackWalker.getInstance(Option.DROP_METHOD_INFO, Option.RETAIN_CLASS_REFERENCE); >> Optional<Class<?>> callerClass = walker.walk(s -> >> s.map(StackFrame::getDeclaringClass) >> .filter(Predicate.not(implClasses::contains)) >> .findFirst()); >> >> >> If method information is accessed on the `StackFrame`s produced by this stack walker such as >> `StackFrame::getMethodName`, then `UnsupportedOperationException` will be thrown. >> >> #### Javadoc & specdiff >> >> https://cr.openjdk.org/~mchung/api/java.base/java/lang/StackWalker.html >> https://cr.openjdk.org/~mchung/jdk22/specdiff/overview-summary.html >> >> #### Alternatives Considered >> One alternative is to provide a new API: >> `<T> T walkClass(Function<? super Stream<Class<?>>, ? extends T> function)` >> >> In this case, the caller would need to pass a function that takes a stream >> of `Class` object instead of `StackFrame`. Existing code would have to >> modify calls to the `walk` method to `walkClass` and the function body.
>> >> ### Implementation Details >> >> A `StackWalker` configured with `DROP_METHOD_INFO` ... > > Mandy Chung has updated the pull request incrementally with one additional commit since the last revision: > > Fix @Param due to the rename from default to class+method The value of flags can only be 0 or 0x08000000, so flags & MEMBER_INFO_FLAGS in isHidden and isCallerSensitive methods are always false ClassFrameInfo(StackWalker walker) { this.flags = walker.retainClassRef ? RETAIN_CLASS_REF_BIT : 0; } private static final int MEMBER_INFO_FLAGS = 0x00FFFFFF; private static final int RETAIN_CLASS_REF_BIT = 0x08000000; // retainClassRef The flags field is only assigned in the ClassFrameInfo constructor, so flags can be set to final, and the value here will only be 0 or RETAIN_CLASS_REF_BIT (0x08000000). So `flags & MEMBER_INFO_FLAGS` in the following code is always false boolean isCallerSensitive() { return JLIA.isCallerSensitive(flags & MEMBER_INFO_FLAGS); } boolean isHidden() { return JLIA.isHiddenMember(flags & MEMBER_INFO_FLAGS); } ------------- PR Comment: https://git.openjdk.org/jdk/pull/15370#issuecomment-2846210254 From swen at openjdk.org Fri May 2 03:36:06 2025 From: swen at openjdk.org (Shaojin Wen) Date: Fri, 2 May 2025 03:36:06 GMT Subject: RFR: 8268829: Provide an optimized way to walk the stack with Class object only [v12] In-Reply-To: References: Message-ID: On Thu, 7 Sep 2023 19:27:14 GMT, Mandy Chung wrote: >> 8268829: Provide an optimized way to walk the stack with Class object only >> >> `StackWalker::walk` creates one `StackFrame` per frame and the current implementation >> allocates one `StackFrameInfo` and one `MemberName` objects per frame. Some frameworks >> like logging may only interest in the Class object but not the method name nor the BCI, >> for example, filters out its implementation classes to find the caller class. It's >> similar to `StackWalker::getCallerClass` but allows a predicate to filter out the element. 
>> >> This PR proposes to add `Option::DROP_METHOD_INFO` enum that requests to drop the method information. If no method information is needed, a `StackWalker` with `DROP_METHOD_INFO` >> can be used instead and such stack walker will save the overhead of extracting the method information >> and the memory used for the stack walking. >> >> New factory methods to take a parameter to specify the kind of stack walker to be created are defined. >> This provides a simple way for existing code, for example logging frameworks, to take advantage of >> this enhancement with the least change as it can keep the existing function for traversing >> `StackFrame`s. >> >> For example: to find the first caller filtering a known list of implementation class, >> existing code can create a stack walker instance with `DROP_METHOD_INFO` option: >> >> >> StackWalker walker = StackWalker.getInstance(Option.DROP_METHOD_INFO, Option.RETAIN_CLASS_REFERENCE); >> Optional<Class<?>> callerClass = walker.walk(s -> >> s.map(StackFrame::getDeclaringClass) >> .filter(Predicate.not(implClasses::contains)) >> .findFirst()); >> >> >> If method information is accessed on the `StackFrame`s produced by this stack walker such as >> `StackFrame::getMethodName`, then `UnsupportedOperationException` will be thrown. >> >> #### Javadoc & specdiff >> >> https://cr.openjdk.org/~mchung/api/java.base/java/lang/StackWalker.html >> https://cr.openjdk.org/~mchung/jdk22/specdiff/overview-summary.html >> >> #### Alternatives Considered >> One alternative is to provide a new API: >> `<T> T walkClass(Function<? super Stream<Class<?>>, ? extends T> function)` >> >> In this case, the caller would need to pass a function that takes a stream >> of `Class` object instead of `StackFrame`. Existing code would have to >> modify calls to the `walk` method to `walkClass` and the function body. >> >> ### Implementation Details >> >> A `StackWalker` configured with `DROP_METHOD_INFO` ...
> > Mandy Chung has updated the pull request incrementally with one additional commit since the last revision: > > Fix @Param due to the rename from default to class+method Sorry, I misread it. There is a comment in flags: `updated by VM to set hidden and caller-sensitive bits` ------------- PR Comment: https://git.openjdk.org/jdk/pull/15370#issuecomment-2846236653 From mchevalier at openjdk.org Fri May 2 07:21:45 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Fri, 2 May 2025 07:21:45 GMT Subject: RFR: 8354284: Add more compiler test folders to tier1 runs In-Reply-To: References: Message-ID: On Wed, 30 Apr 2025 16:56:54 GMT, Vladimir Kozlov wrote: > I would not do that. First compilation of run() will be OSR and we will never run it fully compiled. You need several iterations in main() to trigger and use normal compilation. But 100 iterations should be fine. This should put execution time under 1 sec. I thought so too, but actually, `run()` is OSR compiled at first (which doesn't reproduce) and then fully compiled (where the crash happens). I don't understand why, but I can see it happening. Of course, I can also make a hundred iterations, that is cheap enough. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24817#issuecomment-2846539478 From mdoerr at openjdk.org Fri May 2 08:17:46 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 2 May 2025 08:17:46 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2] In-Reply-To: References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> Message-ID: On Thu, 1 May 2025 06:12:16 GMT, Aleksey Shipilev wrote: >> Maybe @shipilev meant `memory_order_release`? Anyway, I guess we don't need to optimize it. > > I saw no point in enforcing memory ordering mode here, as it looks like we only did `ThreadCritical` for mutual exclusion. 
Note that we do not have a matching acquire on list traversals, so seqcst/release on list additions would be incomplete. That only reinforces my original thinking: we are riding on memory ordering given by something else, I'd guess the initialization sequence itself. > > But I won't quibble, it is a very minor optimization. I guess we rely on memory ordering by address dependency on the reader's side. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2071263823 From sspitsyn at openjdk.org Fri May 2 08:20:06 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 2 May 2025 08:20:06 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v5] In-Reply-To: References: Message-ID: > This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. > > Testing: Ran mach5 tiers 1-6. Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge
- review: added general comment about sync between suspend_thread and resume_thread - Merge - some cleanup - 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24269/files - new: https://git.openjdk.org/jdk/pull/24269/files/df99ba15..f2c4a136 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=03-04 Stats: 289018 lines in 2561 files changed: 84046 ins; 194950 del; 10022 mod Patch: https://git.openjdk.org/jdk/pull/24269.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24269/head:pull/24269 PR: https://git.openjdk.org/jdk/pull/24269 From mchevalier at openjdk.org Fri May 2 08:43:23 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Fri, 2 May 2025 08:43:23 GMT Subject: RFR: 8354284: Add more compiler test folders to tier1 runs [v2] In-Reply-To: References: Message-ID: > Some folders in jtreg/compiler have been reported not to be run in any tier, while tier1 was probably intended, but the tier definition was mistakenly not updated. I've checked which folders are not referenced into `TEST.groups`. > > The unmentioned ones: > - `ccp` > - `ciReplay` > - `ciTypeFlow` > - `compilercontrol` > - `debug` > - `oracle` > - `predicates` > - `print` > - `relocations` > - `sharedstubs` > - `splitif` > - `tiered` > - `whitebox` > > And those, that are not test folders: > - `lib` > - `patches` > - `testlibraries` > > I'm adding `ccp`, `ciTypeFlow`, `predicates`, `sharedstubs` and `splitif` to tier1. > > The other folders seems to have been around for very long (since at least mid-2021). It's not clear how meaningful it'd be to add them/what the intent from them was. I've rather focused on the recently(-ish) added folders, that one forgot to put in a tier when adding it. > > Feel free to tell if other folders should be included (and in which tier). 
> > Thanks, > Marc Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision: speed up slowest test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24817/files - new: https://git.openjdk.org/jdk/pull/24817/files/7918a832..3232e5b8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24817&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24817&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24817.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24817/head:pull/24817 PR: https://git.openjdk.org/jdk/pull/24817 From mchevalier at openjdk.org Fri May 2 08:43:24 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Fri, 2 May 2025 08:43:24 GMT Subject: RFR: 8354284: Add more compiler test folders to tier1 runs In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 08:44:04 GMT, Marc Chevalier wrote: > Some folders in jtreg/compiler have been reported not to be run in any tier, while tier1 was probably intended, but the tier definition was mistakenly not updated. I've checked which folders are not referenced into `TEST.groups`. > > The unmentioned ones: > - `ccp` > - `ciReplay` > - `ciTypeFlow` > - `compilercontrol` > - `debug` > - `oracle` > - `predicates` > - `print` > - `relocations` > - `sharedstubs` > - `splitif` > - `tiered` > - `whitebox` > > And those, that are not test folders: > - `lib` > - `patches` > - `testlibraries` > > I'm adding `ccp`, `ciTypeFlow`, `predicates`, `sharedstubs` and `splitif` to tier1. > > The other folders seems to have been around for very long (since at least mid-2021). It's not clear how meaningful it'd be to add them/what the intent from them was. I've rather focused on the recently(-ish) added folders, that one forgot to put in a tier when adding it. > > Feel free to tell if other folders should be included (and in which tier). > > Thanks, > Marc I've pushed the suggested change. 
The test still passes, longest result was 12s, from 40s without fix, 8s with my more radical fix: so it's still a big improvement. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24817#issuecomment-2846675598 From mli at openjdk.org Fri May 2 09:11:47 2025 From: mli at openjdk.org (Hamlin Li) Date: Fri, 2 May 2025 09:11:47 GMT Subject: RFR: 8355698: JDK not supporting sleef could cause exception at runtime after JDK-8353786 In-Reply-To: References: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com> Message-ID: On Thu, 1 May 2025 18:27:24 GMT, Vladimir Ivanov wrote: > I want to understand the issue with missing entries in vector math native libraries first before making a decision how to proceed. > > > I don't think it's a build issue, the jdk vendor can choose to support it or not, it's just the way passing the information ( whether it's supported or not) are different. > > Sorry, I don't get it. How does it affect the contents of the native library? JDK vendors do have an option to bundle the library or drop it from their distribution. But when SLEEF-based and SVML math libraries are built by JDK there's no distinction between entries being included in the library. If a vendor modifies make files or native library code, it's up to them to adjust JDK accordingly. Upstream JDK doesn't have to take such scenarios into account. Sorry for confusing you. No, what I mean is vendors can choose their build environment (e.g. with/without support of compiler, I discuss this a bit below), and different environment will lead to whether sleef is supported or not (i.e. whether there are entries in the libsleef.so). > > I looked through SVML and SLEEF-based vector math code and noticed there are some capability check [1] [2] [3] guarding library code. It means that if some library entry is missing, then the whole library is empty. Can you confirm it's the case you see? 
> > [1] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/linux/native/libjsvml/globals_vectorApiSupport_linux.S.inc#L35 [2] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/windows/native/libjsvml/globals_vectorApiSupport_windows.S.inc#L28 [3] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/unix/native/libsleef/lib/vector_math_rvv.c#L36 In the riscv sleef case, yes, it checks the native compiler version and a flag. This is required because only these versions (or higher) support the vector intrinsics that are used in the sleef header files. (I think arm is the same case, but in a somewhat simpler way, with only a flag.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24914#issuecomment-2846731509 From duke at openjdk.org Fri May 2 10:01:27 2025 From: duke at openjdk.org (Anton Artemov) Date: Fri, 2 May 2025 10:01:27 GMT Subject: RFR: 8354969: Add strdup function for ResourceArea Message-ID: <7RWd7cVKqTbDkFyVdiLyHLFIUAwiSOMipKzGny-QRH8=.5c55c562-3742-4dd8-9131-73c5140cdf86@github.com> Added a strdup() method, as requested by the bug reporter. The method is added to Arena, but also available in ResourceArea, as requested. A test for the method is provided. Testing: tiers 1-3 on multiple platforms.
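For readers unfamiliar with arena-style string duplication, here is a minimal sketch of the idea behind a strdup() on an arena. It is not the code from the PR: `ToyArena`, `allocate` and `str_dup` are hypothetical names made up for illustration, and the toy arena simply owns every chunk it hands out instead of using HotSpot's real Arena machinery.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Toy arena (hypothetical, NOT HotSpot's Arena): it owns every chunk it
// hands out and releases all of them when the arena is destroyed.
class ToyArena {
 public:
  // Returns `size` bytes whose lifetime is tied to the arena.
  void* allocate(std::size_t size) {
    chunks_.push_back(std::make_unique<char[]>(size));
    return chunks_.back().get();
  }

  // Copies the NUL-terminated string `s` into arena-owned memory.
  char* str_dup(const char* s) {
    assert(s != nullptr);
    const std::size_t len = std::strlen(s) + 1;  // +1 for the terminating NUL
    char* copy = static_cast<char*>(allocate(len));
    std::memcpy(copy, s, len);
    return copy;
  }

 private:
  std::vector<std::unique_ptr<char[]>> chunks_;
};
```

The caller never frees the returned pointer individually; under this sketch's assumptions the copy stays valid until the arena itself goes away, which is the usual reason to duplicate into an arena rather than onto the C heap.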
------------- Commit messages: - 8354969: Added strdup function for Arena and ResourceArea Changes: https://git.openjdk.org/jdk/pull/24998/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24998&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8354969 Stats: 17 lines in 2 files changed: 17 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24998.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24998/head:pull/24998 PR: https://git.openjdk.org/jdk/pull/24998 From ayang at openjdk.org Fri May 2 10:28:55 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 2 May 2025 10:28:55 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics Message-ID: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> This patch refines Parallel's sizing strategy to improve overall memory management and performance. The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. 
## Performance evaluation - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. PS: I have opportunistically set the obsolete/expired version to 25/26 for now. I will update them accordingly before merging. Test: tier1-8 ------------- Commit messages: - pgc-size-policy Changes: https://git.openjdk.org/jdk/pull/25000/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8338977 Stats: 4365 lines in 29 files changed: 521 ins; 3446 del; 398 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From jbechberger at openjdk.org Fri May 2 10:33:04 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Fri, 2 May 2025 10:33:04 GMT Subject: RFR: 8342818: Implement CPU Time Profiling for JFR [v43] In-Reply-To: References: <12EY0qQHtcU6A5z5VstORM7kibUWrqQNtIGfC4tqvoI=.798f782f-fa25-4640-9f92-5c77030ed2ec@github.com> <2QZmICnWpVbuCTC49cQscg0MwiCGo66CHH0enxPVT68=.605270ed-a520-453d-be26-0240c70e4c3b@github.com> Message-ID: On Wed, 30 Apr 2025 12:59:28 GMT, Johannes Bechberger wrote: >> I probably found the issue: `_active_signal_handlers` is never set to zero. 
> > I fixed it. And it was probably the same issue that kept me from fixing the version that doesn't depend on your PR ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2071419362 From jbechberger at openjdk.org Fri May 2 10:42:11 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Fri, 2 May 2025 10:42:11 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v44] In-Reply-To: References: Message-ID: > This is the code for the [JEP draft: CPU Time based profiling for JFR]. > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). > > A version based on the cooperative sampling JEP can be found [here](https://github.com/parttimenerd/jdk/tree/parttimenerd_cooperative_cpu_time_sampler). Johannes Bechberger has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: Implement CPU time sampler that emits events at safepoints The stacktraces are recorded in the signal handler, but the JFR events are only created at safepoints (except when a thread is in native too long). 
------------- Changes: https://git.openjdk.org/jdk/pull/20752/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20752&range=43 Stats: 2667 lines in 53 files changed: 2492 ins; 147 del; 28 mod Patch: https://git.openjdk.org/jdk/pull/20752.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20752/head:pull/20752 PR: https://git.openjdk.org/jdk/pull/20752 From jbechberger at openjdk.org Fri May 2 10:42:11 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Fri, 2 May 2025 10:42:11 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v43] In-Reply-To: <12EY0qQHtcU6A5z5VstORM7kibUWrqQNtIGfC4tqvoI=.798f782f-fa25-4640-9f92-5c77030ed2ec@github.com> References: <12EY0qQHtcU6A5z5VstORM7kibUWrqQNtIGfC4tqvoI=.798f782f-fa25-4640-9f92-5c77030ed2ec@github.com> Message-ID: On Mon, 17 Mar 2025 10:03:00 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP draft: CPU Time based profiling for JFR]. >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). >> >> A version based on the cooperative sampling JEP can be found [here](https://github.com/parttimenerd/jdk/tree/parttimenerd_cooperative_cpu_time_sampler). > > Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: > > - Improve placement of NoResourceMark > - Add more checks for metadata_do The new version is a partial rewrite. The sampler emits all JFR events at safepoints or while a thread is in native. This prevents any method or thread in the stacktraces to be unloaded. This technique is a cross between the previous implementation and the implementation based on the cooperative sampling JEP (#24296). 
------------- PR Comment: https://git.openjdk.org/jdk/pull/20752#issuecomment-2846901854 From mli at openjdk.org Fri May 2 10:48:50 2025 From: mli at openjdk.org (Hamlin Li) Date: Fri, 2 May 2025 10:48:50 GMT Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v4] In-Reply-To: References: Message-ID: <-bAHBy4AmqSglDpT2t94FrSQ7n1oFPkBDuxfcd2C0A8=.802dff01-b8da-439e-8b80-aa5dddc4031c@github.com> On Wed, 30 Apr 2025 12:00:45 GMT, Robbin Ehn wrote: >> Hi, for you to consider. >> >> These tests constantly fails in qemu-user. >> Either the require host to be same arch explicit or implicit (sysroot). >> E.g. "ptrace(PTRACE_ATTACH, ..) failed for 405157: Function not implemented'" for SA tests. >> >> From bug: >>> qemu-user/rv64 sets uarch to "qemu" in /proc/cpuinfo (qemu-system do not do that). >>> We add this uarch to CPU feature string. >>> This means we can use jtreg 'require' with cpu string to filter out tests in qemu-user. >> >> Relevant qemu code: >> https://github.com/qemu/qemu/blob/170825d14d88a1ce7fae98d5a928480f2f329b22/linux-user/riscv/target_proc.h#L29 >> >> Relevant hotspot code: >> https://github.com/openjdk/jdk/blob/fa0b18bfde38ee2ffbab33a9eaac547fe8aa3c7c/src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp#L250 >> >> Tested that the require only filters out tests in qemu+riscv64. >> >> Thanks! >> >> /Robbin > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into qemu-user-issues > - Merge branch 'master' into qemu-user-issues > - Revert > - Merge branch 'master' into qemu-user-issues > - Merge branch 'master' into qemu-user-issues > - more > - more > - native or very long Looks good, thanks for fixing this and discussing! ------------- Marked as reviewed by mli (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24229#pullrequestreview-2811707177 From coleenp at openjdk.org Fri May 2 11:26:50 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 2 May 2025 11:26:50 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Mon, 28 Apr 2025 07:44:04 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 
53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: > > - Fix VerifyRawIndexesTest > - Fix reordering in layout and annotations > - Use qsort_r for different platforms For the record, I'm reviewing and testing this PR also.
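The lookup scheme in the quoted PR description (fields sorted by name, plus a table with one record per 16 fields) can be sketched roughly as follows. This is only an illustration under simplifying assumptions: `FieldEntry`, `build_jump_table` and `find_field` are hypothetical names, field names are plain unique strings rather than Symbols, and the table stores vector indices rather than variable-length-encoded stream offsets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one field record; the real stream is a
// compressed byte array, not a vector of structs.
struct FieldEntry {
  std::string name;
  int payload;
};

constexpr std::size_t kStride = 16;  // one table record per 16 fields

// Records the positions 16, 32, ... of the name-sorted field list.
std::vector<std::size_t> build_jump_table(const std::vector<FieldEntry>& sorted) {
  assert(std::is_sorted(sorted.begin(), sorted.end(),
                        [](const FieldEntry& a, const FieldEntry& b) {
                          return a.name < b.name;
                        }));
  std::vector<std::size_t> table;
  for (std::size_t i = kStride; i < sorted.size(); i += kStride) {
    table.push_back(i);
  }
  return table;
}

// Returns the index of `name` in `sorted`, or -1 if it is absent.
int find_field(const std::vector<FieldEntry>& sorted,
               const std::vector<std::size_t>& table,
               const std::string& name) {
  // Walk the table to find the last record whose name is <= the target.
  std::size_t start = 0;
  for (std::size_t pos : table) {
    if (sorted[pos].name > name) break;  // target sorts before this record
    start = pos;
  }
  // Field-by-field scan of at most kStride entries from that record on.
  const std::size_t end = std::min(start + kStride, sorted.size());
  for (std::size_t i = start; i < end; ++i) {
    if (sorted[i].name == name) return static_cast<int>(i);
  }
  return -1;
}
```

With 40 fields the table holds two records (positions 16 and 32), so a lookup compares against at most two table entries and then scans at most 16 fields instead of up to 40.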
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2846985137 From coleenp at openjdk.org Fri May 2 11:26:51 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 2 May 2025 11:26:51 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Wed, 30 Apr 2025 20:15:36 GMT, Frederic Parain wrote: >> src/hotspot/share/oops/fieldInfo.hpp line 290: >> >>> 288: static int compare_symbols(const Symbol *s1, const Symbol *s2); >>> 289: >>> 290: static Array* create_FieldInfoStream(ConstantPool* constants, GrowableArray* fields, int java_fields, int injected_fields, >> >> In the latest form the ConstantPool parameter is used only for assertion, though I think that it is rather important assertion. > > The ConstantPool parameter can be limited to debug builds (the ones with asserts) with the following patch: > > > diff --git a/src/hotspot/share/classfile/classFileParser.cpp b/src/hotspot/share/classfile/classFileParser.cpp > index 48646c0fb83..19471bbf7ee 100644 > --- a/src/hotspot/share/classfile/classFileParser.cpp > +++ b/src/hotspot/share/classfile/classFileParser.cpp > @@ -5813,7 +5813,7 @@ void ClassFileParser::post_process_parsed_stream(const ClassFileStream* const st > > int injected_fields_count = _temp_field_info->length() - _java_fields_count; > _fieldinfo_stream = > - FieldInfoStream::create_FieldInfoStream(_cp, _temp_field_info, _java_fields_count, > + FieldInfoStream::create_FieldInfoStream(DEBUG_ONLY(_cp COMMA) _temp_field_info, _java_fields_count, > injected_fields_count, loader_data(), CHECK); > _fields_status = > MetadataFactory::new_array(_loader_data, _temp_field_info->length(), > diff --git a/src/hotspot/share/classfile/javaClasses.cpp b/src/hotspot/share/classfile/javaClasses.cpp > index f3fdf28b96b..80ee179576c 100644 > --- a/src/hotspot/share/classfile/javaClasses.cpp > +++ 
b/src/hotspot/share/classfile/javaClasses.cpp > @@ -963,7 +963,7 @@ void java_lang_Class::fixup_mirror(Klass* k, TRAPS) { > } > Array* old_stream = ik->fieldinfo_stream(); > assert(fields->length() == (java_fields + injected_fields), "Must be"); > - Array* new_fis = FieldInfoStream::create_FieldInfoStream(ik->constants(), fields, java_fields, injected_fields, k->class_loader_data(), CHECK); > + Array* new_fis = FieldInfoStream::create_FieldInfoStream(DEBUG_ONLY(ik->constants() COMMA) fields, java_fields, injected_fields, k->class_loader_data(), CHECK); > ik->set_fieldinfo_stream(new_fis); > MetadataFactory::free_array(k->class_loader_data(), old_stream); > } > diff --git a/src/hotspot/share/oops/fieldInfo.cpp b/src/hotspot/share/oops/fieldInfo.cpp > index dd1fa11042d..eb90e6bdae8 100644 > --- a/src/hotspot/share/oops/fieldInfo.cpp > +++ b/src/hotspot/share/oops/fieldInfo.cpp > @@ -66,7 +66,7 @@ int FieldInfoStream::compare_symbols(const Symbol *s1, const Symbol *s2) { > } > } > > -Array* FieldInfoStream::create_FieldInfoStream(ConstantPool* constants, GrowableArray* fields, int java_fields, int injected_fields, > +Array* FieldInfoStream::create_FieldInfoStream(DEBUG_ONLY(ConstantPool* constants COMMA) GrowableArray* fields, int java_fields, int injected_f... I don't think adding DEBUG_ONLY optimizes anything and kind of looks messy, I don't like this change unless the product compiler complains about an unused parameter. 
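For context on the pattern being debated, here is a small self-contained sketch of a parameter that exists only in debug builds. The macros `SKETCH_DEBUG_ONLY` and `SKETCH_COMMA` and the functions below are hypothetical stand-ins written for this note; the sketch keys off `NDEBUG`, which is an assumption and may not match how HotSpot's own `DEBUG_ONLY` is switched.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical macros modelled on the DEBUG_ONLY(... COMMA) usage in the
// quoted patch; keyed off NDEBUG here purely for illustration.
#ifndef NDEBUG
#define SKETCH_DEBUG_ONLY(code) code
#else
#define SKETCH_DEBUG_ONLY(code)
#endif
#define SKETCH_COMMA ,

// `expected_count` is part of the signature only in debug builds, where it
// backs the assert; in release builds both the parameter and the assert vanish.
int sum_fields(SKETCH_DEBUG_ONLY(std::size_t expected_count SKETCH_COMMA)
               const std::vector<int>& fields) {
  assert(fields.size() == expected_count);
  int total = 0;
  for (int f : fields) total += f;
  return total;
}

// Every call site has to repeat the macro dance, which is the readability
// cost raised in the review comment.
int sum_three() {
  const std::vector<int> fields = {1, 2, 3};
  return sum_fields(SKETCH_DEBUG_ONLY(fields.size() SKETCH_COMMA) fields);
}
```

The comma has to be hidden behind its own macro because a bare comma inside the macro argument would be parsed as an argument separator.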
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2071473593 From coleenp at openjdk.org Fri May 2 11:42:52 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 2 May 2025 11:42:52 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v4] In-Reply-To: <9fsnmTdCxPkiaroCOh8qx0y0yAydVvtvdj08Wt-oMT8=.a4896930-7fdd-426b-bd0e-bb48479ceba3@github.com> References: <9fsnmTdCxPkiaroCOh8qx0y0yAydVvtvdj08Wt-oMT8=.a4896930-7fdd-426b-bd0e-bb48479ceba3@github.com> Message-ID: On Wed, 30 Apr 2025 20:39:30 GMT, Coleen Phillimore wrote: >> Use LockFreeStack to link events on the eventLog queue. They are never popped so this requires no further synchronization. >> Tested by tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > remove memory order specification Thank you for reviewing and commenting, Aleksey, Zhengyu, Leonid and Martin. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24954#issuecomment-2847009156 From coleenp at openjdk.org Fri May 2 11:42:53 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 2 May 2025 11:42:53 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2] In-Reply-To: References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> Message-ID: On Fri, 2 May 2025 08:15:36 GMT, Martin Doerr wrote: >> I saw no point in enforcing memory ordering mode here, as it looks like we only did `ThreadCritical` for mutual exclusion. Note that we do not have a matching acquire on list traversals, so seqcst/release on list additions would be incomplete. That only reinforces my original thinking: we are riding on memory ordering given by something else, I'd guess the initialization sequence itself. >> >> But I won't quibble, it is a very minor optimization. > > I guess we rely on memory ordering by address dependency on the reader's side. 
Yes we only used ThreadCritical to add to the list after the VM becomes multithreaded. The list traversals (reads) are for hs_err_pid file printing, which presumably is single threaded at that point and presumably nothing is adding to the list. I added Atomic::load() but I think it's not necessary to be a load_acquire. There are only about 10 items max on this list so far, so I think performance isn't a concern either way. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2071489248 From coleenp at openjdk.org Fri May 2 11:42:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 2 May 2025 11:42:54 GMT Subject: Integrated: 8355627: Don't use ThreadCritical for EventLog list In-Reply-To: References: Message-ID: <7_AJb63I55_DmUy53MxAc22BAYtGZMMAl3tj9XSYI8s=.e35ede06-86a6-45a3-9de6-e49d6e67ebef@github.com> On Tue, 29 Apr 2025 19:16:48 GMT, Coleen Phillimore wrote: > Use LockFreeStack to link events on the eventLog queue. They are never popped so this requires no further synchronization. > Tested by tier1-4. This pull request has now been integrated. 
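As an illustration of why a push-only list needs no lock for additions, here is a minimal sketch using `std::atomic` rather than HotSpot's `LockFreeStack`. `LogNode`, `PushOnlyList` and `demo_count` are hypothetical names, the log names are made up, and the release/acquire ordering is my own assumption for the sketch; the thread above discusses whether any explicit ordering is needed at all.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node type; the real EventLog class is not reproduced here.
struct LogNode {
  const char* name;
  LogNode* next;
};

// Nodes are only ever pushed, never popped, so there is no ABA hazard and
// a single compare-and-swap loop on the head pointer is enough.
class PushOnlyList {
  std::atomic<LogNode*> _head{nullptr};

 public:
  void push(LogNode* node) {
    LogNode* old_head = _head.load(std::memory_order_relaxed);
    do {
      // Link the new node in front of the head we last observed.
      node->next = old_head;
      // On failure, old_head is refreshed with the current head and we retry.
    } while (!_head.compare_exchange_weak(old_head, node,
                                          std::memory_order_release,
                                          std::memory_order_relaxed));
  }

  // Read-only traversal, comparable to walking the list for error reporting.
  int count() const {
    int n = 0;
    for (const LogNode* p = _head.load(std::memory_order_acquire);
         p != nullptr; p = p->next) {
      n++;
    }
    return n;
  }
};

// Pushes three illustrative nodes and reports how many are linked.
int demo_count() {
  static LogNode a{"log-a", nullptr};
  static LogNode b{"log-b", nullptr};
  static LogNode c{"log-c", nullptr};
  PushOnlyList list;
  list.push(&a);
  list.push(&b);
  list.push(&c);
  return list.count();
}
```

The demo is single-threaded for brevity; the retry loop is what would make concurrent `push` calls safe.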
Changeset: afb9134a Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/afb9134a31c326e90f2bb68ae17e32de9d1d7740 Stats: 91 lines in 2 files changed: 80 ins; 2 del; 9 mod 8355627: Don't use ThreadCritical for EventLog list Reviewed-by: shade, lmesnik, zgu ------------- PR: https://git.openjdk.org/jdk/pull/24954 From aph at openjdk.org Fri May 2 11:43:47 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 2 May 2025 11:43:47 GMT Subject: RFR: 8355698: JDK not supporting sleef could cause exception at runtime after JDK-8353786 In-Reply-To: References: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com> <3BMQiQtyRXIj-NFUoFPliNYV4r1nX3KpKgniMvtOMkc=.cdcd240e-ce85-46c7-9bfb-e5be5124aae9@github.com> <1hfhvGjxKFAYEtj1D_pIdgU659AE2oPWoQEyXl8sRgQ=.3aa62617-142a-49c9-82c4-0f761cb73aff@github.com> Message-ID: <_Zjv0l0jFcm3LyXV8aU8IdBDIiXTGatgRfV5BEv6_Fc=.1b5646f8-f905-45e8-9d7f-aec437edfb84@github.com> On Thu, 1 May 2025 08:50:27 GMT, Hamlin Li wrote: > > Overall, it still looks like a JDK build issue to me. Hiding problems occurred during the build is not good. If some toolchains can't successfully build the library, the library shouldn't be included in JDK. > > No, in riscv case (possiblely also on arm?) I don't think it's a build issue, the jdk vendor can choose to support it or not, it's just the way passing the information ( whether it's supported or not) are different. I am not convinced that supporting such a divergence between builds of the JDK is something we should support. Sure, we can choose at runtime whether to link with SLEEF or not, but having some builds of (say) risc OpenJDK with SLEEF and some without is a Bad Thing. It is a build issue. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/24914#issuecomment-2847012173

From sroy at openjdk.org Fri May 2 11:58:53 2025
From: sroy at openjdk.org (Suchismith Roy)
Date: Fri, 2 May 2025 11:58:53 GMT
Subject: RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm [v31]
In-Reply-To:
References: <2cIptfLHrdxSy0t7RdsRlde94arK3gmqge9AiXmOZeo=.069a496c-e9dd-40cd-a144-306a65df0e1a@github.com>
 <68pnxMO83zNvBGORMQQgTyMXBf0m7b8nCwVaY2jfNnQ=.bcff79ab-0288-4517-ad62-e183f96e1c3f@github.com>
Message-ID:

On Mon, 28 Apr 2025 09:21:50 GMT, Andrew Haley wrote:

>> Please run AESGCMByteBuffer.encrypt and provide some before and after figures.
>
>> @theRealAph From my end, we had improvement of around 3 times after running TestAESMain. Is that not valid test suite ?
>>
>> If the improvement with this version is satisfactory , can we have this integrated and then pursue further improvements on it in separate PR ? Will open a JBS issue for the same
>
> You should run the JMH test, like so:
>
>
> fedora:theRealAph-jdk $ CONF=release make -k LOG=info build-microbenchmark CONF_CHECK=auto
> fedora:theRealAph-jdk $ ./build/linux-aarch64-server-release/jdk/bin/java -Djmh.ignoreLock=true -jar ./build/linux-aarch64-server-release/images/test/micro/benchmarks.jar -f 1 AESGCMByteBuffer.encrypt$
>
> Benchmark                                 (dataMethod)  (dataSize)  (keyLength)  (provider)   Mode  Cnt        Score       Error  Units
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt         direct        1024          128              thrpt    8  2216504.066 ± 12527.173  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt         direct        1500          128              thrpt    8  1505300.797 ±  8675.648  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt         direct        4096          128              thrpt    8   813518.431 ±  7513.509  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt         direct       16384          128              thrpt    8   233268.190 ±   975.616  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt           heap        1024          128              thrpt    8  2562063.056 ± 18200.538  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt           heap        1500          128              thrpt    8  1771049.922 ±  6444.924  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt           heap        4096          128              thrpt    8   934138.960 ±  5353.664  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt           heap       16384          128              thrpt    8   257884.039 ±   149.974  ops/s
> o.o.b.j.c.small.AESGCMByteBuffer.encrypt        direct        1024          128              thrpt    8  2214159.143 ± 16196.670  ops/s
> o.o.b.j.c.small.AESGCMByteBuffer.encrypt          heap        1024          128              thrpt    8  2578675.681 ± 22067.812  ops/s

Hi @theRealAph Can you help understand this result ? since op/s is increasing for ghash code , does it suggest a speedup ?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20235#issuecomment-2847036728

From aph at openjdk.org Fri May 2 12:07:51 2025
From: aph at openjdk.org (Andrew Haley)
Date: Fri, 2 May 2025 12:07:51 GMT
Subject: RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm [v31]
In-Reply-To:
References: <2cIptfLHrdxSy0t7RdsRlde94arK3gmqge9AiXmOZeo=.069a496c-e9dd-40cd-a144-306a65df0e1a@github.com>
 <68pnxMO83zNvBGORMQQgTyMXBf0m7b8nCwVaY2jfNnQ=.bcff79ab-0288-4517-ad62-e183f96e1c3f@github.com>
Message-ID:

On Mon, 28 Apr 2025 09:21:50 GMT, Andrew Haley wrote:

>> Please run AESGCMByteBuffer.encrypt and provide some before and after figures.
>
>> @theRealAph From my end, we had improvement of around 3 times after running TestAESMain. Is that not valid test suite ?
>>
>> If the improvement with this version is satisfactory , can we have this integrated and then pursue further improvements on it in separate PR ? Will open a JBS issue for the same
>
> You should run the JMH test, like so:
>
>
> fedora:theRealAph-jdk $ CONF=release make -k LOG=info build-microbenchmark CONF_CHECK=auto
> fedora:theRealAph-jdk $ ./build/linux-aarch64-server-release/jdk/bin/java -Djmh.ignoreLock=true -jar ./build/linux-aarch64-server-release/images/test/micro/benchmarks.jar -f 1 AESGCMByteBuffer.encrypt$
>
> Benchmark                                 (dataMethod)  (dataSize)  (keyLength)  (provider)   Mode  Cnt        Score       Error  Units
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt         direct        1024          128              thrpt    8  2216504.066 ± 12527.173  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt         direct        1500          128              thrpt    8  1505300.797 ±  8675.648  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt         direct        4096          128              thrpt    8   813518.431 ±  7513.509  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt         direct       16384          128              thrpt    8   233268.190 ±   975.616  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt           heap        1024          128              thrpt    8  2562063.056 ± 18200.538  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt           heap        1500          128              thrpt    8  1771049.922 ±  6444.924  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt           heap        4096          128              thrpt    8   934138.960 ±  5353.664  ops/s
> o.o.b.j.c.full.AESGCMByteBuffer.encrypt           heap       16384          128              thrpt    8   257884.039 ±   149.974  ops/s
> o.o.b.j.c.small.AESGCMByteBuffer.encrypt        direct        1024          128              thrpt    8  2214159.143 ± 16196.670  ops/s
> o.o.b.j.c.small.AESGCMByteBuffer.encrypt          heap        1024          128              thrpt    8  2578675.681 ± 22067.812  ops/s

> Hi @theRealAph Can you help understand this result ? since op/s is increasing for ghash code , does it suggest a speedup ?

Yes, an increase in ops/s is what we want.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20235#issuecomment-2847051951

From rehn at openjdk.org Fri May 2 12:14:50 2025
From: rehn at openjdk.org (Robbin Ehn)
Date: Fri, 2 May 2025 12:14:50 GMT
Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 27 Mar 2025 17:57:37 GMT, Hamlin Li wrote:

>> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>>
>> - Merge branch 'master' into qemu-user-issues
>> - more
>> - more
>> - native or very long
>
> I also feel annoying to see some tests fail interminently.
>
> Not sure if I understand the goal of this pr, seems it might not be the best solution to simply disable these tests when running with qemu.
My concerns are: qemu is still one of main methods to quickly verify the functionality changes, but when we just disable the failed tests, and maybe in the future disable more and more tests, then qemu is no longer able to play the role it was supposed to play. Thanks @Hamlin-Li and @RealFYang. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24229#issuecomment-2847064204 From sroy at openjdk.org Fri May 2 12:16:16 2025 From: sroy at openjdk.org (Suchismith Roy) Date: Fri, 2 May 2025 12:16:16 GMT Subject: RFR: JDK-8331859 : [PPC64] Remove support for Power7 and older [v6] In-Reply-To: References: Message-ID: On Wed, 30 Apr 2025 21:48:42 GMT, Martin Doerr wrote: >> Suchismith Roy has updated the pull request incrementally with two additional commits since the last revision: >> >> - trailing space >> - trailing space > > src/hotspot/cpu/ppc/assembler_ppc.inline.hpp line 703: > >> 701: inline void Assembler::ldarx_unchecked(Register d, Register a, Register b, int eh1) { emit_int32( LDARX_OPCODE | rt(d) | ra0mem(a) | rb(b) | eh(eh1)); } >> 702: inline void Assembler::lqarx_unchecked(Register d, Register a, Register b, int eh1) { emit_int32( LQARX_OPCODE | rt(d) | ra0mem(a) | rb(b) | eh(eh1)); } >> 703: inline bool Assembler::lxarx_hint_exclusive_access() { return true; } > > Should better be removed completely. Hi @TheRealMDoerr hint_exclusive_access flag should be set to true ? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20262#discussion_r2071525512 From mdoerr at openjdk.org Fri May 2 12:21:06 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 2 May 2025 12:21:06 GMT Subject: RFR: JDK-8331859 : [PPC64] Remove support for Power7 and older [v6] In-Reply-To: References: Message-ID: On Fri, 2 May 2025 12:13:10 GMT, Suchismith Roy wrote: >> src/hotspot/cpu/ppc/assembler_ppc.inline.hpp line 703: >> >>> 701: inline void Assembler::ldarx_unchecked(Register d, Register a, Register b, int eh1) { emit_int32( LDARX_OPCODE | rt(d) | ra0mem(a) | rb(b) | eh(eh1)); } >>> 702: inline void Assembler::lqarx_unchecked(Register d, Register a, Register b, int eh1) { emit_int32( LQARX_OPCODE | rt(d) | ra0mem(a) | rb(b) | eh(eh1)); } >>> 703: inline bool Assembler::lxarx_hint_exclusive_access() { return true; } >> >> Should better be removed completely. > > Hi @TheRealMDoerr hint_exclusive_access flag should be set to true ? All `&& lxarx_hint_exclusive_access()` usages should be removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20262#discussion_r2071531408 From sroy at openjdk.org Fri May 2 12:29:54 2025 From: sroy at openjdk.org (Suchismith Roy) Date: Fri, 2 May 2025 12:29:54 GMT Subject: RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm [v31] In-Reply-To: <68pnxMO83zNvBGORMQQgTyMXBf0m7b8nCwVaY2jfNnQ=.bcff79ab-0288-4517-ad62-e183f96e1c3f@github.com> References: <2cIptfLHrdxSy0t7RdsRlde94arK3gmqge9AiXmOZeo=.069a496c-e9dd-40cd-a144-306a65df0e1a@github.com> <68pnxMO83zNvBGORMQQgTyMXBf0m7b8nCwVaY2jfNnQ=.bcff79ab-0288-4517-ad62-e183f96e1c3f@github.com> Message-ID: On Thu, 24 Apr 2025 14:13:50 GMT, Suchismith Roy wrote: >> JBS Issue : [JDK-8216437](https://bugs.openjdk.org/browse/JDK-8216437) >> >> Currently acceleration code for GHASH is missing for PPC64. >> >> The current implementation utlilises SIMD instructions on Power and uses Karatsuba multiplication for obtaining the final result. 
> > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > masm Thank you everyone. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20235#issuecomment-2847089091 From duke at openjdk.org Fri May 2 12:29:55 2025 From: duke at openjdk.org (duke) Date: Fri, 2 May 2025 12:29:55 GMT Subject: RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm [v31] In-Reply-To: <68pnxMO83zNvBGORMQQgTyMXBf0m7b8nCwVaY2jfNnQ=.bcff79ab-0288-4517-ad62-e183f96e1c3f@github.com> References: <2cIptfLHrdxSy0t7RdsRlde94arK3gmqge9AiXmOZeo=.069a496c-e9dd-40cd-a144-306a65df0e1a@github.com> <68pnxMO83zNvBGORMQQgTyMXBf0m7b8nCwVaY2jfNnQ=.bcff79ab-0288-4517-ad62-e183f96e1c3f@github.com> Message-ID: On Thu, 24 Apr 2025 14:13:50 GMT, Suchismith Roy wrote: >> JBS Issue : [JDK-8216437](https://bugs.openjdk.org/browse/JDK-8216437) >> >> Currently acceleration code for GHASH is missing for PPC64. >> >> The current implementation utlilises SIMD instructions on Power and uses Karatsuba multiplication for obtaining the final result. > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > masm @suchismith1993 Your change (at version 423c8685dad48509afdeb46611585a26317bc130) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20235#issuecomment-2847089935 From mdoerr at openjdk.org Fri May 2 12:32:31 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 2 May 2025 12:32:31 GMT Subject: RFR: 8351666: [PPC64] Make non-volatile VectorRegisters available for C2 register allocation [v24] In-Reply-To: References: Message-ID: > This PR makes the non-volatile VectorRegisters available for C2's register allocation. > > I had to implement the VectorRegisters properly (4 VM Regs) like on other platforms. The old version has run into assertions and looked strange. 
> > The non-volatile VectorRegisters are now saved when entering Java: call_stub and upcall_stubs. > I have rewritten the save and restore functions and used them for both. Then, I have removed code which has become dead. I only save and restore them if C2 uses the vector instructions (controlled by `SuperwordUseVSX`). > I have moved the non-volatile spill area out of the entry_frame_locals because it has a variable size, now. > > The stack area for all non-volatile registers has become larger than the 288 Bytes which are allowed to be used below the SP (specified by the ABI). Therefore, I had to rewrite the call_stub sequence significantly. We need to push the new frame before saving the registers, now. > > Saving and restoring the FP registers is not needed in the slow signature handler which also uses the save and restore code for non-volatile registers. > > On Power10, we use vector pair instructions since Commit 8. E.g. in the call stub: > > 0x000072c9483c07b4: stxvp vs52,-224(r2) > 0x000072c9483c07b8: stxvp vs54,-192(r2) > 0x000072c9483c07bc: stxvp vs56,-160(r2) > 0x000072c9483c07c0: stxvp vs58,-128(r2) > 0x000072c9483c07c4: stxvp vs60,-96(r2) > 0x000072c9483c07c8: stxvp vs62,-64(r2) > > > > 0x000072c9483c0914: lxvp vs52,-224(r2) > 0x000072c9483c0918: lxvp vs54,-192(r2) > 0x000072c9483c091c: lxvp vs56,-160(r2) > 0x000072c9483c0920: lxvp vs58,-128(r2) > 0x000072c9483c0924: lxvp vs60,-96(r2) > 0x000072c9483c0928: lxvp vs62,-64(r2) Martin Doerr has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 35 additional commits since the last revision: - Improve readability of LXVX_OPCODE. - Merge remote-tracking branch 'origin' into 8351666_PPC64_nv_VRs - Add comment and clean up extra whitespaces. 
- Merge remote-tracking branch 'origin' into 8351666_PPC64_nv_VRs - Merge remote-tracking branch 'origin' into 8351666_PPC64_nv_VRs - Make order of MachRegisters consistent with ConcreteRegisterImpl and simplify rc_class. - Rewrite rc_class avoiding hard coded register numbers. - Add comment regarding 8-Byte aligned stack slots for VecX. - Add missing VSRs to RegisterSaver_LiveVSReg. - Merge remote-tracking branch 'origin' into 8351666_PPC64_nv_VRs - ... and 25 more: https://git.openjdk.org/jdk/compare/880e23ed...e3cc8bcd ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23987/files - new: https://git.openjdk.org/jdk/pull/23987/files/7155b82c..e3cc8bcd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23987&range=23 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23987&range=22-23 Stats: 13908 lines in 401 files changed: 10622 ins; 1966 del; 1320 mod Patch: https://git.openjdk.org/jdk/pull/23987.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23987/head:pull/23987 PR: https://git.openjdk.org/jdk/pull/23987 From sroy at openjdk.org Fri May 2 12:34:09 2025 From: sroy at openjdk.org (Suchismith Roy) Date: Fri, 2 May 2025 12:34:09 GMT Subject: Integrated: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm In-Reply-To: <2cIptfLHrdxSy0t7RdsRlde94arK3gmqge9AiXmOZeo=.069a496c-e9dd-40cd-a144-306a65df0e1a@github.com> References: <2cIptfLHrdxSy0t7RdsRlde94arK3gmqge9AiXmOZeo=.069a496c-e9dd-40cd-a144-306a65df0e1a@github.com> Message-ID: On Thu, 18 Jul 2024 14:31:57 GMT, Suchismith Roy wrote: > JBS Issue : [JDK-8216437](https://bugs.openjdk.org/browse/JDK-8216437) > > Currently acceleration code for GHASH is missing for PPC64. > > The current implementation utlilises SIMD instructions on Power and uses Karatsuba multiplication for obtaining the final result. This pull request has now been integrated. 
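The PR description above mentions Karatsuba multiplication without showing it. As a rough illustration only, the following sketch applies the Karatsuba trick to a carry-less (GF(2) polynomial) 64x64-bit multiply, the kind of product GHASH is built on. It is not the PPC64 stub: the real intrinsic works on 128-bit operands with vector instructions and also performs the GHASH field reduction, which is omitted here. All names (`clmul32`, `clmul64_schoolbook`, `clmul64_karatsuba`, `U128`, `karatsuba_matches`) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// 128-bit carry-less product, split into two 64-bit words.
struct U128 {
  uint64_t hi;
  uint64_t lo;
};

// Carry-less 32x32 -> 64 multiply: XOR of shifted copies instead of a sum.
uint64_t clmul32(uint32_t a, uint32_t b) {
  uint64_t r = 0;
  for (int i = 0; i < 32; i++) {
    if ((b >> i) & 1u) r ^= static_cast<uint64_t>(a) << i;
  }
  return r;
}

// Reference: bit-by-bit carry-less 64x64 -> 128 multiply.
U128 clmul64_schoolbook(uint64_t a, uint64_t b) {
  U128 r{0, 0};
  for (int i = 0; i < 64; i++) {
    if ((b >> i) & 1u) {
      r.lo ^= a << i;
      if (i != 0) r.hi ^= a >> (64 - i);
    }
  }
  return r;
}

// Karatsuba: three half-size multiplies instead of four. In GF(2), addition
// is XOR, so (a0^a1)*(b0^b1) ^ a0*b0 ^ a1*b1 equals a0*b1 ^ a1*b0.
U128 clmul64_karatsuba(uint64_t a, uint64_t b) {
  uint32_t a0 = static_cast<uint32_t>(a), a1 = static_cast<uint32_t>(a >> 32);
  uint32_t b0 = static_cast<uint32_t>(b), b1 = static_cast<uint32_t>(b >> 32);
  uint64_t lo  = clmul32(a0, b0);
  uint64_t hi  = clmul32(a1, b1);
  uint64_t mid = clmul32(a0 ^ a1, b0 ^ b1) ^ lo ^ hi;
  // result = hi * x^64  ^  mid * x^32  ^  lo
  return U128{hi ^ (mid >> 32), lo ^ (mid << 32)};
}

// True when both methods agree on the given operands.
bool karatsuba_matches(uint64_t a, uint64_t b) {
  U128 s = clmul64_schoolbook(a, b);
  U128 k = clmul64_karatsuba(a, b);
  return s.hi == k.hi && s.lo == k.lo;
}
```

Comparing against the schoolbook version is a cheap sanity check that the middle term has been folded in at the right bit offset.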
Changeset: cdad6d78 Author: Suchismith Roy Committer: Martin Doerr URL: https://git.openjdk.org/jdk/commit/cdad6d788de4785c8dbf2710a86fdacb8d070565 Stats: 183 lines in 2 files changed: 181 ins; 0 del; 2 mod 8216437: PPC64: Add intrinsic for GHASH algorithm Reviewed-by: mdoerr, amitkumar ------------- PR: https://git.openjdk.org/jdk/pull/20235 From sroy at openjdk.org Fri May 2 12:36:33 2025 From: sroy at openjdk.org (Suchismith Roy) Date: Fri, 2 May 2025 12:36:33 GMT Subject: RFR: JDK-8331859 : [PPC64] Remove support for Power7 and older [v7] In-Reply-To: References: Message-ID: > JBS Issue: [JDK-8331859](https://bugs.openjdk.org/browse/JDK-8331859) > Linux PPC64le requires Power8 since the beginning. > AIX requires Power8 with the new OpenXL based build ([JDK-8307520](https://bugs.openjdk.org/browse/JDK-8307520)). The old build has been removed in JDK 23 ([JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701)). > Linux PPC64 Big Endian is no longer officially supported (only kept alive for development, debugging and testing purposes). > > The following checks for old processors are no longer needed: > 8: VM_Version::has_lqarx() > 7: VM_Version::has_popcntw() > 6: VM_Version::has_cmpb() > 5: VM_Version::has_popcntb() > These ones and some more checks for old instructions are no longer needed. All code which is no longer reachable when removing them should also get removed. > Checks like "PowerArchitecturePPC64 >= 8" (or older) can be removed. > > Atomic::PlatformCmpxchg<1>::operator() can be simplified by using sub-word instructions (lharx, lbarx). > > Temp registers can be removed from cmpxchgb and cmpxchgh. > > Build flags "-mcpu=powerpc64 -mtune=power5" for Big Endian linux should get replaced by "-mcpu=power8 -mtune=power8" as already used for linux PPC64le. 
Suchismith Roy has updated the pull request incrementally with two additional commits since the last revision: - change assembler files - review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20262/files - new: https://git.openjdk.org/jdk/pull/20262/files/a95958a7..297394ab Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20262&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20262&range=05-06 Stats: 16 lines in 4 files changed: 1 ins; 4 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/20262.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20262/head:pull/20262 PR: https://git.openjdk.org/jdk/pull/20262 From jbechberger at openjdk.org Fri May 2 12:37:08 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Fri, 2 May 2025 12:37:08 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: Message-ID: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> > This is the code for the [JEP draft: CPU Time based profiling for JFR]. > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). > > A version based on the cooperative sampling JEP can be found [here](https://github.com/parttimenerd/jdk/tree/parttimenerd_cooperative_cpu_time_sampler). 
Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Remove assertions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20752/files - new: https://git.openjdk.org/jdk/pull/20752/files/9ffe6232..eb3ab54e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20752&range=44 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20752&range=43-44 Stats: 4 lines in 4 files changed: 0 ins; 3 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/20752.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20752/head:pull/20752 PR: https://git.openjdk.org/jdk/pull/20752 From sroy at openjdk.org Fri May 2 12:40:13 2025 From: sroy at openjdk.org (Suchismith Roy) Date: Fri, 2 May 2025 12:40:13 GMT Subject: RFR: JDK-8331859 : [PPC64] Remove support for Power7 and older [v4] In-Reply-To: References: <8GpiUAAXg5g66PsOlWGGPlVBcwhDJgoPZuj9hpIbXV8=.c67acdcd-857b-48ab-b8cb-e7a37d34095c@github.com> Message-ID: <9I62dT52E6tYKFxoejvb5Ydg6Q1oU1Y2G2JMXBN-prg=.1c85cab1-37fc-4f98-a552-a4fbb41491ac@github.com> On Tue, 29 Apr 2025 14:50:52 GMT, Suchismith Roy wrote: >> Suchismith Roy has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 27 commits: >> >> - Merge branch 'openjdk:master' into power8 >> - mfdscr removal >> - indents >> - indents >> - mcpu flag >> - superword >> - further cleanup >> - clean of power 7 instructions >> - Removal of older P7 Instructions >> - Removal of older P7 Instructions >> - ... and 17 more: https://git.openjdk.org/jdk/compare/1138a186...4aa520c2 > > src/hotspot/cpu/ppc/ppc.ad line 10371: > >> 10369: ins_cost(DEFAULT_COST); >> 10370: >> 10371: expand %{ > > HI @TheRealMDoerr > I tried removing moveD2L_reg_stack as it is the only usage in ad file. > But the build fails . Any exception to this rule ? Hi @TheRealMDoerr could you explain this ? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20262#discussion_r2071552158 From duke at openjdk.org Fri May 2 12:46:50 2025 From: duke at openjdk.org (David Linus Briemann) Date: Fri, 2 May 2025 12:46:50 GMT Subject: RFR: 8351666: [PPC64] Make non-volatile VectorRegisters available for C2 register allocation [v24] In-Reply-To: References: Message-ID: On Fri, 2 May 2025 12:32:31 GMT, Martin Doerr wrote: >> This PR makes the non-volatile VectorRegisters available for C2's register allocation. >> >> I had to implement the VectorRegisters properly (4 VM Regs) like on other platforms. The old version has run into assertions and looked strange. >> >> The non-volatile VectorRegisters are now saved when entering Java: call_stub and upcall_stubs. >> I have rewritten the save and restore functions and used them for both. Then, I have removed code which has become dead. I only save and restore them if C2 uses the vector instructions (controlled by `SuperwordUseVSX`). >> I have moved the non-volatile spill area out of the entry_frame_locals because it has a variable size, now. >> >> The stack area for all non-volatile registers has become larger than the 288 Bytes which are allowed to be used below the SP (specified by the ABI). Therefore, I had to rewrite the call_stub sequence significantly. We need to push the new frame before saving the registers, now. >> >> Saving and restoring the FP registers is not needed in the slow signature handler which also uses the save and restore code for non-volatile registers. >> >> On Power10, we use vector pair instructions since Commit 8. E.g. 
in the call stub: >> >> 0x000072c9483c07b4: stxvp vs52,-224(r2) >> 0x000072c9483c07b8: stxvp vs54,-192(r2) >> 0x000072c9483c07bc: stxvp vs56,-160(r2) >> 0x000072c9483c07c0: stxvp vs58,-128(r2) >> 0x000072c9483c07c4: stxvp vs60,-96(r2) >> 0x000072c9483c07c8: stxvp vs62,-64(r2) >> >> >> >> 0x000072c9483c0914: lxvp vs52,-224(r2) >> 0x000072c9483c0918: lxvp vs54,-192(r2) >> 0x000072c9483c091c: lxvp vs56,-160(r2) >> 0x000072c9483c0920: lxvp vs58,-128(r2) >> 0x000072c9483c0924: lxvp vs60,-96(r2) >> 0x000072c9483c0928: lxvp vs62,-64(r2) > > Martin Doerr has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 35 additional commits since the last revision: > > - Improve readability of LXVX_OPCODE. > - Merge remote-tracking branch 'origin' into 8351666_PPC64_nv_VRs > - Add comment and clean up extra whitespaces. > - Merge remote-tracking branch 'origin' into 8351666_PPC64_nv_VRs > - Merge remote-tracking branch 'origin' into 8351666_PPC64_nv_VRs > - Make order of MachRegisters consistent with ConcreteRegisterImpl and simplify rc_class. > - Rewrite rc_class avoiding hard coded register numbers. > - Add comment regarding 8-Byte aligned stack slots for VecX. > - Add missing VSRs to RegisterSaver_LiveVSReg. > - Merge remote-tracking branch 'origin' into 8351666_PPC64_nv_VRs > - ... and 25 more: https://git.openjdk.org/jdk/compare/827689bc...e3cc8bcd I reviewed and verified the opcodes and the names of registers (checked for typos etc). The more involved parts I leave to the experienced reviewers. LGTM ------------- Marked as reviewed by dbriemann at github.com (no known OpenJDK username). 
PR Review: https://git.openjdk.org/jdk/pull/23987#pullrequestreview-2811924251 From kvn at openjdk.org Fri May 2 15:33:46 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 2 May 2025 15:33:46 GMT Subject: RFR: 8354284: Add more compiler test folders to tier1 runs [v2] In-Reply-To: References: Message-ID: On Fri, 2 May 2025 08:43:23 GMT, Marc Chevalier wrote: >> Some folders in jtreg/compiler have been reported not to be run in any tier, while tier1 was probably intended, but the tier definition was mistakenly not updated. I've checked which folders are not referenced into `TEST.groups`. >> >> The unmentioned ones: >> - `ccp` >> - `ciReplay` >> - `ciTypeFlow` >> - `compilercontrol` >> - `debug` >> - `oracle` >> - `predicates` >> - `print` >> - `relocations` >> - `sharedstubs` >> - `splitif` >> - `tiered` >> - `whitebox` >> >> And those, that are not test folders: >> - `lib` >> - `patches` >> - `testlibraries` >> >> I'm adding `ccp`, `ciTypeFlow`, `predicates`, `sharedstubs` and `splitif` to tier1. >> >> The other folders seems to have been around for very long (since at least mid-2021). It's not clear how meaningful it'd be to add them/what the intent from them was. I've rather focused on the recently(-ish) added folders, that one forgot to put in a tier when adding it. >> >> Feel free to tell if other folders should be included (and in which tier). >> >> Thanks, >> Marc > > Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision: > > speed up slowest test Good. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24817#pullrequestreview-2812307636 From mdoerr at openjdk.org Fri May 2 15:56:09 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 2 May 2025 15:56:09 GMT Subject: RFR: JDK-8331859 : [PPC64] Remove support for Power7 and older [v4] In-Reply-To: <9I62dT52E6tYKFxoejvb5Ydg6Q1oU1Y2G2JMXBN-prg=.1c85cab1-37fc-4f98-a552-a4fbb41491ac@github.com> References: <8GpiUAAXg5g66PsOlWGGPlVBcwhDJgoPZuj9hpIbXV8=.c67acdcd-857b-48ab-b8cb-e7a37d34095c@github.com> <9I62dT52E6tYKFxoejvb5Ydg6Q1oU1Y2G2JMXBN-prg=.1c85cab1-37fc-4f98-a552-a4fbb41491ac@github.com> Message-ID: On Fri, 2 May 2025 12:36:56 GMT, Suchismith Roy wrote: >> src/hotspot/cpu/ppc/ppc.ad line 10371: >> >>> 10369: ins_cost(DEFAULT_COST); >>> 10370: >>> 10371: expand %{ >> >> HI @TheRealMDoerr >> I tried removing moveD2L_reg_stack as it is the only usage in ad file. >> But the build fails . Any exception to this rule ? > > Hi @TheRealMDoerr could you explain this ? `moveD2L_reg_stack` is the only instruct which matches `Set dst (MoveD2L src)` where `dst` is a `stackSlotL` and `src` is a `regD`. If you remove it, C2 can't match it any more. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20262#discussion_r2071818806 From zgu at openjdk.org Fri May 2 16:07:00 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Fri, 2 May 2025 16:07:00 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> Message-ID: On Fri, 2 May 2025 12:37:08 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP draft: CPU Time based profiling for JFR]. >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). 
>>
>> A version based on the cooperative sampling JEP can be found [here](https://github.com/parttimenerd/jdk/tree/parttimenerd_cooperative_cpu_time_sampler).
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove assertions

Changes requested by zgu (Reviewer).

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 66:

> 64: }
> 65: } while (Atomic::cmpxchg(&_head, elementIndex, elementIndex + 1) != elementIndex);
> 66: _data[elementIndex] = element;

I think you need at least a `release_store` here. But I am questioning the correctness of the implementation. Consider the following scenario:

T1: enqueue(): after its CAS completes successfully, e.g. `_head = 3, elementIndex = 2`
T2: dequeue(): after its CAS completes successfully, `_head = 2, elementIndex = 3`
T2: reads `_data[--elementIndex]` // _data[2] has not been set yet
T1: writes `_data[elementIndex] = element` // sets the value of _data[2]

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 75:

> 73: elementIndex = Atomic::load_acquire(&_head);
> 74: if (elementIndex == 0) {
> 75: return NULL;

Use `nullptr` instead.

src/hotspot/share/runtime/handshake.hpp line 133:

> 131:
> 132: bool can_run();
> 133: bool can_run(bool allow_suspend, bool check_async_exception);

You may want to inline these three methods; it looks like they can be on hot paths.
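The enqueue/dequeue interleaving described in the review of jfrCPUTimeThreadSampler.hpp above can be replayed deterministically on a toy queue. This is a sketch under assumptions: `ToyQueue` only imitates the "CAS on the head index, then touch the slot" shape of the quoted lines; it is not the JFR code, and `reserve`, `publish`, `dequeue` and `replay_interleaving` are hypothetical names. Enqueue is split into two steps so that the pre-emption point between the CAS and the slot write can be simulated on a single thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the queue under review.
struct ToyQueue {
  static constexpr uint32_t kCapacity = 8;
  std::atomic<uint32_t> head{0};
  int data[kCapacity] = {};  // 0 means "slot not written yet"

  // Step 1 of enqueue: reserve a slot by advancing head with a CAS.
  bool reserve(uint32_t& index) {
    uint32_t cur = head.load(std::memory_order_acquire);
    do {
      if (cur >= kCapacity) return false;
    } while (!head.compare_exchange_weak(cur, cur + 1));
    index = cur;
    return true;
  }

  // Step 2 of enqueue: write the element into the reserved slot.
  void publish(uint32_t index, int element) { data[index] = element; }

  // Dequeue: lower head with a CAS, then read the slot that was given up.
  bool dequeue(int& out) {
    uint32_t cur = head.load(std::memory_order_acquire);
    do {
      if (cur == 0) return false;
    } while (!head.compare_exchange_weak(cur, cur - 1));
    out = data[cur - 1];
    return true;
  }
};

// Replays the T1/T2 schedule from the review and returns what T2 observed.
int replay_interleaving() {
  ToyQueue q;
  uint32_t idx = 0;
  // Two elements are fully enqueued first, so head == 2.
  q.reserve(idx); q.publish(idx, 11);
  q.reserve(idx); q.publish(idx, 22);

  // T1: the CAS succeeds (head 2 -> 3, slot 2 reserved), then T1 is paused.
  uint32_t t1_index = 0;
  q.reserve(t1_index);

  // T2: dequeue succeeds (head 3 -> 2) and reads slot 2 before T1 wrote it.
  int seen = -1;
  q.dequeue(seen);

  // T1 resumes and writes into a slot that the queue no longer covers.
  q.publish(t1_index, 33);
  return seen;
}
```

In this replay T2 gets the unwritten placeholder instead of T1's element, and T1's element ends up above `head`, where the next enqueue would overwrite it. A stronger store on the slot alone does not close that window, which matches the concern raised in the review.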
------------- PR Review: https://git.openjdk.org/jdk/pull/20752#pullrequestreview-2812012324 PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2071810635 PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2071793767 PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2071612855 From asmehra at openjdk.org Fri May 2 16:30:50 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Fri, 2 May 2025 16:30:50 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v2] In-Reply-To: <40AnyQm_eXzMeoVC5lmbs1CaVYkMJwOdfsDxgx7S5t0=.5a73af04-4b5a-4a8d-a4b8-166cfd912977@github.com> References: <40AnyQm_eXzMeoVC5lmbs1CaVYkMJwOdfsDxgx7S5t0=.5a73af04-4b5a-4a8d-a4b8-166cfd912977@github.com> Message-ID: On Thu, 1 May 2025 19:19:32 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. 
This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [ ] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Test TestDuplicatedLateInliningOutput.java While you are refactoring this, if you could also add a title to the `PrintCompilation` output, like the following, it would be immensely helpful, as it would avoid the need to dig through the code to understand what these numbers stand for. t=elapsed time since JVM start W=time spent waiting to be put on the compilation queue Q=time spent in the compilation queue C=time taken to compile Id=Compilation Id Lvl=Compilation level t W Q C Id Lvl 344 0 1351 3 java.util.Arrays::hashCode (15 bytes) started 345 0 1350 3 java.util.Objects::hash (5 bytes) started 345 0 1352 3 jdk.internal.util.ArraysSupport::hashCode (42 bytes) started 345 0 37 79 1349 3 java.lang.Byte:: (10 bytes) 345 0 1353 3 java.lang.Byte::hashCode (8 bytes) started 345 0 77 92 1351 3 java.util.Arrays::hashCode (15 bytes) 345 0 1354 3 java.lang.Byte::hashCode (2 bytes) started 345 0 212 66 1353 3 java.lang.Byte::hashCode (8 bytes) 345 0 276 38 1354 3 java.lang.Byte::hashCode (2 bytes) 345 0 139 167 1352 3 jdk.internal.util.ArraysSupport::hashCode (42 bytes) 345 0 79 354 1348 3 java.lang.invoke.MemberName::hashCode (43 bytes) 345 0 123 107 1350 3 java.util.Objects::hash (5 bytes) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2847631825 From lucy at openjdk.org Fri May 2 16:36:50 2025 From: lucy at openjdk.org (Lutz Schmidt) Date: Fri, 2 May 2025 16:36:50 GMT Subject: RFR: 8350182: [s390x] Relativize locals in interpreter frames [v3] In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 12:43:26 GMT, Amit
Kumar wrote: >> Port for [JDK-8299795](https://bugs.openjdk.org/browse/JDK-8299795) Relativize Z_locals in interpreter frame for s390x. >> >> Tier1 test with fastdebug vm are clean. > > Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: > > use Z_R0 as temp Changes look good to me. ------------- Marked as reviewed by lucy (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23660#pullrequestreview-2812444832 From duke at openjdk.org Fri May 2 19:59:11 2025 From: duke at openjdk.org (duke) Date: Fri, 2 May 2025 19:59:11 GMT Subject: Withdrawn: 8337217: Port VirtualMemoryTracker to use VMATree In-Reply-To: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> References: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> Message-ID: On Thu, 1 Aug 2024 15:44:32 GMT, Afshin Zafari wrote: > - `VMATree` is used instead of `SortedLinkList` in new class `VirtualMemoryTracker`. > - A wrapper/helper `RegionTree` is made around VMATree to make some calls easier. > - `find_reserved_region()` is used in 4 cases, it will be removed in further PRs. > - All tier1 tests pass except this https://bugs.openjdk.org/browse/JDK-8335167. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/20425 From sspitsyn at openjdk.org Fri May 2 20:53:30 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 2 May 2025 20:53:30 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v6] In-Reply-To: References: Message-ID: > This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. > > Testing: Ran mach5 tiers 1-6. 
Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: patch from Patricio with alternate approach ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24269/files - new: https://git.openjdk.org/jdk/pull/24269/files/f2c4a136..168c1252 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=04-05 Stats: 253 lines in 9 files changed: 84 ins; 104 del; 65 mod Patch: https://git.openjdk.org/jdk/pull/24269.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24269/head:pull/24269 PR: https://git.openjdk.org/jdk/pull/24269 From duke at openjdk.org Fri May 2 22:31:00 2025 From: duke at openjdk.org (Mohamed Issa) Date: Fri, 2 May 2025 22:31:00 GMT Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v2] In-Reply-To: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com> References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com> Message-ID: > The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. > > The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b15](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B15) as the baseline version. > > For performance data collected with the built-in **cbrt** micro-benchmark, see the table below. Each result is the mean of 8 individual runs. Overall, the intrinsic provides a performance uplift of 41%.
> > | Benchmark | Throughput with baseline (op/s) | Throughput with intrinsic (op/s) | Speedup | > | :----------------: | :----------------------------------: | :----------------------------------: | :---------: | > | MathBench.cbrt | 148242 | 209122 | 1.41x | > > Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes. Mohamed Issa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: - Merge branch 'openjdk:master' into user/missa-prime/cbrt - Change coeff_table alignment from 4 bytes to 16 bytes to conform with movapd instruction - Merge branch 'master' into user/missa-prime/cbrt - x86_64 intrinsic for cbrt using libm ------------- Changes: https://git.openjdk.org/jdk/pull/24470/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24470&range=01 Stats: 466 lines in 26 files changed: 453 ins; 1 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/24470.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24470/head:pull/24470 PR: https://git.openjdk.org/jdk/pull/24470 From iveresov at openjdk.org Sat May 3 01:13:41 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sat, 3 May 2025 01:13:41 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v11] In-Reply-To: References: Message-ID: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. 
> > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request incrementally with two additional commits since the last revision: - Fix additional issues - Make sure command line flags that affect MDO layout are consistent ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24886/files - new: https://git.openjdk.org/jdk/pull/24886/files/014b0ec5..9676039c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=09-10 Stats: 54 lines in 3 files changed: 52 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From zgu at openjdk.org Sat May 3 02:00:57 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Sat, 3 May 2025 02:00:57 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2] In-Reply-To: References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> Message-ID: <8wZcqTFOczjkZaHuWC4HLMb1vzTcUsiflcbmejfWnWA=.f7889541-ca80-4ee4-acf0-90aa39bb82f4@github.com> On Fri, 2 May 2025 11:38:31 GMT, Coleen Phillimore wrote: >> I guess we rely on memory ordering by address dependency on the reader's side. > > Yes we only used ThreadCritical to add to the list after the VM becomes multithreaded. The list traversals (reads) are for hs_err_pid file printing, which presumably is single threaded at that point and presumably nothing is adding to the list. I added Atomic::load() but I think it's not necessary to be a load_acquire. > There are only about 10 items max on this list so far, so I think performance isn't a concern either way. There is also a jcmd to print out events. In theory, that could be a problem if CAS is unordered. 
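The ordering question discussed here can be sketched with standard C++ atomics. This is a hedged illustration, not the EventLog code: `LogNode` and `LogList` are hypothetical names, and `std::atomic` stands in for HotSpot's `Atomic`. The writer publishes a node with a release CAS and the reader starts its traversal with an acquire load, which is the conservative pairing; whether a weaker load is sufficient on the reader side is exactly the point under review.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical list node; the real list holds event logs.
struct LogNode {
  const char* name;
  LogNode* next;
};

class LogList {
  std::atomic<LogNode*> _head{nullptr};

 public:
  // Lock-free prepend: link the node to the current head, then try to
  // install it. On failure old_head is refreshed and the loop relinks.
  void push(LogNode* node) {
    LogNode* old_head = _head.load(std::memory_order_relaxed);
    do {
      node->next = old_head;
    } while (!_head.compare_exchange_weak(old_head, node,
                                          std::memory_order_release,
                                          std::memory_order_relaxed));
  }

  // Reader side: the acquire load pairs with the release CAS, so the
  // node fields written before publication are visible during traversal.
  int count() const {
    int n = 0;
    for (const LogNode* p = _head.load(std::memory_order_acquire);
         p != nullptr; p = p->next) {
      ++n;
    }
    return n;
  }

  const LogNode* first() const {
    return _head.load(std::memory_order_acquire);
  }
};
```

With only about ten entries on the list, as noted in the thread, the cost difference between a relaxed and an acquire load on the reader side is unlikely to matter, so the conservative pairing is a reasonable default for a sketch like this.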
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2072281836 From iveresov at openjdk.org Sat May 3 05:25:35 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sat, 3 May 2025 05:25:35 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v12] In-Reply-To: References: Message-ID: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. > > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: Fix compile ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24886/files - new: https://git.openjdk.org/jdk/pull/24886/files/9676039c..2441ad71 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=10-11 Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From qamai at openjdk.org Sat May 3 08:17:50 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Sat, 3 May 2025 08:17:50 GMT Subject: RFR: 8354954: Typed static memory for late initialization of static class members in Hotspot [v11] In-Reply-To: References: Message-ID: <7ZB3DJhP4ez4OQh8Mklzx_wNGLKEy6UjbKOs3ZPoP7g=.0894cef3-dc74-490e-bc8a-14c252bb56e7@github.com> On Tue, 29 Apr 2025 08:49:35 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a `StableValue` which is sized and aligned identically to a `T`, with the difference that a `StableValue` needs to be explicitly instantiated.
>> >> Dynamic static initialization in C++ leads to unpredictable bugs as there is no defined order in which objects will be initialized, to the degree that 'static initialization fiasco' is a term used. In the code I've worked on in Hotspot we resolve this by having an initialization function, and instead of having static members of `T` we have `T*` instead and use `malloc` in order to gain the memory for the objects. This is workable, but is unnecessary. >> >> That's why I'd like to have `StableValue`. It lets you avoid the whole `malloc` thing, and we overload `->` to make it behave as if it is actually a `T`. We add in a simple checker in debug mode that checks whether the memory has been initialized before using it. >> >> In the code I've switched two members to be of `StableValue` instead. One is the malloc case above, the second (MemBaseline) is one where I got a bug while developing. The bug occurred because I changed the initializer of `MemBaseline` without knowing that it was dynamic-static-allocated, and the exact change I made caused weird crashes (because of initialization order issues). >> >> This solution is quite practical to me, but I wanted to know what others think. > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Delete overzealous assert Marked as reviewed by qamai (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24689#pullrequestreview-2813184841 From amitkumar at openjdk.org Mon May 5 04:06:56 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 5 May 2025 04:06:56 GMT Subject: RFR: 8350182: [s390x] Relativize locals in interpreter frames [v3] In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 12:43:26 GMT, Amit Kumar wrote: >> Port for [JDK-8299795](https://bugs.openjdk.org/browse/JDK-8299795) Relativize Z_locals in interpreter frame for s390x. >> >> Tier1 test with fastdebug vm are clean.
> > Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: > > use Z_R0 as temp Thanks for the reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23660#issuecomment-2849852386 From amitkumar at openjdk.org Mon May 5 04:06:57 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 5 May 2025 04:06:57 GMT Subject: Integrated: 8350182: [s390x] Relativize locals in interpreter frames In-Reply-To: References: Message-ID: On Mon, 17 Feb 2025 09:53:37 GMT, Amit Kumar wrote: > Port for [JDK-8299795](https://bugs.openjdk.org/browse/JDK-8299795) Relativize Z_locals in interpreter frame for s390x. > > Tier1 test with fastdebug vm are clean. This pull request has now been integrated. Changeset: 5b3ae921 Author: Amit Kumar URL: https://git.openjdk.org/jdk/commit/5b3ae9210564c16b4d350dabd0445248cb205698 Stats: 27 lines in 5 files changed: 20 ins; 0 del; 7 mod 8350182: [s390x] Relativize locals in interpreter frames Reviewed-by: lucy, rrich ------------- PR: https://git.openjdk.org/jdk/pull/23660 From amitkumar at openjdk.org Mon May 5 04:09:46 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 5 May 2025 04:09:46 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v15] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: On Wed, 30 Apr 2025 17:02:12 GMT, Markus Grönlund wrote: >> Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Configuration and test for jdk.SafepointLatency event > > The issue is that the CPU context can be retrieved here after the safepoint poll has been tested. That is causing a race, because a sample would be taken for an fp that is about to pop, breaking the invariant of the sampling mechanism.
> > It is only for some sensitive interpreter positions that we need to inspect the correct fp (the sender's fp), to avoid this race. > > On x64, we signal that by preemptively moving rbp, first to update the CPU context and then by explicitly setting the sender_java_fp field in the LJF. > > With your suggestion, we would always prioritize the sender fp (because it is always available), which is unnecessary and incorrect (biased), except for where we are about to pop an interpreter frame (but we can't decide when that is the case). > > For testing, you will need to run some longer stress tests to see the effect of a racy sampling attempt. > > To provoke taking more samples, you can decrease the sampling interval of JFR by setting the following in default.jfc and / or profile.jfc: > > `diff --git a/src/jdk.jfr/share/conf/jfr/profile.jfc b/src/jdk.jfr/share/conf/jfr/profile.jfc > index 4c9f4b4f8ec..75f8d75c580 100644 > --- a/src/jdk.jfr/share/conf/jfr/profile.jfc > +++ b/src/jdk.jfr/share/conf/jfr/profile.jfc > @@ -198,12 +198,12 @@ > > > true > - 10 ms > + 1 ms > > > > true > - 20 ms > + 1 ms > > > ` > > Try running some longer stress test or benchmark, passing: > > `-XX:StartFlightRecording:settings=profile.jfc` Hi @mgronlun , Is it possible to get head stream changes in this PR, if there is no objection from other architecture? It would be good to have changes from https://github.com/openjdk/jdk/pull/23660 to implement the build-fix for s390x. 
Thanks ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2849857901 From amitkumar at openjdk.org Mon May 5 04:14:47 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 5 May 2025 04:14:47 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v15] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: <2qvWWnC2j2hZ4DuQAXedEor_hAzbqE9NMK4XpNtDrHc=.969bb3cc-c8d6-40af-8249-64b2c71c1011@github.com> On Tue, 29 Apr 2025 16:47:17 GMT, Markus Grönlund wrote: >> Greetings, >> >> This is the implementation of JEP [JDK-8350338 Cooperative JFR Sampling](https://bugs.openjdk.org/browse/JDK-8350338). >> >> Implementations in this change set are provided and have been tested on the following platforms: >> >> - windows-x64 >> - windows-x64-debug >> - linux-x64 >> - linux-x64-debug >> - macosx-x64 >> - macosx-x64-debug >> - linux-aarch64 >> - linux-aarch64-debug >> - macosx-aarch64 >> - macosx-aarch64-debug >> >> Testing: tier1-6, jdk_jfr, stress testing. >> >> Platform porters note: >> Some platform-specific code needs to be provided, mainly in the interpreter. Take a look at the following files for changes: >> >> - src/hotspot/cpu/x86/frame_x86.cpp >> - src/hotspot/cpu/x86/interp_masm_x86.cpp >> - src/hotspot/cpu/x86/interp_masm_x86.hpp >> - src/hotspot/cpu/x86/javaFrameAnchor_x86.hpp >> - src/hotspot/cpu/x86/macroAssembler_x86.cpp >> - src/hotspot/cpu/x86/macroAssembler_x86.hpp >> - src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp >> - src/hotspot/cpu/x86/templateTable_x86.cpp >> - src/hotspot/os_cpu/linux_x86/javaThread_linux_x86.hpp >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > Configuration and test for jdk.SafepointLatency event I think I will just do the merge locally and create the Patch.
Once all other relativization PR merges, I will pass the conflict-free patch to you. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2849864644 From iklam at openjdk.org Mon May 5 04:58:23 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 5 May 2025 04:58:23 GMT Subject: RFR: 8356125: Interned strings are omitted from AOT cache Message-ID: When dumping the interned string table in the AOT cache, we try to include only the strings that are inside ConstantPool::reference_array(). The hope is to limit the size of the AOT cache by omitting interned strings that are not used by objects inside the AOT cache. However, we have found two cases when the above scheme doesn't work. Please see the new test cases. The fix is to always include all interned strings managed by stringTable.cpp. We might try to omit the truly unused strings in a separate RFE. ------------- Commit messages: - fixed whitespaces - Fixed obsolete comment - Do not change the order of FinalImageRecipes::apply_recipe yet .. 
fix this in a separate bug - Step 2: archive all strings in StringTable - Step 1: Fixing NonFinalStaticWithInitVal_Helper Changes: https://git.openjdk.org/jdk/pull/25026/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25026&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356125 Stats: 330 lines in 12 files changed: 210 ins; 88 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/25026.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25026/head:pull/25026 PR: https://git.openjdk.org/jdk/pull/25026 From chagedorn at openjdk.org Mon May 5 06:15:48 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Mon, 5 May 2025 06:15:48 GMT Subject: RFR: 8354284: Add more compiler test folders to tier1 runs [v2] In-Reply-To: References: Message-ID: On Fri, 2 May 2025 08:43:23 GMT, Marc Chevalier wrote: >> Some folders in jtreg/compiler have been reported not to be run in any tier, while tier1 was probably intended, but the tier definition was mistakenly not updated. I've checked which folders are not referenced into `TEST.groups`. >> >> The unmentioned ones: >> - `ccp` >> - `ciReplay` >> - `ciTypeFlow` >> - `compilercontrol` >> - `debug` >> - `oracle` >> - `predicates` >> - `print` >> - `relocations` >> - `sharedstubs` >> - `splitif` >> - `tiered` >> - `whitebox` >> >> And those, that are not test folders: >> - `lib` >> - `patches` >> - `testlibraries` >> >> I'm adding `ccp`, `ciTypeFlow`, `predicates`, `sharedstubs` and `splitif` to tier1. >> >> The other folders seems to have been around for very long (since at least mid-2021). It's not clear how meaningful it'd be to add them/what the intent from them was. I've rather focused on the recently(-ish) added folders, that one forgot to put in a tier when adding it. >> >> Feel free to tell if other folders should be included (and in which tier). 
>> >> Thanks, >> Marc > > Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision: > > speed up slowest test Marked as reviewed by chagedorn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24817#pullrequestreview-2813874515 From rvansa at openjdk.org Mon May 5 06:30:49 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 5 May 2025 06:30:49 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <-Xq9NUkXDnlFyKgh2juvL8zOFSxtc7AhEVBxfxo6w0Y=.2a611ac0-aca3-4ba6-beff-fb3daddba57a@github.com> On Wed, 30 Apr 2025 20:12:17 GMT, Frederic Parain wrote: >> Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: >> >> - Fix VerifyRawIndexesTest >> - Fix reordering in layout and annotations >> - Use qsort_r for different platforms > > src/hotspot/share/oops/fieldInfo.hpp line 223: > >> 221: }; >> 222: >> 223: #define JUMP_TABLE_STRIDE 16 > > How was the threshold of 16 determined? I haven't done any benchmarks looking for the optimal value; this should balance the extra memory footprint vs. improved performance. Also I was hoping to not affect the bulk of Java code; rather optimize "big" classes that show degraded performance due to O(N) lookup. How exactly could the optimization function look like if we're to weigh in both memory consumption and execution speed? > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Field.java line 115: > >> 113: int numFields = numJavaFields + numInjectedFields; >> 114: // JumpTable is generated only for classes with > 16 (non-injected) fields >> 115: if (numJavaFields > 16) { > > The test should use the `JUMP_TABLE_STRIDE` constant. Sure, I can isolate this into static final var, though since this is Java code I can't really take the value from the macro. 
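The PR's jump table idea (fields sorted by name, one table entry per `JUMP_TABLE_STRIDE` fields) can be sketched as below. This is a simplified model under stated assumptions: `FieldIndex`, `make_field_names` and `kStride` are hypothetical names, fields are plain strings in a vector, and the table stores positions rather than variable-length-encoded offsets into a field info stream as the real code does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

constexpr std::size_t kStride = 16;  // plays the role of JUMP_TABLE_STRIDE

struct FieldIndex {
  std::vector<std::string> sorted_names;
  std::vector<std::size_t> jump_table;  // positions kStride, 2 * kStride, ...

  explicit FieldIndex(std::vector<std::string> names)
      : sorted_names(std::move(names)) {
    std::sort(sorted_names.begin(), sorted_names.end());
    // Only classes with more than kStride fields get any table entries.
    for (std::size_t i = kStride; i < sorted_names.size(); i += kStride) {
      jump_table.push_back(i);
    }
  }

  // Returns the position of name in sorted order, or -1 if absent.
  long find(const std::string& name) const {
    // Walk the table to the last recorded field that is not after name.
    std::size_t start = 0;
    for (std::size_t pos : jump_table) {
      if (sorted_names[pos] <= name) {
        start = pos;
      } else {
        break;
      }
    }
    // Scan field by field from there, at most kStride entries.
    std::size_t end = std::min(start + kStride, sorted_names.size());
    for (std::size_t i = start; i < end; ++i) {
      if (sorted_names[i] == name) return static_cast<long>(i);
    }
    return -1;
  }
};

// Builds n names "f00", "f01", ... in reverse order so sorting is exercised.
std::vector<std::string> make_field_names(int n) {
  std::vector<std::string> names;
  for (int i = n - 1; i >= 0; --i) {
    names.push_back(std::string("f") + (i < 10 ? "0" : "") + std::to_string(i));
  }
  return names;
}
```

The stride trades footprint for lookup cost: a smaller stride means more table entries per class but a shorter final scan, which is the balance the reply above describes for the value 16.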
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2072899071 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2072899858 From rvansa at openjdk.org Mon May 5 06:38:47 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 5 May 2025 06:38:47 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Wed, 30 Apr 2025 20:19:50 GMT, Frederic Parain wrote: >> Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: >> >> - Fix VerifyRawIndexesTest >> - Fix reordering in layout and annotations >> - Use qsort_r for different platforms > > src/hotspot/share/oops/fieldInfo.cpp line 52: > >> 50: >> 51: int FieldInfoStream::compare_symbols(const Symbol *s1, const Symbol *s2) { >> 52: // not lexicographical sort, since we need only total ordering > > If only a total ordering is required, why defining a new method instead of reusing Symbol::fast_compare() ? The problem is CDS; I have really started with `fast_compare()`, but after dehydration the pointers changed and the comparison did not work anymore. This is also a reason why I could not use the hashcode for the ordering. If you'd prefer lexicographical sort (just a few extra lines) I could use that one... 
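A content-based comparison that is a total order without being lexicographic can be sketched as follows. This is an assumption about the general shape, not a copy of `FieldInfoStream::compare_symbols`: `compare_by_length_then_bytes` is a hypothetical helper, and `std::string` stands in for `Symbol`. Because it looks only at length and bytes, never at addresses, the result stays the same when the objects are relocated, which is the property the pointer-based comparison lacks.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Orders by length first, then by raw bytes. This is a valid total order
// for sorting and searching, but "b" sorts before "aa", so it is not
// lexicographic.
int compare_by_length_then_bytes(const std::string& a, const std::string& b) {
  if (a.size() != b.size()) {
    return a.size() < b.size() ? -1 : 1;
  }
  int c = std::memcmp(a.data(), b.data(), a.size());
  return (c > 0) - (c < 0);  // normalize to -1, 0 or 1
}
```

Checking the length first lets many comparisons finish without touching the bytes at all. A lexicographic variant would drop that shortcut and compare the common prefix before looking at the lengths.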
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2072906046 From rvansa at openjdk.org Mon May 5 06:41:45 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 5 May 2025 06:41:45 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Mon, 28 Apr 2025 07:44:04 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms …
53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: > > - Fix VerifyRawIndexesTest > - Fix reordering in layout and annotations > - Use qsort_r for different platforms Thanks for the test investigation! I noticed that some code could be dependent on the order of fields, though to the best of my knowledge the specification does not guarantee that. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2850046799 From mchevalier at openjdk.org Mon May 5 06:46:45 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Mon, 5 May 2025 06:46:45 GMT Subject: RFR: 8354284: Add more compiler test folders to tier1 runs [v2] In-Reply-To: References: Message-ID: On Fri, 2 May 2025 08:43:23 GMT, Marc Chevalier wrote: >> Some folders in jtreg/compiler have been reported not to be run in any tier, while tier1 was probably intended, but the tier definition was mistakenly not updated. I've checked which folders are not referenced into `TEST.groups`.
>> >> The unmentioned ones: >> - `ccp` >> - `ciReplay` >> - `ciTypeFlow` >> - `compilercontrol` >> - `debug` >> - `oracle` >> - `predicates` >> - `print` >> - `relocations` >> - `sharedstubs` >> - `splitif` >> - `tiered` >> - `whitebox` >> >> And those, that are not test folders: >> - `lib` >> - `patches` >> - `testlibraries` >> >> I'm adding `ccp`, `ciTypeFlow`, `predicates`, `sharedstubs` and `splitif` to tier1. >> >> The other folders seems to have been around for very long (since at least mid-2021). It's not clear how meaningful it'd be to add them/what the intent from them was. I've rather focused on the recently(-ish) added folders, that one forgot to put in a tier when adding it. >> >> Feel free to tell if other folders should be included (and in which tier). >> >> Thanks, >> Marc > > Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision: > > speed up slowest test Thanks @lmesnik @vnkozlov and @chhagedorn for comments and reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24817#issuecomment-2850055256 From duke at openjdk.org Mon May 5 06:46:45 2025 From: duke at openjdk.org (duke) Date: Mon, 5 May 2025 06:46:45 GMT Subject: RFR: 8354284: Add more compiler test folders to tier1 runs [v2] In-Reply-To: References: Message-ID: On Fri, 2 May 2025 08:43:23 GMT, Marc Chevalier wrote: >> Some folders in jtreg/compiler have been reported not to be run in any tier, while tier1 was probably intended, but the tier definition was mistakenly not updated. I've checked which folders are not referenced into `TEST.groups`. 
>> >> The unmentioned ones: >> - `ccp` >> - `ciReplay` >> - `ciTypeFlow` >> - `compilercontrol` >> - `debug` >> - `oracle` >> - `predicates` >> - `print` >> - `relocations` >> - `sharedstubs` >> - `splitif` >> - `tiered` >> - `whitebox` >> >> And those, that are not test folders: >> - `lib` >> - `patches` >> - `testlibraries` >> >> I'm adding `ccp`, `ciTypeFlow`, `predicates`, `sharedstubs` and `splitif` to tier1. >> >> The other folders seems to have been around for very long (since at least mid-2021). It's not clear how meaningful it'd be to add them/what the intent from them was. I've rather focused on the recently(-ish) added folders, that one forgot to put in a tier when adding it. >> >> Feel free to tell if other folders should be included (and in which tier). >> >> Thanks, >> Marc > > Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision: > > speed up slowest test @marc-chevalier Your change (at version 3232e5b8b2424ee75683fbf387fead6c016987d3) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24817#issuecomment-2850056181 From rvansa at openjdk.org Mon May 5 06:51:31 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 5 May 2025 06:51:31 GMT Subject: RFR: 8352075: Perf regression accessing fields [v4] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . 
>
> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields).
>
> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either.
>
> My measurements on the attached reproducer
>
> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC'
> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC
> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms]
> Range (min … max): 45.1 ms … 53.9 ms 100 runs
>
> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC'
> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC
> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms]
> Range (min … max): 73.8 ms … 79.7 ms 100 runs
>
> (the jdk25-master above already contains JDK-8353175)
>
> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC'
> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC
> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms]
> Range (min … max): 37.7 ms … 42.1 ms 100 runs
>
> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)!
This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Move constant to static final var ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/ef69ec06..7fb8d340 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=02-03 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From mchevalier at openjdk.org Mon May 5 06:59:51 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Mon, 5 May 2025 06:59:51 GMT Subject: Integrated: 8354284: Add more compiler test folders to tier1 runs In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 08:44:04 GMT, Marc Chevalier wrote: > Some folders in jtreg/compiler have been reported not to be run in any tier, while tier1 was probably intended, but the tier definition was mistakenly not updated. I've checked which folders are not referenced into `TEST.groups`. > > The unmentioned ones: > - `ccp` > - `ciReplay` > - `ciTypeFlow` > - `compilercontrol` > - `debug` > - `oracle` > - `predicates` > - `print` > - `relocations` > - `sharedstubs` > - `splitif` > - `tiered` > - `whitebox` > > And those, that are not test folders: > - `lib` > - `patches` > - `testlibraries` > > I'm adding `ccp`, `ciTypeFlow`, `predicates`, `sharedstubs` and `splitif` to tier1. > > The other folders seems to have been around for very long (since at least mid-2021). It's not clear how meaningful it'd be to add them/what the intent from them was. 
I've rather focused on the recently(-ish) added folders, that one forgot to put in a tier when adding it. > > Feel free to tell if other folders should be included (and in which tier). > > Thanks, > Marc This pull request has now been integrated. Changeset: 69d0f7a3 Author: Marc Chevalier Committer: Christian Hagedorn URL: https://git.openjdk.org/jdk/commit/69d0f7a3954048da358bd2ac5ab458fb37fa25a6 Stats: 9 lines in 2 files changed: 6 ins; 1 del; 2 mod 8354284: Add more compiler test folders to tier1 runs Reviewed-by: chagedorn, kvn ------------- PR: https://git.openjdk.org/jdk/pull/24817 From jsjolen at openjdk.org Mon May 5 07:00:59 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 5 May 2025 07:00:59 GMT Subject: RFR: 8354954: Typed static memory for late initialization of static class members in Hotspot [v11] In-Reply-To: References: Message-ID: On Tue, 29 Apr 2025 08:49:35 GMT, Johan Sj?len wrote: >> Hi, >> >> This PR introduces a `StableValue` which is sized and aligned identically to a `T`, with the difference that a `StableValue` needs to be explicitly instantiated. >> >> Dynamic static initalization in C++ leads to unpredictable bugs as there is no defined order in which objects will be initialized, to the degree that 'static initialization fiasco' is a term used. In the code I've worked on in Hotspot we resolve this by having an initialization function, and instead of having static members of `T` we have `T*` instead and use `malloc` in order to gain the memory for the objects. This is workable, but is unnecessary. >> >> That's why I'd like to have `StableValue`. It let's you avoid the whole `malloc` thing, and we overload `->` to make it behave as if it is actually a `T`. We add in a simple checker in debug mode that checks whether the memory has been initialized before using it. >> >> In the code I've switched two members to be of `StableValue` instead. 
One is the malloc case above, the second (MemBaseline) is one where I got a bug while developing. The bug occurred because I changed the initializer of `MemBaseline` without knowing that it was dynamic-static-allocated, and the exact change I made caused weird crashes (because of initialization order issues). >> >> This solution is quite practical to me, but I wanted to know what others think. > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Delete overzealous assert Thank you, all! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24689#issuecomment-2850080303 From jsjolen at openjdk.org Mon May 5 07:01:01 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 5 May 2025 07:01:01 GMT Subject: Integrated: 8354954: Typed static memory for late initialization of static class members in Hotspot In-Reply-To: References: Message-ID: On Wed, 16 Apr 2025 12:36:45 GMT, Johan Sj?len wrote: > Hi, > > This PR introduces a `StableValue` which is sized and aligned identically to a `T`, with the difference that a `StableValue` needs to be explicitly instantiated. > > Dynamic static initalization in C++ leads to unpredictable bugs as there is no defined order in which objects will be initialized, to the degree that 'static initialization fiasco' is a term used. In the code I've worked on in Hotspot we resolve this by having an initialization function, and instead of having static members of `T` we have `T*` instead and use `malloc` in order to gain the memory for the objects. This is workable, but is unnecessary. > > That's why I'd like to have `StableValue`. It let's you avoid the whole `malloc` thing, and we overload `->` to make it behave as if it is actually a `T`. We add in a simple checker in debug mode that checks whether the memory has been initialized before using it. > > In the code I've switched two members to be of `StableValue` instead. 
One is the malloc case above, the second (MemBaseline) is one where I got a bug while developing. The bug occurred because I changed the initializer of `MemBaseline` without knowing that it was dynamic-static-allocated, and the exact change I made caused weird crashes (because of initialization order issues).
>
> This solution is quite practical to me, but I wanted to know what others think.

This pull request has now been integrated.

Changeset: 604225fb
Author: Johan Sjölen
URL: https://git.openjdk.org/jdk/commit/604225fb0c5f6bf2128a305d09649d76c43dedc9
Stats: 91 lines in 5 files changed: 81 ins; 3 del; 7 mod

8354954: Typed static memory for late initialization of static class members in Hotspot

Reviewed-by: qamai, kbarrett, jvernee

-------------

PR: https://git.openjdk.org/jdk/pull/24689

From jsjolen at openjdk.org Mon May 5 07:18:52 2025
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Mon, 5 May 2025 07:18:52 GMT
Subject: RFR: 8355490: Make VM_RedefineClasses::merge_constant_pools only take reference arguments [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 25 Apr 2025 09:39:49 GMT, Johan Sjölen wrote:

>> Hi,
>>
>> I've changed two pointer arguments to be references instead, as we bail if they are null and `assert(false)` on top of that. There are no other calls to this function than the one I fixed.
>>
>> Testing: GHA Tier1
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
> Copyright

Cheers!
------------- PR Comment: https://git.openjdk.org/jdk/pull/24848#issuecomment-2850110102 From jsjolen at openjdk.org Mon May 5 07:18:53 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 5 May 2025 07:18:53 GMT Subject: Integrated: 8355490: Make VM_RedefineClasses::merge_constant_pools only take reference arguments In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 12:09:36 GMT, Johan Sj?len wrote: > Hi, > > I've changed two pointer arguments to be references instead, as we bail if they are null and `assert(false)` on top of that. There are no other calls to this function than the one I fixed. > > Testing: GHA Tier1 This pull request has now been integrated. Changeset: 8511220f Author: Johan Sj?len URL: https://git.openjdk.org/jdk/commit/8511220f9dd1428f9793ead43c20ed197881ab36 Stats: 40 lines in 2 files changed: 0 ins; 8 del; 32 mod 8355490: Make VM_RedefineClasses::merge_constant_pools only take reference arguments Reviewed-by: amenkov, sspitsyn, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/24848 From jbechberger at openjdk.org Mon May 5 08:11:52 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 5 May 2025 08:11:52 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v15] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: On Tue, 29 Apr 2025 16:47:17 GMT, Markus Gr?nlund wrote: >> Greetings, >> >> This is the implementation of JEP [JDK-8350338 Cooperative JFR Sampling](https://bugs.openjdk.org/browse/JDK-8350338). >> >> Implementations in this change set are provided and have been tested on the following platforms: >> >> - windows-x64 >> - windows-x64-debug >> - linux-x64 >> - linux-x64-debug >> - macosx-x64 >> - macosx-x64-debug >> - linux-aarch64 >> - linux-aarch64-debug >> - macosx-aarch64 >> - macosx-aarch64-debug >> >> Testing: tier1-6, jdk_jfr, stress testing. 
>> >> Platform porters note: >> Some platform-specific code needs to be provided, mainly in the interpreter. Take a look at the following files for changes: >> >> - src/hotspot/cpu/x86/frame_x86.cpp >> - src/hotspot/cpu/x86/interp_masm_x86.cpp >> - src/hotspot/cpu/x86/interp_masm_x86.hpp >> - src/hotspot/cpu/x86/javaFrameAnchor_x86.hpp >> - src/hotspot/cpu/x86/macroAssembler_x86.cpp >> - src/hotspot/cpu/x86/macroAssembler_x86.hpp >> - src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp >> - src/hotspot/cpu/x86/templateTable_x86.cpp >> - src/hotspot/os_cpu/linux_x86/javaThread_linux_x86.hpp >> >> Thanks >> Markus > > Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: > > Configuration and test for jdk.SafepointLatency event Is there the possibility of adding a `bias` flag to the ExecutionSample events to record when an event has a clear safepoint bias? This could mark all samples where the sampler falls back on the safepoint frame. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2850225694 From jbechberger at openjdk.org Mon May 5 08:15:58 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 5 May 2025 08:15:58 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> Message-ID: On Fri, 2 May 2025 15:47:04 GMT, Zhengyu Gu wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove assertions > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 66: > >> 64: } >> 65: } while (Atomic::cmpxchg(&_head, elementIndex, elementIndex + 1) != elementIndex); >> 66: _data[elementIndex] = element; > > I question the correctness of the implementation. Consider following scenario: > T1: equeue(): after complete CAS successfully, e.g. 
`_head = 3, elementIndex = 2`
> T2: dequeue(): after completing the CAS successfully, `_head = 2, elementIndex = 3`
> T2: read `_data[--elementIndex]` // _data[2] has not yet been set
> T1: write `_data[elementIndex] = element` // set _data[2] value

You're right. But this seems to be an inherent problem of stacks. I'm going to use the previous lockless queue implementation for the fresh frames queue. The problem should not occur with the thread-local queues though?

> src/hotspot/share/runtime/handshake.hpp line 133:
>
>> 131:
>> 132: bool can_run();
>> 133: bool can_run(bool allow_suspend, bool check_async_exception);
>
> You may want to inline these three methods; it looks like they can be on hot paths.

Would that make a difference? They only defer to other methods and haven't been inlined before.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073020459
PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073022843

From mdoerr at openjdk.org Mon May 5 08:34:52 2025
From: mdoerr at openjdk.org (Martin Doerr)
Date: Mon, 5 May 2025 08:34:52 GMT
Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2]
In-Reply-To: <8wZcqTFOczjkZaHuWC4HLMb1vzTcUsiflcbmejfWnWA=.f7889541-ca80-4ee4-acf0-90aa39bb82f4@github.com>
References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> <8wZcqTFOczjkZaHuWC4HLMb1vzTcUsiflcbmejfWnWA=.f7889541-ca80-4ee4-acf0-90aa39bb82f4@github.com>
Message-ID:

On Sat, 3 May 2025 01:57:53 GMT, Zhengyu Gu wrote:

>> Yes we only used ThreadCritical to add to the list after the VM becomes multithreaded. The list traversals (reads) are for hs_err_pid file printing, which presumably is single threaded at that point and presumably nothing is adding to the list. I added Atomic::load() but I think it's not necessary to be a load_acquire.
>> There are only about 10 items max on this list so far, so I think performance isn't a concern either way.
>
> There is also a jcmd to print out events. In theory, that could be a problem if CAS is unordered.

Then, memory barriers on the reader's side should also be checked.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2073046694

From mgronlun at openjdk.org Mon May 5 08:50:35 2025
From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=)
Date: Mon, 5 May 2025 08:50:35 GMT
Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v16]
In-Reply-To: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com>
References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com>
Message-ID:

> Greetings,
>
> This is the implementation of JEP [JDK-8350338 Cooperative JFR Sampling](https://bugs.openjdk.org/browse/JDK-8350338).
>
> Implementations in this change set are provided and have been tested on the following platforms:
>
> - windows-x64
> - windows-x64-debug
> - linux-x64
> - linux-x64-debug
> - macosx-x64
> - macosx-x64-debug
> - linux-aarch64
> - linux-aarch64-debug
> - macosx-aarch64
> - macosx-aarch64-debug
>
> Testing: tier1-6, jdk_jfr, stress testing.
>
> Platform porters note:
> Some platform-specific code needs to be provided, mainly in the interpreter. Take a look at the following files for changes:
>
> - src/hotspot/cpu/x86/frame_x86.cpp
> - src/hotspot/cpu/x86/interp_masm_x86.cpp
> - src/hotspot/cpu/x86/interp_masm_x86.hpp
> - src/hotspot/cpu/x86/javaFrameAnchor_x86.hpp
> - src/hotspot/cpu/x86/macroAssembler_x86.cpp
> - src/hotspot/cpu/x86/macroAssembler_x86.hpp
> - src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp
> - src/hotspot/cpu/x86/templateTable_x86.cpp
> - src/hotspot/os_cpu/linux_x86/javaThread_linux_x86.hpp
>
> Thanks
> Markus

Markus Grönlund has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 19 commits: - Merge branch 'master' into 8352251 - Configuration and test for jdk.SafepointLatency event - include guards - push back pd constants into pd code - Attempt to build Windows-AARCH64 - No invariants for sender_for_interpreter_frame - zero - Merge branch 'master' into 8352251 - Refine SamplingLatency event description - Update default.jfc - ... and 9 more: https://git.openjdk.org/jdk/compare/8511220f...e448090e ------------- Changes: https://git.openjdk.org/jdk/pull/24296/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24296&range=15 Stats: 3381 lines in 82 files changed: 2071 ins; 960 del; 350 mod Patch: https://git.openjdk.org/jdk/pull/24296.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24296/head:pull/24296 PR: https://git.openjdk.org/jdk/pull/24296 From mgronlun at openjdk.org Mon May 5 08:50:35 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 5 May 2025 08:50:35 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v15] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: On Wed, 30 Apr 2025 17:02:12 GMT, Markus Gr?nlund wrote: >> Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Configuration and test for jdk.SafepointLatency event > > The issue is that the CPU context can be retrieved here after the safepoint poll has been tested. That is causing a race, because a sample would be taken for an fp that is about to pop, breaking the invariant of the sampling mechanism. > > It is only for some sensitive interpreter positions that we need to inspect the correct fp (the sender's fp), to avoid this race. > > On x64, we signal that by preemptively moving rbp, first to update the CPU context and then by explicitly setting the sender_java_fp field in the LJF. 
> > With your suggestion, we would always prioritize the sender fp (because it is always available), which is unnecessary and incorrect (biased), except for where we are about to pop an interpreter frame (but we can't decide when that is the case). > > For testing, you will need to run some longer stress tests to see the effect of a racy sampling attempt. > > To provoke taking more samples, you can decrease the sampling interval of JFR by setting the following in default.jfc and / or profile.jfc: > > `diff --git a/src/jdk.jfr/share/conf/jfr/profile.jfc b/src/jdk.jfr/share/conf/jfr/profile.jfc > index 4c9f4b4f8ec..75f8d75c580 100644 > --- a/src/jdk.jfr/share/conf/jfr/profile.jfc > +++ b/src/jdk.jfr/share/conf/jfr/profile.jfc > @@ -198,12 +198,12 @@ > > > true > - 10 ms > + 1 ms > > > > true > - 20 ms > + 1 ms > > > ` > > Try running some longer stress test or benchmark, passing: > > `-XX:StartFlightRecording:settings=profile.jfc` > Hi @mgronlun , Is it possible to get head stream changes in this PR, if there is no objection from other architecture? It would be good to have changes from #23660 to implement the build-fix for s390x. Thanks HI Amit, now the PR has been merged with master and should contain your changes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2850319187 From sspitsyn at openjdk.org Mon May 5 10:29:47 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 5 May 2025 10:29:47 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v6] In-Reply-To: References: Message-ID: <5lfUq5HfAC5gCZClRSo-diFj7LOjn6vm-PZbmC-Y2KY=.e0ab99e8-4fe0-4b6c-8f8a-8cf1fe69500e@github.com> On Fri, 2 May 2025 20:53:30 GMT, Serguei Spitsyn wrote: >> This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. >> >> Testing: Ran mach5 tiers 1-6. 
> > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: patch from Patricio with alternate approach I've pushed update with Patricio's suggestion. Mach5 testing is green. Also, I've decided to keep my fixes in the `jvmtiThreadState.?pp`. It feels it does not worth to separate those. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24269#issuecomment-2850564367 From jbechberger at openjdk.org Mon May 5 11:37:49 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 5 May 2025 11:37:49 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v46] In-Reply-To: References: Message-ID: > This is the code for the [JEP draft: CPU Time based profiling for JFR]. > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). > > A version based on the cooperative sampling JEP can be found [here](https://github.com/parttimenerd/jdk/tree/parttimenerd_cooperative_cpu_time_sampler). Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Fix missing threads and other things ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20752/files - new: https://git.openjdk.org/jdk/pull/20752/files/eb3ab54e..ed67e11a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20752&range=45 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20752&range=44-45 Stats: 166 lines in 4 files changed: 146 ins; 3 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/20752.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20752/head:pull/20752 PR: https://git.openjdk.org/jdk/pull/20752 From rehn at openjdk.org Mon May 5 11:46:55 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 5 May 2025 11:46:55 GMT Subject: Integrated: 8352730: RISC-V: Disable tests in qemu-user In-Reply-To: References: Message-ID: On Tue, 25 Mar 2025 14:19:55 GMT, Robbin Ehn wrote: > Hi, for you to consider. 
> > These tests constantly fails in qemu-user. > Either the require host to be same arch explicit or implicit (sysroot). > E.g. "ptrace(PTRACE_ATTACH, ..) failed for 405157: Function not implemented'" for SA tests. > > From bug: >> qemu-user/rv64 sets uarch to "qemu" in /proc/cpuinfo (qemu-system do not do that). >> We add this uarch to CPU feature string. >> This means we can use jtreg 'require' with cpu string to filter out tests in qemu-user. > > Relevant qemu code: > https://github.com/qemu/qemu/blob/170825d14d88a1ce7fae98d5a928480f2f329b22/linux-user/riscv/target_proc.h#L29 > > Relevant hotspot code: > https://github.com/openjdk/jdk/blob/fa0b18bfde38ee2ffbab33a9eaac547fe8aa3c7c/src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp#L250 > > Tested that the require only filters out tests in qemu+riscv64. > > Thanks! > > /Robbin This pull request has now been integrated. Changeset: 02647976 Author: Robbin Ehn URL: https://git.openjdk.org/jdk/commit/026479767c011227b63e7fdb8a38f61977782249 Stats: 71 lines in 64 files changed: 71 ins; 0 del; 0 mod 8352730: RISC-V: Disable tests in qemu-user Reviewed-by: fyang, mli ------------- PR: https://git.openjdk.org/jdk/pull/24229 From coleenp at openjdk.org Mon May 5 11:48:53 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 5 May 2025 11:48:53 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2] In-Reply-To: References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> <8wZcqTFOczjkZaHuWC4HLMb1vzTcUsiflcbmejfWnWA=.f7889541-ca80-4ee4-acf0-90aa39bb82f4@github.com> Message-ID: On Mon, 5 May 2025 08:31:55 GMT, Martin Doerr wrote: >> There is also a jcmd to print out events. In theory, that could be a problem if CAS is unordered. > > Then, memory barriers on the reader's side should also be checked. hm then should the readers be Atomic::load_acquire() ? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2073293346 From aboldtch at openjdk.org Mon May 5 12:18:45 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 5 May 2025 12:18:45 GMT Subject: RFR: 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding In-Reply-To: References: Message-ID: On Wed, 30 Apr 2025 02:29:34 GMT, Quan Anh Mai wrote: >> This is a follow-up PR that fixes the crashes seen after the integration of PR #24664 >> >> ZGC bookkeeps multiple place holders in barrier code snippets through relocations, these are later used to patch appropriate contents (mostly immediate values) in instruction encoding to save costly comparisons against global state [1]. While most of the relocation records the patching offsets from the end of the instruction, SHL/R instructions used for pointer coloring/uncoloring, compute the patching offset from the starting address of the instruction. This was done to prevent accidental sharing of relocation information with subsequent relocatable instructions, e.g., static call. [2] >> >> In case the destination register operand of SHL/R instruction is an extended GPR register, we miss accounting additional REX2 prefix byte in the patch offset, thereby corrupting the encoding since runtime patches the primary opcode byte, resulting in an ILLEGAL instruction exception. >> >> This patch fixes reported failures by computing the relocation offset of the SHL/R instruction from the end of the instruction, thereby making the patch offset agnostic to the REX/REX2 prefix. To be safe, we emit a NOP instruction between the SHL/R and the subsequent relocatable instruction. >> >> Please review and share your feedback. 
>>
>> Best Regards,
>> Jatin
>>
>> [1] https://openjdk.org/jeps/439#:~:text=we%20reduce%20this,changes%20phase%3B
>> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1873
>>
>>
>> PS: Validations were performed using the latest Intel Software Development Emulator after modifying the static register allocation order in x86_64.ad file giving preference to EGPRs.
>
> What I meant is that we should map a relocation to BOTH the instruction start and the patch site. APX has not even been released yet so I think it is more efficient to make a better fix than to make a quicker one.

I think @merykitty's solution with two different relocations based on whether we support APX or not. And only emit the after and nop when `VM_Version::supports_apx_f()` is true.

On the other hand maybe we can solve this with a minimal change by simply looking for the REX2 prefix when we patch the code. Something along the lines of:

diff --git a/src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp b/src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp
index 9cdf0b229c0..4a956b450bd 100644
--- a/src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp
@@ -1328,7 +1328,13 @@ void ZBarrierSetAssembler::patch_barrier_relocation(address addr, int format) {
   const uint16_t value = patch_barrier_relocation_value(format);
   uint8_t* const patch_addr = (uint8_t*)addr + offset;
   if (format == ZBarrierRelocationFormatLoadGoodBeforeShl) {
-    *patch_addr = (uint8_t)value;
+    if (VM_Version::supports_apx_f()) {
+      NativeInstruction* instruction = nativeInstruction_at(addr);
+      uint8_t* const rex2_patch_addr = patch_addr + (instruction->has_rex2_prefix() ? 1 : 0);
+      *rex2_patch_addr = (uint8_t)value;
+    } else {
+      *patch_addr = (uint8_t)value;
+    }
   } else {
     *(uint16_t*)patch_addr = value;
   }

As for the solution to have the relocation point at the entry: while relocations were not designed to be used this way, it looks like it works.
(At least from a barrier patching point of view, as we only want to iterate over all relocations, never map a PC to a relocation). But changing invariants is scary. It is probably better to evaluate that as a part of the [JDK-8355341](https://bugs.openjdk.org/browse/JDK-8355341) RFE.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24919#issuecomment-2850807205

From zgu at openjdk.org Mon May 5 12:30:52 2025
From: zgu at openjdk.org (Zhengyu Gu)
Date: Mon, 5 May 2025 12:30:52 GMT
Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2]
In-Reply-To:
References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> <8wZcqTFOczjkZaHuWC4HLMb1vzTcUsiflcbmejfWnWA=.f7889541-ca80-4ee4-acf0-90aa39bb82f4@github.com>
Message-ID:

On Mon, 5 May 2025 11:46:12 GMT, Coleen Phillimore wrote:

>> Then, memory barriers on the reader's side should also be checked.
>
> hm then should the readers be Atomic::load_acquire() ?

The version you merged has `memory_order_conservative` order on CAS, so you don't need a reader-side barrier.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2073348568

From zgu at openjdk.org Mon May 5 12:51:59 2025
From: zgu at openjdk.org (Zhengyu Gu)
Date: Mon, 5 May 2025 12:51:59 GMT
Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v46]
In-Reply-To:
References:
Message-ID:

On Mon, 5 May 2025 11:37:49 GMT, Johannes Bechberger wrote:

>> This is the code for the [JEP draft: CPU Time based profiling for JFR].
>>
>> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests).
>>
>> A version based on the cooperative sampling JEP can be found [here](https://github.com/parttimenerd/jdk/tree/parttimenerd_cooperative_cpu_time_sampler).
> > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix missing threads and other things src/hotspot/share/runtime/handshake.hpp line 135: > 133: bool can_run(bool allow_suspend, bool check_async_exception); > 134: > 135: bool has_operation(); `has_operation()` was an inline method, it was called from `SafepointMechanism::should_process()` (now, it calls `can_run`). I believe `SafepointMechanism::should_process()` is a latency sensitive and inlined method. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073377951 From rvansa at openjdk.org Mon May 5 12:54:52 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 5 May 2025 12:54:52 GMT Subject: RFR: 8352075: Perf regression accessing fields [v4] In-Reply-To: <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> Message-ID: On Mon, 5 May 2025 06:51:31 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. 
On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Move constant to static final var For the record, in the ideal case I would like to backport this into JDK 21 as well.
Do you think that the change in iteration order would be problematic for that? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2850897668 From zgu at openjdk.org Mon May 5 12:54:58 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 5 May 2025 12:54:58 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> Message-ID: On Mon, 5 May 2025 08:11:35 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 66: >> >>> 64: } >>> 65: } while (Atomic::cmpxchg(&_head, elementIndex, elementIndex + 1) != elementIndex); >>> 66: _data[elementIndex] = element; >> >> I question the correctness of the implementation. Consider the following scenario: >> T1: enqueue(): after completing the CAS successfully, e.g. `_head = 3, elementIndex = 2` >> T2: dequeue(): after completing the CAS successfully, `_head = 2, elementIndex = 3` >> T2: read `_data[--elementIndex]` // _data[2] has not been set yet >> T1: write `_data[elementIndex] = element` // set _data[2] value > > You're right. But this seems to be an inherent problem of stacks. I'm going to use the previous lockless queue implementation for the fresh frames queue. The problem should not occur with the thread-local queues though? Hotspot does have an implementation of a [lock-free stack](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/utilities/lockFreeStack.hpp), but you have to deal with the ABA problem.
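
The interleaving quoted above can be replayed deterministically. The following is only a sketch under stated assumptions, not the code from the PR: `RacyArrayStack`, `reserve_slot`, `fill_slot` and `replay_interleaving` are hypothetical names, and `std::atomic` stands in for HotSpot's `Atomic` class. The push is split into its two steps (CAS on the head, then the slot write) so that a pop can be run in between on a single thread.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical array-backed stack; illustrative names, not the PR's code.
struct RacyArrayStack {
  static constexpr int kEmptySlot = -1;  // marker for "never written"
  int data[8];
  std::atomic<int> head{0};

  RacyArrayStack() { for (int& slot : data) slot = kEmptySlot; }

  // Step 1 of push: claim a slot by advancing head with a CAS.
  int reserve_slot() {
    int index = head.load();
    while (!head.compare_exchange_weak(index, index + 1)) {
      // a failed CAS reloads index; just retry
    }
    return index;
  }

  // Step 2 of push: fill the claimed slot. Other threads may run in between.
  void fill_slot(int index, int element) { data[index] = element; }

  // Pop: lower head with a CAS, then read the slot that was on top.
  int pop() {
    int index = head.load();
    do {
      if (index == 0) return kEmptySlot;
    } while (!head.compare_exchange_weak(index, index - 1));
    return data[index - 1];
  }
};

// Replays T1/T2 from the scenario on one thread, so the result is deterministic.
int replay_interleaving() {
  RacyArrayStack stack;
  int claimed = stack.reserve_slot();  // T1: CAS done, slot not yet written
  int seen = stack.pop();              // T2: pops the still unwritten slot
  stack.fill_slot(claimed, 42);        // T1: the write arrives too late
  return seen;
}
```

In this replay the pop returns the `kEmptySlot` marker instead of 42, which is the window the review comment describes: the head update and the slot write are two separate steps, so a reader can observe the first without the second.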
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073382691 From mdoerr at openjdk.org Mon May 5 12:55:52 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 5 May 2025 12:55:52 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2] In-Reply-To: References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> <8wZcqTFOczjkZaHuWC4HLMb1vzTcUsiflcbmejfWnWA=.f7889541-ca80-4ee4-acf0-90aa39bb82f4@github.com> Message-ID: On Mon, 5 May 2025 12:28:17 GMT, Zhengyu Gu wrote: >> hm then should the readers be Atomic::load_acquire() ? > > The version you merged has `memory_order_conservative` order on CAS, so you don't need reader side barrier. Looks like a Release-Consume pattern: https://en.cppreference.com/w/cpp/atomic/memory_order#Release-Consume_ordering I think it is very likely that it works as it is. Not sure if Atomic::load is 100% safe for this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2073383889 From jbechberger at openjdk.org Mon May 5 13:13:14 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 5 May 2025 13:13:14 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> Message-ID: On Mon, 5 May 2025 12:52:09 GMT, Zhengyu Gu wrote: >> You're right. But this seems to be an inherent problem of stacks. I'm going to use the previous lockless queue implementation for the fresh frames queue. The problem should not occur with the thread-local queues though? > > Hotspot does have an implementation of [lock free stack]( https://github.com/openjdk/jdk/blob/master/src/hotspot/share/utilities/lockFreeStack.hpp), but you have to deal with ABA problem. Which is not array based and therefore not usable in signal handlers. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073420361 From zgu at openjdk.org Mon May 5 13:14:56 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 5 May 2025 13:14:56 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2] In-Reply-To: References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> <8wZcqTFOczjkZaHuWC4HLMb1vzTcUsiflcbmejfWnWA=.f7889541-ca80-4ee4-acf0-90aa39bb82f4@github.com> Message-ID: On Mon, 5 May 2025 12:52:48 GMT, Martin Doerr wrote: >> The version you merged has `memory_order_conservative` order on CAS, so you don't need reader side barrier. > > Looks like a Release-Consume pattern: https://en.cppreference.com/w/cpp/atomic/memory_order#Release-Consume_ordering > I think it is very likely that it works as it is. Not sure if Atomic::load is 100% safe for this. `memory_order_conservative` is [Strong two-way memory barrier](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/atomic.hpp#L47), that means all reads and writes *cannot* float pass the barrier. If `Atomic::load()` observes a store by conservative order, then precedent reads and writes of the store must have completed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2073424418 From jsjolen at openjdk.org Mon May 5 13:17:09 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 5 May 2025 13:17:09 GMT Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v10] In-Reply-To: References: Message-ID: On Thu, 1 May 2025 14:12:30 GMT, Gerard Ziemski wrote: >> Please review this addition of an internal benchmark, mostly of interest to those working with NMT. >> >> This benchmark allows us to record a pattern of memory allocation operations (i.e. `malloc`, `realloc` and `free`) as well as the virtual memory allocations (i.e. `VirtualMemoryTracker::add_reserved_region`, etc.) 
and record those into files. >> >> Later we can use that recording to _play back_ the pattern with different code or settings to compare the performance (i.e. memory usage as well as time). >> >> The goal of this benchmark is for anyone working on NMT to be able to measure and prove whether their improvement helps or regresses the performance. >> >> ### To use it: >> >> To record pattern of allocations of memory calls: >> >> `NMTRecordMemoryAllocations=0x7FFFFFFF ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -jar build/macosx-aarch64-server-release/images/jdk/demo/jfc/J2Ddemo/J2Ddemo.jar` >> >> OR to record pattern of allocations of virtual memory calls: >> >> `NMTRecordVirtualMemoryAllocations=0x7FFFFFFF ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -jar build/macosx-aarch64-server-release/images/jdk/demo/jfc/J2Ddemo/J2Ddemo.jar` >> >> This will result in the file: >> - hs_nmt_pid22770_allocs_record.log (is the chronological record of the the desired operations) >> OR >> - hs_nmt_pid22770_virtual_allocs_record.log (is the chronological record of the desired operations) >> >> And 2 additional files: >> - hs_nmt_pid22770_info_record.log (is the record of default NMT memory overhead and the NMT state) >> - hs_nmt_pid22770_threads_record.log (is the record of thread names that can be retrieved later when processing) >> >> >> then to actually run the benchmark: >> >> NMTBenchmarkRecordedPID=22770 ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary >> >> ### Usage: >> >> See the issue for more details and the design document. 
> > Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: > > use permit_forbidden_function for realloc There's commented out code, fix that by getting rid of it or converting logs to UL. Sometimes you use `fprintf(stderr`, sometimes you use `tty->print`, what's the difference and why not use UL? src/hotspot/share/nmt/mallocTracker.cpp line 170: > 168: // Record a malloc memory allocation > 169: void* MallocTracker::record_malloc(void* malloc_base, size_t size, MemTag mem_tag, > 170: const NativeCallStack& stack, void* old_base) Unused src/hotspot/share/nmt/mallocTracker.hpp line 284: > 282: // Record malloc on specified memory block > 283: static void* record_malloc(void* malloc_base, size_t size, MemTag mem_tag, > 284: const NativeCallStack& stack, void* old_base = nullptr); Unused src/hotspot/share/nmt/memLogRecorder.cpp line 155: > 153: // TODO: NMT_LogRecorder::thread_name > 154: #endif > 155: } `Thread::current()->name()` src/hotspot/share/nmt/memLogRecorder.hpp line 136: > 134: address stack[NMT_TrackingStackDepth]; > 135: long int mem_tag; > 136: long int mem_tag_split; Use MemTag? Why `long int`? src/hotspot/share/nmt/memLogRecorder.hpp line 139: > 137: size_t size; > 138: size_t size_split; > 139: int type; Why isn't this a `Type`? src/hotspot/share/nmt/memLogRecorder.hpp line 151: > 149: SPLIT_RESERVED, > 150: TAG > 151: }; Better name than `Type`, like `MemoryOperation`? No need for `ALL_CAPS` names if you don't want to, you can use `ThisTypeOfName` instead. That's a style choice you get to make, though. src/hotspot/share/nmt/memLogRecorder.hpp line 174: > 172: #else // defined(LINUX) || defined(__APPLE__) > 173: > 174: class NMT_LogRecorder : public StackObj { What's the idea behind having two different subclasses for the log recorder? Like, why is it important that two different objects record the two sequences of events? 
src/hotspot/share/nmt/memLogRecorder.hpp line 185: > 183: > 184: class NMT_MemoryLogRecorder : public NMT_LogRecorder { > 185: public: TODOs? src/hotspot/share/runtime/os.cpp line 739: > 737: // After a successful realloc(3), we account the resized block with its new size > 738: // to NMT. > 739: void* const new_inner_ptr = MemTracker::record_malloc(new_outer_ptr, size, mem_tag, stack, memblock); Unused extra argument ------------- PR Review: https://git.openjdk.org/jdk/pull/23786#pullrequestreview-2814746682 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2073411322 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2073411520 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2073423794 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2073415653 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2073416162 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2073417724 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2073414948 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2073412512 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2073418728 From mdoerr at openjdk.org Mon May 5 13:26:52 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 5 May 2025 13:26:52 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2] In-Reply-To: References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> <8wZcqTFOczjkZaHuWC4HLMb1vzTcUsiflcbmejfWnWA=.f7889541-ca80-4ee4-acf0-90aa39bb82f4@github.com> Message-ID: On Mon, 5 May 2025 13:12:18 GMT, Zhengyu Gu wrote: >> Looks like a Release-Consume pattern: https://en.cppreference.com/w/cpp/atomic/memory_order#Release-Consume_ordering >> I think it is very likely that it works as it is. Not sure if Atomic::load is 100% safe for this. 
> > `memory_order_conservative` is [Strong two-way memory barrier](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/atomic.hpp#L47), that means all reads and writes *cannot* float pass the barrier. > > If `Atomic::load()` observes a store by conservative order, then precedent reads and writes of the store must have completed. First of all, nothing on the writer's side can fix the reader's side. We're basically relying on volatile load + dependency chain on the reader's side. This is very likely to work, but C++ memory model specialists may find this questionable. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2073442756 From zgu at openjdk.org Mon May 5 13:29:56 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 5 May 2025 13:29:56 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> Message-ID: On Mon, 5 May 2025 13:09:43 GMT, Johannes Bechberger wrote: >> Hotspot does have an implementation of [lock free stack]( https://github.com/openjdk/jdk/blob/master/src/hotspot/share/utilities/lockFreeStack.hpp), but you have to deal with ABA problem. > > Which is not array based and therefore not usable in signal handlers. I see. If it is only used in signal handlers, does it really need to be multi-threaded? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073432450 From jsjolen at openjdk.org Mon May 5 13:29:56 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 5 May 2025 13:29:56 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> Message-ID: On Mon, 5 May 2025 13:17:28 GMT, Zhengyu Gu wrote: >> Which is not array based and therefore not usable in signal handlers. > > I see. 
If it is only used in signal handlers, does it really need to be multi-threaded? The `LockFreeStack` doesn't say how its elements are allocated, so you can allocate your elements into an array. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073432511 From jsjolen at openjdk.org Mon May 5 13:29:56 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 5 May 2025 13:29:56 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> Message-ID: On Mon, 5 May 2025 13:17:31 GMT, Johan Sjölen wrote: >> I see. If it is only used in signal handlers, does it really need to be multi-threaded? > > The `LockFreeStack` doesn't say how its elements are allocated, so you can allocate your elements into an array.

```c++
struct Node {
  Node* volatile next;
  int x;
  // Returns the address of the link field, as LockFreeStack expects.
  static Node* volatile* get_next(Node& n) { return &n.next; }
};
static Node nodes[16];
static int i = 0;
Node* alloc() {
  return &nodes[i++];  // no bounds check in this sketch
}

void foo() {
  LockFreeStack<Node, &Node::get_next> lfs;
  Node* n = alloc();
  n->x = 0;
  lfs.push(*n);
  Node* n2 = lfs.top();
  tty->print_cr("%d", n2->x);
  lfs.pop();
}
```

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073436807 From jbechberger at openjdk.org Mon May 5 13:29:57 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 5 May 2025 13:29:57 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> Message-ID: On Mon, 5 May 2025 13:20:17 GMT, Johan Sjölen wrote: >> The `LockFreeStack` doesn't say how its elements are allocated, so you can allocate your elements into an array.
>
> ```c++
> struct Node {
>   Node* volatile next;
>   int x;
>   // Returns the address of the link field, as LockFreeStack expects.
>   static Node* volatile* get_next(Node& n) { return &n.next; }
> };
> static Node nodes[16];
> static int i = 0;
> Node* alloc() {
>   return &nodes[i++];  // no bounds check in this sketch
> }
>
> void foo() {
>   LockFreeStack<Node, &Node::get_next> lfs;
>   Node* n = alloc();
>   n->x = 0;
>   lfs.push(*n);
>   Node* n2 = lfs.top();
>   tty->print_cr("%d", n2->x);
>   lfs.pop();
> }
> ```

Wouldn't this defeat its purpose? I would need to get the next free element from an array whenever I want to enqueue. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073437538 From jsjolen at openjdk.org Mon May 5 13:29:57 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 5 May 2025 13:29:57 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> Message-ID: <-BaADrj9uoAliNH_hTcKmNIb_akhLY_UGsKtlRfC5wY=.5d726a8f-c253-4d3b-bf66-778e7bbc14ae@github.com> On Mon, 5 May 2025 13:20:44 GMT, Johannes Bechberger wrote:

>> ```c++
>> struct Node {
>>   Node* volatile next;
>>   int x;
>>   // Returns the address of the link field, as LockFreeStack expects.
>>   static Node* volatile* get_next(Node& n) { return &n.next; }
>> };
>> static Node nodes[16];
>> static int i = 0;
>> Node* alloc() {
>>   return &nodes[i++];  // no bounds check in this sketch
>> }
>>
>> void foo() {
>>   LockFreeStack<Node, &Node::get_next> lfs;
>>   Node* n = alloc();
>>   n->x = 0;
>>   lfs.push(*n);
>>   Node* n2 = lfs.top();
>>   tty->print_cr("%d", n2->x);
>>   lfs.pop();
>> }
>> ```
>
> Wouldn't this defeat its purpose? I would need to get the next free element from an array whenever I want to enqueue.

But eh, if I'm reading Zhengyu's comments correctly then what I'm saying doesn't matter???
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073440671 From jbechberger at openjdk.org Mon May 5 13:29:58 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 5 May 2025 13:29:58 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: <-BaADrj9uoAliNH_hTcKmNIb_akhLY_UGsKtlRfC5wY=.5d726a8f-c253-4d3b-bf66-778e7bbc14ae@github.com> References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> <-BaADrj9uoAliNH_hTcKmNIb_akhLY_UGsK tlRfC5wY=.5d726a8f-c253-4d3b-bf66-778e7bbc14ae@github.com> Message-ID: On Mon, 5 May 2025 13:22:42 GMT, Johan Sj?len wrote: >> Wouldn't this defeat its purpose? I would need to get the next free element from an array whenever I want to enqueue. > > But eh, if I'm reading Zenghyu's comments correctly then what I'm saying doesn't matter??? > I see. If it is only used in signal handlers, does it really need to be multi-threaded? I might be able to simplify the thread-local stacks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073441661 From jbechberger at openjdk.org Mon May 5 13:29:58 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 5 May 2025 13:29:58 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> <-BaADrj9uoAliNH_hTcKmNIb_akhLY_UGsK tlRfC5wY=.5d726a8f-c253-4d3b-bf66-778e7bbc14ae@github.com> Message-ID: On Mon, 5 May 2025 13:23:11 GMT, Johannes Bechberger wrote: >> But eh, if I'm reading Zenghyu's comments correctly then what I'm saying doesn't matter??? > >> I see. If it is only used in signal handlers, does it really need to be multi-threaded? > > I might be able to simplify the thread-local stacks. 
But the global queue of fresh elements (that are then enqueued into the thread-local stacks) needs to be signal-handler safe (it is dequeued from in multiple signal handlers in multiple threads at the same time) and is pushed to from all threads at their safepoints. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073446031 From zgu at openjdk.org Mon May 5 13:29:59 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 5 May 2025 13:29:59 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> <-BaADrj9uoAliNH_hTcKmNIb_akhLY_UGsKtlRfC5wY=.5d726a8f-c253-4d3b-bf66-778e7bbc14ae@github.com> Message-ID: On Mon, 5 May 2025 13:25:32 GMT, Johannes Bechberger wrote: >>> I see. If it is only used in signal handlers, does it really need to be multi-threaded? >> >> I might be able to simplify the thread-local stacks. > > But the global queue of fresh elements (that are then enqueued into the thread-local stacks) needs to signal handler safe (dequeue in multiple signal handlers in multiple thread at the same time) and is pushed to from all threads at their safepoints. You certainly can use a pre-allocated array as an allocation pool, with an additional `next` field. However, the challenge is how to deal with the `ABA` problem (or does it have an `ABA` problem?). I am not sure you can use `GlobalCounter` inside signal handlers.
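
One common way to sidestep ABA with a pre-allocated pool is to link nodes by array index and pack a version tag next to the top index in a single atomic word. The sketch below illustrates that idea only. It is not HotSpot's `LockFreeStack` and not what the PR does: `PoolNode`, `TaggedIndexStack`, `kNil` and `kCapacity` are hypothetical names, and `std::atomic` is used in place of HotSpot's `Atomic`. Whether a 64-bit atomic is lock-free (and therefore a candidate for use in signal handlers) depends on the platform, and the 32-bit tag can in principle wrap around.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical fixed-capacity pool; all names here are illustrative.
constexpr uint32_t kNil = 0xFFFFFFFFu;  // "no node" marker
constexpr uint32_t kCapacity = 16;

struct PoolNode {
  std::atomic<uint32_t> next{kNil};  // index of the next node, not a pointer
  int payload{0};
};

class TaggedIndexStack {
  PoolNode* _nodes;
  // Low 32 bits: index of the top node. High 32 bits: version tag that is
  // bumped on every successful update, so a recycled index no longer compares
  // equal to a stale snapshot of the head.
  std::atomic<uint64_t> _head;

  static uint64_t pack(uint32_t index, uint32_t tag) {
    return (static_cast<uint64_t>(tag) << 32) | index;
  }
  static uint32_t index_of(uint64_t head) { return static_cast<uint32_t>(head); }
  static uint32_t tag_of(uint64_t head) { return static_cast<uint32_t>(head >> 32); }

public:
  explicit TaggedIndexStack(PoolNode* nodes) : _nodes(nodes), _head(pack(kNil, 0)) {}

  // Pushes the pool node at idx (caller guarantees idx < capacity).
  void push(uint32_t idx) {
    uint64_t old_head = _head.load();
    uint64_t new_head;
    do {
      _nodes[idx].next.store(index_of(old_head));
      new_head = pack(idx, tag_of(old_head) + 1);
    } while (!_head.compare_exchange_weak(old_head, new_head));
  }

  // Pops the top node and returns its index, or kNil if the stack is empty.
  uint32_t pop() {
    uint64_t old_head = _head.load();
    uint64_t new_head;
    do {
      uint32_t idx = index_of(old_head);
      if (idx == kNil) return kNil;
      // Reading next of a node another thread just popped is harmless here:
      // pool nodes are never freed, and the tag makes the stale CAS fail.
      uint32_t next = _nodes[idx].next.load();
      new_head = pack(next, tag_of(old_head) + 1);
    } while (!_head.compare_exchange_weak(old_head, new_head));
    return index_of(old_head);
  }

  uint32_t version() const { return tag_of(_head.load()); }
};
```

If another thread pops A, pops B and pushes A again, the top index is A once more but the tag has advanced, so a CAS based on the earlier snapshot fails and retries. The default sequentially consistent ordering is used throughout to keep the sketch simple.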
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073446877 From jbechberger at openjdk.org Mon May 5 13:29:59 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 5 May 2025 13:29:59 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> <-BaADrj9uoAliNH_hTcKmNIb_akhLY_UGsK tlRfC5wY=.5d726a8f-c253-4d3b-bf66-778e7bbc14ae@github.com> Message-ID: On Mon, 5 May 2025 13:26:02 GMT, Zhengyu Gu wrote: >> But the global queue of fresh elements (that are then enqueued into the thread-local stacks) needs to signal handler safe (dequeue in multiple signal handlers in multiple thread at the same time) and is pushed to from all threads at their safepoints. > > You certainly can use pre-allocated array as allocation pool, with additional `next` field. However, the challenge is how to deal with `ABA` problem (or does it have `ABA` problem?). I am not sure you can use `GlobalCounter` inside signal handlers. This is why I replaced it with a queue that doesn't have these problems. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073449392 From jbechberger at openjdk.org Mon May 5 13:30:01 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 5 May 2025 13:30:01 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v46] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 12:49:00 GMT, Zhengyu Gu wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix missing threads and other things > > src/hotspot/share/runtime/handshake.hpp line 135: > >> 133: bool can_run(bool allow_suspend, bool check_async_exception); >> 134: >> 135: bool has_operation(); > > `has_operation()` was an inline method, it was called from `SafepointMechanism::should_process()` (now, it calls `can_run`). I believe `SafepointMechanism::should_process()` is a latency sensitive and inlined method. This seems to lead to a circular dependency. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073434009 From zgu at openjdk.org Mon May 5 13:33:04 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 5 May 2025 13:33:04 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2] In-Reply-To: References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> <8wZcqTFOczjkZaHuWC4HLMb1vzTcUsiflcbmejfWnWA=.f7889541-ca80-4ee4-acf0-90aa39bb82f4@github.com> Message-ID: <8usuRM0p901bZWiMnLw6ioL7sPCcpULSWy7PZzfChlQ=.f7c3303e-39d0-4e93-a28c-bccbfedc1242@github.com> On Mon, 5 May 2025 13:23:48 GMT, Martin Doerr wrote: >> `memory_order_conservative` is [Strong two-way memory barrier](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/atomic.hpp#L47), that means all reads and writes *cannot* float pass the barrier. >> >> If `Atomic::load()` observes a store by conservative order, then precedent reads and writes of the store must have completed. 
> > First of all, nothing on the writer's side can fix the reader's side. > > We're basically relying on volatile load + dependency chain on the reader's side. This is very likely to work, but C++ memory model specialists may find this questionable. Are we on the same page? which `load` you are talking about? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2073455957 From zgu at openjdk.org Mon May 5 13:34:55 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 5 May 2025 13:34:55 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> <-BaADrj9uoAliNH_hTcKmNIb_akhLY_UGsK tlRfC5wY=.5d726a8f-c253-4d3b-bf66-778e7bbc14ae@github.com> Message-ID: On Mon, 5 May 2025 13:27:26 GMT, Johannes Bechberger wrote: >> You certainly can use pre-allocated array as allocation pool, with additional `next` field. However, the challenge is how to deal with `ABA` problem (or does it have `ABA` problem?). I am not sure you can use `GlobalCounter` inside signal handlers. > > This is why I replaced it with a queue that doesn't have these problems. Which queue implementation you were referring? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073459705 From zgu at openjdk.org Mon May 5 13:34:57 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 5 May 2025 13:34:57 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v46] In-Reply-To: References: Message-ID: <6XiZmQFwOTtDHxKtxkJgSQOvqAh5zkCOG_kvYpHeRxE=.1bd831a2-9976-4983-be54-eacebea48f29@github.com> On Mon, 5 May 2025 13:18:27 GMT, Johannes Bechberger wrote: >> src/hotspot/share/runtime/handshake.hpp line 135: >> >>> 133: bool can_run(bool allow_suspend, bool check_async_exception); >>> 134: >>> 135: bool has_operation(); >> >> `has_operation()` was an inline method, it was called from `SafepointMechanism::should_process()` (now, it calls `can_run`). I believe `SafepointMechanism::should_process()` is a latency sensitive and inlined method. > > This seems to lead to a circular dependency. Hmmm, `has_operation()` was inlined. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073458800 From jbechberger at openjdk.org Mon May 5 13:41:56 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 5 May 2025 13:41:56 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: <4G9zGVgfy-GCJkVLYgplDOkjKJrIfFLA7dzfWd1qocc=.19579c4b-7631-4856-b3b4-8901af824b18@github.com> <-BaADrj9uoAliNH_hTcKmNIb_akhLY_UGsK tlRfC5wY=.5d726a8f-c253-4d3b-bf66-778e7bbc14ae@github.com> Message-ID: On Mon, 5 May 2025 13:32:25 GMT, Zhengyu Gu wrote: >> This is why I replaced it with a queue that doesn't have these problems. > > Which queue implementation you were referring? https://github.com/openjdk/jdk/pull/20752/commits/ed67e11a6ccd578d46adc0506e0e80715c375356#diff-6071b4b14d8299f3025b041aa51069d5023e4fadf36d0c4aad5da6008995caa7R244 Developed for a previous version of the code. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073470131 From jbechberger at openjdk.org Mon May 5 13:41:57 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 5 May 2025 13:41:57 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v46] In-Reply-To: <6XiZmQFwOTtDHxKtxkJgSQOvqAh5zkCOG_kvYpHeRxE=.1bd831a2-9976-4983-be54-eacebea48f29@github.com> References: <6XiZmQFwOTtDHxKtxkJgSQOvqAh5zkCOG_kvYpHeRxE=.1bd831a2-9976-4983-be54-eacebea48f29@github.com> Message-ID: <489rqeUgKYHDcb7DGNGs78WshL8rMYkSeNWCnl-Y3lQ=.5e998628-6654-4da5-99eb-e552f225ee7d@github.com> On Mon, 5 May 2025 13:31:52 GMT, Zhengyu Gu wrote: >> This seems to lead to a circular dependency. > > Hmmm, `has_operation()` was inlined. It was, but it cannot anymore, because it depends on the JfrThreadLocal now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073468313 From mdoerr at openjdk.org Mon May 5 13:42:57 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 5 May 2025 13:42:57 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2] In-Reply-To: <8usuRM0p901bZWiMnLw6ioL7sPCcpULSWy7PZzfChlQ=.f7c3303e-39d0-4e93-a28c-bccbfedc1242@github.com> References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> <8wZcqTFOczjkZaHuWC4HLMb1vzTcUsiflcbmejfWnWA=.f7889541-ca80-4ee4-acf0-90aa39bb82f4@github.com> <8usuRM0p901bZWiMnLw6ioL7sPCcpULSWy7PZzfChlQ=.f7c3303e-39d0-4e93-a28c-bccbfedc1242@github.com> Message-ID: <_BaYLilP15703bQORg7fW6LDv_Omm01Go0z1sHFkYxo=.32a8d1e1-487a-4a66-ae15-5e2aa541a0a0@github.com> On Mon, 5 May 2025 13:30:29 GMT, Zhengyu Gu wrote: >> First of all, nothing on the writer's side can fix the reader's side. >> >> We're basically relying on volatile load + dependency chain on the reader's side. This is very likely to work, but C++ memory model specialists may find this questionable. > > Are we on the same page? 
which `load` you are talking about? I'm talking about `Atomic::load()` which is essentially a volatile load without ordering. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2073472029 From rkennke at openjdk.org Mon May 5 13:43:23 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 5 May 2025 13:43:23 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI Message-ID: In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols. Testing: - [x] extensive testing with https://github.com/oracle/graal/pull/10904 ------------- Commit messages: - Fix ordering of includes - Remove unnecessary stuff - Revert unrelated changes - Revert unrelated changes - Merge branch 'master' into graal-shenandoah-support - Support for Shenandoah card-table barriers in JVMCI - Revert "8321373: Build should use LC_ALL=C.UTF-8" - Graal Shenandoah support Changes: https://git.openjdk.org/jdk/pull/25001/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356075 Stats: 59 lines in 6 files changed: 58 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25001.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25001/head:pull/25001 PR: https://git.openjdk.org/jdk/pull/25001 From hgreule at openjdk.org Mon May 5 13:46:47 2025 From: hgreule at openjdk.org (Hannes Greule) Date: Mon, 5 May 2025 13:46:47 GMT Subject: RFR: 8352075: Perf regression accessing fields [v4] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> Message-ID: On Mon, 5 May 2025 12:51:47 GMT, Radim Vansa wrote: > For the record, in the ideal case I would like to backport this into JDK 21 as well. Do you think that the change in iteration order would be problematic for that? 
I think that also affects iteration order of Java methods like `Class#getDeclaredFields()` etc.? While the order is unspecified there, I'm pretty certain that it would break existing code. But even worse, [JVM TI specifies that `GetClassFields` returns fields in the order they occur in the class file](https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#GetClassFields). I assume this is currently broken? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2851062633 From zgu at openjdk.org Mon May 5 13:48:57 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 5 May 2025 13:48:57 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2] In-Reply-To: References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> Message-ID: On Thu, 1 May 2025 06:12:16 GMT, Aleksey Shipilev wrote: >> Maybe @shipilev meant `memory_order_release`? Anyway, I guess we don't need to optimize it. > > I saw no point in enforcing memory ordering mode here, as it looks like we only did `ThreadCritical` for mutual exclusion. Note that we do not have a matching acquire on list traversals, so seqcst/release on list additions would be incomplete. That only reinforces my original thinking: we are riding on memory ordering given by something else, I'd guess the initialization sequence itself. > > But I won't quibble, it is a very minor optimization. I don't believe we need reader side barrier, given the `conservative` order on writer side. @shipilev any opinion? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2073483451 From dnsimon at openjdk.org Mon May 5 13:53:50 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 5 May 2025 13:53:50 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI In-Reply-To: References: Message-ID: On Fri, 2 May 2025 10:35:03 GMT, Roman Kennke wrote: > In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols. > > Testing: > - [x] extensive testing with https://github.com/oracle/graal/pull/10904 LGTM ------------- Marked as reviewed by dnsimon (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25001#pullrequestreview-2814890860 From rvansa at openjdk.org Mon May 5 13:59:48 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 5 May 2025 13:59:48 GMT Subject: RFR: 8352075: Perf regression accessing fields [v4] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> Message-ID: On Mon, 5 May 2025 13:44:32 GMT, Hannes Greule wrote: >> For the record, in the ideal case I would like to backport this into JDK 21 as well. Do you think that the change in iteration order would be problematic for that? > >> For the record, in the ideal case I would like to backport this into JDK 21 as well. Do you think that the change in iteration order would be problematic for that? > > I think that also affects iteration order of Java methods like `Class#getDeclaredFields()` etc.? While the order is unspecified there, I'm pretty certain that it would break existing code. But even worse, [JVM TI specifies that `GetClassFields` returns fields in the order they occur in the class file](https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#GetClassFields). I assume this is currently broken? @SirYwell Oh, that's correct. 
I haven't noticed that requirement; I guess that this means that the PR needs to be updated. How does the order of iteration cooperate with `@Contended`, though? In `FieldLayoutBuilder::regular_field_sorting` we separate static and contended fields; doesn't that break the requirement? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2851099200 From mgronlun at openjdk.org Mon May 5 14:11:47 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 5 May 2025 14:11:47 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v15] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: <2hhZtg0s_JR2jF21GSogLB_giDKBqFa6wwcrdGMdXJI=.c3d46749-caae-4d00-99e4-f017686c6f56@github.com> On Mon, 5 May 2025 08:09:10 GMT, Johannes Bechberger wrote: > Is there the possibility of adding a `bias` flag to the ExecutionSample events to record when an event has a clear safepoint bias? This could mark all samples where the sampler falls back on the safepoint frame. The next phase, called "Sampling Accuracy," will address how to monitor and improve on safepoint biases and failures. It might involve separate events for failure reason tracking, so I prefer to delay the decision until then. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2851136287 From hgreule at openjdk.org Mon May 5 14:17:47 2025 From: hgreule at openjdk.org (Hannes Greule) Date: Mon, 5 May 2025 14:17:47 GMT Subject: RFR: 8352075: Perf regression accessing fields [v4] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> Message-ID: On Mon, 5 May 2025 13:57:10 GMT, Radim Vansa wrote: > How does the order of iteration cooperate with `@Contended`, though? 
In `FieldLayoutBuilder::regular_field_sorting` we separate static and contended fields; doesn't that break the requirement? Hm, I'm not familiar enough with the code there. There seem to be some rather basic tests here https://github.com/openjdk/jdk/tree/1501a5e41e59162a374cf5b8cfc37faced48a6ed/test/hotspot/jtreg/serviceability/jvmti/GetClassFields and here https://github.com/openjdk/jdk/tree/1501a5e41e59162a374cf5b8cfc37faced48a6ed/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields (I think https://github.com/openjdk/jdk/blob/1501a5e41e59162a374cf5b8cfc37faced48a6ed/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007/getclfld007.cpp#L129 actually tests the order), but nothing with `@Contended` involved. Maybe it is worth expanding the test coverage first? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2851151806 From jbechberger at openjdk.org Mon May 5 14:22:07 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 5 May 2025 14:22:07 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v47] In-Reply-To: References: Message-ID: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> > This is the code for the [JEP draft: CPU Time based profiling for JFR]. > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). > > A version based on the cooperative sampling JEP can be found [here](https://github.com/parttimenerd/jdk/tree/parttimenerd_cooperative_cpu_time_sampler).
Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Simplify local trace stack ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20752/files - new: https://git.openjdk.org/jdk/pull/20752/files/ed67e11a..93fe0e97 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20752&range=46 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20752&range=45-46 Stats: 91 lines in 4 files changed: 36 ins; 33 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/20752.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20752/head:pull/20752 PR: https://git.openjdk.org/jdk/pull/20752 From coleenp at openjdk.org Mon May 5 14:56:49 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 5 May 2025 14:56:49 GMT Subject: RFR: 8352075: Perf regression accessing fields [v4] In-Reply-To: <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> Message-ID: On Mon, 5 May 2025 06:51:31 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. 
On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)!
This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Move constant to static final var For Methods we sort them alphabetically but have a field in InstanceKlass called method_ordering to preserve the original order for JVMTI. This would have to have the same thing. I'm wondering if instead this patch could create a cache of the FieldInfoStream in InstanceKlass for certain heuristics. It would add a pointer to InstanceKlass that will be null when there's nothing to cache, but could be used to improve performance of this edge case. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2851271657 From coleenp at openjdk.org Mon May 5 15:02:52 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 5 May 2025 15:02:52 GMT Subject: RFR: 8352075: Perf regression accessing fields [v4] In-Reply-To: <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> Message-ID: On Mon, 5 May 2025 06:51:31 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. 
The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)!
This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Move constant to static final var Also should probably add your test to test/micro/org/openjdk/bench as a JMH ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2851287976 From shade at openjdk.org Mon May 5 15:04:10 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 5 May 2025 15:04:10 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v3] In-Reply-To: References: Message-ID: > In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: > 1. Time spent before queuing: shows the compilation queue bottlenecks > 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load > 3. Time spent actually compiling: shows the per-method compilation costs > > We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). > > The difference from the output format we ship in Leyden: > 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. > 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. 
> > See the sample `-XX:+PrintCompilation` output in the comments. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `compiler` > - [ ] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: - Add legend - Merge branch 'master' into JDK-8356027-print-compilation-timings - Test TestDuplicatedLateInliningOutput.java - More touchups - Fix TypeProfileFinalMethod as well - Fix inline tree printing - Touchups - Fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24984/files - new: https://git.openjdk.org/jdk/pull/24984/files/1a3b5a31..4eb0dde9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=01-02 Stats: 15705 lines in 466 files changed: 12161 ins; 2112 del; 1432 mod Patch: https://git.openjdk.org/jdk/pull/24984.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24984/head:pull/24984 PR: https://git.openjdk.org/jdk/pull/24984 From pchilanomate at openjdk.org Mon May 5 15:10:52 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 5 May 2025 15:10:52 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v6] In-Reply-To: References: Message-ID: On Fri, 2 May 2025 20:53:30 GMT, Serguei Spitsyn wrote: >> This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. >> >> Testing: Ran mach5 tiers 1-6. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: patch from Patricio with alternate approach Thanks for adopting the suggestion Serguei! 
src/hotspot/share/prims/jvmtiEnv.cpp line 1124: > 1122: oop carrier_thread = java_lang_VirtualThread::carrier_thread(thread_oop); > 1123: java_thread = carrier_thread == nullptr ? nullptr : java_lang_Thread::thread(carrier_thread); > 1124: } Nit: extra spaces at the end. There are a couple of other instances of this in this file shown by jcheck. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1798: > 1796: return JVMTI_ERROR_THREAD_SUSPENDED; > 1797: } > 1798: if (!java_thread->java_suspend(single_suspend)) { We could use `is_virtual && single_suspend` (same in resume) and change `_handshakee->is_vthread_mounted()` to be an assert in `HandshakeState::set_suspended()`. ------------- PR Review: https://git.openjdk.org/jdk/pull/24269#pullrequestreview-2815125142 PR Review Comment: https://git.openjdk.org/jdk/pull/24269#discussion_r2073621313 PR Review Comment: https://git.openjdk.org/jdk/pull/24269#discussion_r2073634025 From mdoerr at openjdk.org Mon May 5 15:23:49 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 5 May 2025 15:23:49 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v14] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: <1Rvls0nhGHse5bHwtvPjQqQ9dVkXd2hPWrbKL2hv-ow=.90e4bb43-6e29-48ba-beef-c239e16953e8@github.com> On Wed, 30 Apr 2025 13:50:53 GMT, Martin Doerr wrote: > Can we let _last_sender_Java_fp be a state field that can be tested? I still couldn't hit any failures or errors with my simple version, but I understand that it might be problematic. I have an implementation: https://github.com/TheRealMDoerr/jdk/commit/b2f83fae262f129f864e109d7adce169e28f0c7c Please take a look. I hope we don't need more ;-) I'm planning to run more tests when I find more time. If `_last_sender_Java_fp` is needed on all platforms, shouldn't it be better moved to shared javaFrameAnchor.hpp and javaThread.hpp?
------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2851350228 From shade at openjdk.org Mon May 5 15:27:47 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 5 May 2025 15:27:47 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v2] In-Reply-To: References: <40AnyQm_eXzMeoVC5lmbs1CaVYkMJwOdfsDxgx7S5t0=.5a73af04-4b5a-4a8d-a4b8-166cfd912977@github.com> Message-ID: On Fri, 2 May 2025 16:27:27 GMT, Ashutosh Mehra wrote: > While you are refactoring this, if you can also add a title to the `PrintCompilation` output. like the following, it would be immensely helpful as it would avoid the need to dig up the code to understand what these numbers stand for. Added in new commit. I thought about printing the legend every n-th time, but there is a major wrinkle: the generic `print_impl` gets passed a stream, so we do not know if we have printed header there or not. Or whether indeed we should print a header there. So I opted to print it once from compiler initialization code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2851362344 From shade at openjdk.org Mon May 5 15:38:46 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 5 May 2025 15:38:46 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI In-Reply-To: References: Message-ID: <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com> On Fri, 2 May 2025 10:35:03 GMT, Roman Kennke wrote: > In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols. > > Testing: > - [x] extensive testing with https://github.com/oracle/graal/pull/10904 A few questions: src/hotspot/share/gc/shenandoah/shenandoahRuntime.hpp line 42: > 40: static void pre_barrier(JavaThread* thread, oopDesc* orig) { > 41: write_ref_field_pre(orig, thread); > 42: } So, why not export `write_ref_field_pre`, instead of introducing this new method?
Style/cleanliness, or something else? I am asking, because every time we add a new stub here, we would need to record it in `AOTCache` tables for Leyden benefit. src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp line 240: > 238: cardtable_shift = CardTable::card_shift(); > 239: } else if (bs->is_a(BarrierSet::ShenandoahBarrierSet)) { > 240: cardtable_shift = CardTable::card_shift(); I understand the barrier code does not use `cardtable_start_address`, but should we still initialize it here to `nullptr`? ------------- PR Review: https://git.openjdk.org/jdk/pull/25001#pullrequestreview-2815217376 PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073674847 PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073678010 From asmehra at openjdk.org Mon May 5 15:52:50 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 5 May 2025 15:52:50 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v3] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 15:04:10 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. 
>> 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Add legend > - Merge branch 'master' into JDK-8356027-print-compilation-timings > - Test TestDuplicatedLateInliningOutput.java > - More touchups > - Fix TypeProfileFinalMethod as well > - Fix inline tree printing > - Touchups > - Fix lgtm ------------- Marked as reviewed by asmehra (Committer). PR Review: https://git.openjdk.org/jdk/pull/24984#pullrequestreview-2812413599 From asmehra at openjdk.org Mon May 5 15:52:50 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 5 May 2025 15:52:50 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v2] In-Reply-To: References: <40AnyQm_eXzMeoVC5lmbs1CaVYkMJwOdfsDxgx7S5t0=.5a73af04-4b5a-4a8d-a4b8-166cfd912977@github.com> Message-ID: On Mon, 5 May 2025 15:25:19 GMT, Aleksey Shipilev wrote: > So I opted to print it once from compiler initialization code. I think that is good enough. Thanks for adding the legend. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2851432820 From asmehra at openjdk.org Mon May 5 15:52:53 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 5 May 2025 15:52:53 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v2] In-Reply-To: <40AnyQm_eXzMeoVC5lmbs1CaVYkMJwOdfsDxgx7S5t0=.5a73af04-4b5a-4a8d-a4b8-166cfd912977@github.com> References: <40AnyQm_eXzMeoVC5lmbs1CaVYkMJwOdfsDxgx7S5t0=.5a73af04-4b5a-4a8d-a4b8-166cfd912977@github.com> Message-ID: On Thu, 1 May 2025 19:19:32 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. 
>> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Test TestDuplicatedLateInliningOutput.java src/hotspot/share/compiler/compileBroker.cpp line 2354: > 2352: task->mark_finished(os::elapsed_counter()); > 2353: > 2354: if (failure_reason != nullptr) { I see this code has been moved into the block that handles the case when JVMCI is not used. Earlier this code was executed unconditionally. Is this code not applicable for JVMCI case now? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24984#discussion_r2071851335 From rkennke at openjdk.org Mon May 5 15:54:29 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 5 May 2025 15:54:29 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v2] In-Reply-To: References: Message-ID: > In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols. 
> > Testing: > - [x] extensive testing with https://github.com/oracle/graal/pull/10904 Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Initialize cardtable_start_address to nullptr ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25001/files - new: https://git.openjdk.org/jdk/pull/25001/files/6487a9f7..c95313a9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25001.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25001/head:pull/25001 PR: https://git.openjdk.org/jdk/pull/25001 From rkennke at openjdk.org Mon May 5 15:54:29 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 5 May 2025 15:54:29 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v2] In-Reply-To: <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com> References: <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com> Message-ID: On Mon, 5 May 2025 15:31:59 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Initialize cardtable_start_address to nullptr > > src/hotspot/share/gc/shenandoah/shenandoahRuntime.hpp line 42: > >> 40: static void pre_barrier(JavaThread* thread, oopDesc* orig) { >> 41: write_ref_field_pre(orig, thread); >> 42: } > > So, why not export `write_ref_field_pre`, instead of introducing this new method? Style/cleanliness, or something else? I am asking, because every time we add a new stub here, we would need to record it in `AOTCache` tables for Leyden benefit. It's about the argument ordering. Graal expects the Thread* to be prepended, while other JITs call it with the Thread* appended.
I guess we could change other JIT calls to also prepend the thread, or change the interface to not pass the Thread* at all. I chose to follow G1 and export both variants. > src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp line 240: > >> 238: cardtable_shift = CardTable::card_shift(); >> 239: } else if (bs->is_a(BarrierSet::ShenandoahBarrierSet)) { >> 240: cardtable_shift = CardTable::card_shift(); > > I understand the barrier code does not use `cardtable_start_address`, but should we still initialize it here to `nullptr`? Good point, did that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073702873 PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073705091 From shade at openjdk.org Mon May 5 15:55:59 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 5 May 2025 15:55:59 GMT Subject: RFR: 8355627: Don't use ThreadCritical for EventLog list [v2] In-Reply-To: References: <2E351OV3CJhTo99AnXZvisLvkuTKrxY7Q_3_P9idRys=.d9e9742c-cb58-4700-8d07-3095004a0625@github.com> Message-ID: <6dgHBCQx6utnhpz22A-kICmE6UnpZzjxmJIVZhBMxug=.06c51857-0bd3-45a5-8bef-21f40f122675@github.com> On Mon, 5 May 2025 13:46:35 GMT, Zhengyu Gu wrote: >> I saw no point in enforcing memory ordering mode here, as it looks like we only did `ThreadCritical` for mutual exclusion. Note that we do not have a matching acquire on list traversals, so seqcst/release on list additions would be incomplete. That only reinforces my original thinking: we are riding on memory ordering given by something else, I'd guess the initialization sequence itself. >> >> But I won't quibble, it is a very minor optimization. > > I don't believe we need reader side barrier, given the `conservative` order on writer side. @shipilev any opinion? Well, if we are going fully pedantic here, we only need to make sure that the `EventLog::_next` is properly visible to the code that traverses the `EventLog` list. 
Realistically, this only matters if there are no synchronization points between those two, and I suspect there are plenty. E.g. this is the same thing that makes all other init code robust in practice: things that happen at init are good to go. If you want the path to be 100% bullet-proof and squeeze out some performance at the same time, then I think the `cmpxchg` to `Events::_logs` should be `release` and loads from there should be `acquire`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24954#discussion_r2073707374 From shade at openjdk.org Mon May 5 16:04:47 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 5 May 2025 16:04:47 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v2] In-Reply-To: References: <40AnyQm_eXzMeoVC5lmbs1CaVYkMJwOdfsDxgx7S5t0=.5a73af04-4b5a-4a8d-a4b8-166cfd912977@github.com> Message-ID: On Fri, 2 May 2025 16:18:23 GMT, Ashutosh Mehra wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Test TestDuplicatedLateInliningOutput.java > > src/hotspot/share/compiler/compileBroker.cpp line 2354: > >> 2352: task->mark_finished(os::elapsed_counter()); >> 2353: >> 2354: if (failure_reason != nullptr) { > > I see this code has been moved into the block that handles the case when JVMCI is not used. Earlier this code was executed unconditionally. Is this code not applicable for JVMCI case now? I moved this code because we need access to `ciEnv` to get access to inline messages. Their lifecycle also depends on the `ciEnv` lifecycle. But you are right, JVMCI does not get here now, which is not what I intended. I'll see what can be done here.
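Going back to the `Events::_logs` ordering point from the 8355627 reply above: a minimal sketch of what a `release` publication paired with `acquire` loads could look like. It uses `std::atomic` rather than HotSpot's own atomics, and `LogNode`, `g_logs`, `register_log` and `count_logs` are made-up stand-ins for illustration, not the code in the PR.

```cpp
#include <atomic>

// Stand-in for an EventLog: a node in an intrusive singly linked list.
struct LogNode {
  int id;
  LogNode* next;
};

// Stand-in for the Events::_logs list head.
std::atomic<LogNode*> g_logs{nullptr};

// Prepend a node. The release ordering on a successful CAS makes the
// store to node->next visible to any thread that later observes the
// new head through an acquire load.
void register_log(LogNode* node) {
  LogNode* head = g_logs.load(std::memory_order_relaxed);
  do {
    node->next = head;  // on CAS failure, head is refreshed and we retry
  } while (!g_logs.compare_exchange_weak(head, node,
                                         std::memory_order_release,
                                         std::memory_order_relaxed));
}

// Traverse the list. The acquire load of the head pairs with the
// release CAS above, so the next pointers of published nodes are
// safe to follow.
int count_logs() {
  int n = 0;
  for (LogNode* p = g_logs.load(std::memory_order_acquire); p != nullptr; p = p->next) {
    n++;
  }
  return n;
}
```

Nodes are only ever prepended and never removed in this sketch, which is why a single acquire load of the head is enough for the reader.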
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24984#discussion_r2073724644 From cslucas at openjdk.org Mon May 5 16:26:49 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 5 May 2025 16:26:49 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v2] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 15:54:29 GMT, Roman Kennke wrote: >> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols. >> >> Testing: >> - [x] extensive testing with https://github.com/oracle/graal/pull/10904 > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Initialize cardtable_start_address to nullptr LGTM. Thanks. ------------- Marked as reviewed by cslucas (Committer). PR Review: https://git.openjdk.org/jdk/pull/25001#pullrequestreview-2815365611 From shade at openjdk.org Mon May 5 16:37:00 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 5 May 2025 16:37:00 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v4] In-Reply-To: References: Message-ID: > In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: > 1. Time spent before queuing: shows the compilation queue bottlenecks > 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load > 3. Time spent actually compiling: shows the per-method compilation costs > > We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). > > The difference from the output format we ship in Leyden: > 1. This output prints before/after the compilation to maintain old behavior partially. 
The "before" printout is now prepended with `started` to clearly mark it as such. > 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. > > See the sample `-XX:+PrintCompilation` output in the comments. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `compiler` > - [x] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Revert the shared printing block ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24984/files - new: https://git.openjdk.org/jdk/pull/24984/files/4eb0dde9..750ac377 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=02-03 Stats: 32 lines in 1 file changed: 8 ins; 3 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/24984.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24984/head:pull/24984 PR: https://git.openjdk.org/jdk/pull/24984 From shade at openjdk.org Mon May 5 16:37:00 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 5 May 2025 16:37:00 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v2] In-Reply-To: References: <40AnyQm_eXzMeoVC5lmbs1CaVYkMJwOdfsDxgx7S5t0=.5a73af04-4b5a-4a8d-a4b8-166cfd912977@github.com> Message-ID: On Mon, 5 May 2025 16:00:42 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/compiler/compileBroker.cpp line 2354: >> >>> 2352: task->mark_finished(os::elapsed_counter()); >>> 2353: >>> 2354: if (failure_reason != nullptr) { >> >> I see this code has been moved into the block that handles the case when JVMCI is not used. Earlier this code was executed unconditionally. Is this code not applicable for JVMCI case now? 
> > I moved this code because we need access to `ciEnv` to get access to inline messages. Their lifecycle also depend on `ciEnv` lifecycle. But you are right, JVMCI does not get here now, which is not what I intended. I'll see what can be done here. Done. We have to pay with one additional copy, which we _can_ optimize if we really wanted, but I don't think we care as much, since this only matters when `-XX:+PrintInlining` is enabled. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24984#discussion_r2073781217 From shade at openjdk.org Mon May 5 16:50:46 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 5 May 2025 16:50:46 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v2] In-Reply-To: References: <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com> Message-ID: <92qsu_6Qj0xyyGYK8dDm97enzXvKNgj7obDEHnhVcds=.d4671e44-1ef0-4c79-a286-fa578a6eb25f@github.com> On Mon, 5 May 2025 15:49:32 GMT, Roman Kennke wrote: >> src/hotspot/share/gc/shenandoah/shenandoahRuntime.hpp line 42: >> >>> 40: static void pre_barrier(JavaThread* thread, oopDesc* orig) { >>> 41: write_ref_field_pre(orig, thread); >>> 42: } >> >> So, why not export `write_ref_field_pre`, instead of introducing this new method? Style/cleanliness, or something else? I am asking, because every time we add a new stub here, we would need to record it in `AOTCache` tables for Leyden benefit. > > It's about the argument ordering. Graal expects the Thread* to be prependend, while other JITs call it with the Thread* appended. I guess we could change other JIT calls to also prepend the thread, or change the interface to not pass the Thread* at all. I chose to follow G1 and export both variants. Oh, so this matches `JVMCIRuntime::write_barrier_pre` for G1 (weird place to have it, but oh well). Does Graal need the `Thread*` argument? I think this method is only called when SATB buffer is full. 
So the performance of this method is likely not affected by getting the current thread down in caller. So I think it would be more straight-forward to sharpen `ShenandoahRuntime::write_ref_field_pre` by dropping `Thread*` and then exporting that. Maybe also under the `SR::write_barrier_pre` name to be even more consistent for everything else. Maybe @JohnTortugo wants to clean up more mess in C2 related to this :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073800305 From shade at openjdk.org Mon May 5 16:50:48 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 5 May 2025 16:50:48 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v2] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 15:54:29 GMT, Roman Kennke wrote: >> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols. >> >> Testing: >> - [x] extensive testing with https://github.com/oracle/graal/pull/10904 > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Initialize cardtable_start_address to nullptr src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 137: > 135: ZGC_ONLY(static_field(CompilerToVM::Data, sizeof_ZStoreBarrierEntry, int)) \ > 136: SHENANDOAHGC_ONLY(static_field(CompilerToVM::Data, shenandoah_in_cset_fast_test_addr, address)) \ > 137: SHENANDOAHGC_ONLY(static_field(CompilerToVM::Data, shenandoah_region_size_bytes_shift,int)) \ Also indent trailing backslashes. src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 909: > 907: SHENANDOAHGC_ONLY(declare_function(ShenandoahRuntime::load_reference_barrier_weak_narrow)) \ > 908: SHENANDOAHGC_ONLY(declare_function(ShenandoahRuntime::load_reference_barrier_phantom)) \ > 909: SHENANDOAHGC_ONLY(declare_function(ShenandoahRuntime::load_reference_barrier_phantom_narrow)) \ Also indent trailing backslashes. 
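For illustration only, the kind of alignment being asked for in such macro tables looks roughly like this. The macro and field names below are invented for the sketch and are not the ones in `vmStructs_jvmci.cpp`.

```cpp
// A made-up X-macro table in the style of the JVMCI field lists.
// Every continuation backslash sits in the same column, so adding a
// longer entry later does not produce a ragged right edge.
#define DEMO_SHENANDOAH_FIELDS(static_field)                  \
  static_field(DemoData, in_cset_fast_test_addr,  void*)      \
  static_field(DemoData, region_size_bytes_shift, int)

// One possible consumer of the table: count its entries at compile time.
#define DEMO_COUNT_ONE(holder, name, type) + 1
constexpr int demo_field_count = 0 DEMO_SHENANDOAH_FIELDS(DEMO_COUNT_ONE);
```

The last entry of the table carries no backslash, since the macro definition ends there.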
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073801311 PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073801126 From rkennke at openjdk.org Mon May 5 16:58:01 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 5 May 2025 16:58:01 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v3] In-Reply-To: References: Message-ID: > In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols. > > Testing: > - [x] extensive testing with https://github.com/oracle/graal/pull/10904 Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Align backslashes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25001/files - new: https://git.openjdk.org/jdk/pull/25001/files/c95313a9..44344585 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=01-02 Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/25001.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25001/head:pull/25001 PR: https://git.openjdk.org/jdk/pull/25001 From rkennke at openjdk.org Mon May 5 16:58:01 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 5 May 2025 16:58:01 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v3] In-Reply-To: <92qsu_6Qj0xyyGYK8dDm97enzXvKNgj7obDEHnhVcds=.d4671e44-1ef0-4c79-a286-fa578a6eb25f@github.com> References: <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com> <92qsu_6Qj0xyyGYK8dDm97enzXvKNgj7obDEHnhVcds=.d4671e44-1ef0-4c79-a286-fa578a6eb25f@github.com> Message-ID: On Mon, 5 May 2025 16:46:46 GMT, Aleksey Shipilev wrote: >> It's about the argument ordering. Graal expects the Thread* to be prependend, while other JITs call it with the Thread* appended. 
I guess we could change other JIT calls to also prepend the thread, or change the interface to not pass the Thread* at all. I chose to follow G1 and export both variants. > > Oh, so this matches `JVMCIRuntime::write_barrier_pre` for G1 (weird place to have it, but oh well). > > Does Graal need the `Thread*` argument? > > I think this method is only called when SATB buffer is full. So the performance of this method is likely not affected by getting the current thread down in caller. So I think it would be more straight-forward to sharpen `ShenandoahRuntime::write_ref_field_pre` by dropping `Thread*` and then exporting that. Maybe also under the `SR::write_barrier_pre` name to be even more consistent for everything else. > > Maybe @JohnTortugo wants to clean up more mess in C2 related to this :) Graal does not need the Thread* argument, but the runtime code behind write_ref_pre() currently uses it. I agree, it does not look performance critical to pass it through. However, getting rid of it seems to blow the scope of this PR. I'd rather do this as a follow-up. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073807949 From rkennke at openjdk.org Mon May 5 16:58:02 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 5 May 2025 16:58:02 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v3] In-Reply-To: References: <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com> <92qsu_6Qj0xyyGYK8dDm97enzXvKNgj7obDEHnhVcds=.d4671e44-1ef0-4c79-a286-fa578a6eb25f@github.com> Message-ID: <_4ebbEF2UTOwozaLgA8ibDcCi1YF5I4rKA1NN3tBozs=.7ce738b0-7e79-4c93-bc4d-63f534ccc536@github.com> On Mon, 5 May 2025 16:51:39 GMT, Roman Kennke wrote: >> Oh, so this matches `JVMCIRuntime::write_barrier_pre` for G1 (weird place to have it, but oh well). >> >> Does Graal need the `Thread*` argument? >> >> I think this method is only called when SATB buffer is full. 
So the performance of this method is likely not affected by getting the current thread down in caller. So I think it would be more straight-forward to sharpen `ShenandoahRuntime::write_ref_field_pre` by dropping `Thread*` and then exporting that. Maybe also under the `SR::write_barrier_pre` name to be even more consistent for everything else. >> >> Maybe @JohnTortugo wants to clean up more mess in C2 related to this :) > > Graal does not need the Thread* argument, but the runtime code behind write_ref_pre() currently uses it. I agree, it does not look performance critical to pass it through. However, getting rid of it seems to blow the scope of this PR. I'd rather do this as a follow-up. Actually, I'd probably add the new entry for Graal without the Thread* argument now, and fix the others in a follow-up. Otherwise we need to deal with it on the Graal side again later once we change the entry points. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073813072 From shade at openjdk.org Mon May 5 17:03:47 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 5 May 2025 17:03:47 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v3] In-Reply-To: <_4ebbEF2UTOwozaLgA8ibDcCi1YF5I4rKA1NN3tBozs=.7ce738b0-7e79-4c93-bc4d-63f534ccc536@github.com> References: <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com> <92qsu_6Qj0xyyGYK8dDm97enzXvKNgj7obDEHnhVcds=.d4671e44-1ef0-4c79-a286-fa578a6eb25f@github.com> <_4ebbEF2UTOwozaLgA8ibDcCi1YF5I4rKA1NN3tBozs=.7ce738b0-7e79-4c93-bc4d-63f534ccc536@github.com> Message-ID: On Mon, 5 May 2025 16:55:36 GMT, Roman Kennke wrote: >> Graal does not need the Thread* argument, but the runtime code behind write_ref_pre() currently uses it. I agree, it does not look performance critical to pass it through. However, getting rid of it seems to blow the scope of this PR. I'd rather do this as a follow-up. 
> > Actually, I'd probably add the new entry for Graal without the Thread* argument now, and fix the others in a follow-up. Otherwise we need to deal with it on the Graal side again later once we change the entry points. OK, but that follow-up risks changing the JVMCI interface _again_? How about we introduce: static void write_barrier_pre(oopDesc* pre_val) { write_ref_field_pre(pre_val, JavaThread::current()); } ...and then the follow-up purges the old `write_ref_field_pre`? The implementation might need to be in `.cpp`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073820137 From mgronlun at openjdk.org Mon May 5 17:33:51 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 5 May 2025 17:33:51 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v16] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: On Mon, 5 May 2025 08:50:35 GMT, Markus Gr?nlund wrote: >> Greetings, >> >> This is the implementation of JEP [JDK-8350338 Cooperative JFR Sampling](https://bugs.openjdk.org/browse/JDK-8350338). >> >> Implementations in this change set are provided and have been tested on the following platforms: >> >> - windows-x64 >> - windows-x64-debug >> - linux-x64 >> - linux-x64-debug >> - macosx-x64 >> - macosx-x64-debug >> - linux-aarch64 >> - linux-aarch64-debug >> - macosx-aarch64 >> - macosx-aarch64-debug >> >> Testing: tier1-6, jdk_jfr, stress testing. >> >> Platform porters note: >> Some platform-specific code needs to be provided, mainly in the interpreter. 
Take a look at the following files for changes: >> >> - src/hotspot/cpu/x86/frame_x86.cpp >> - src/hotspot/cpu/x86/interp_masm_x86.cpp >> - src/hotspot/cpu/x86/interp_masm_x86.hpp >> - src/hotspot/cpu/x86/javaFrameAnchor_x86.hpp >> - src/hotspot/cpu/x86/macroAssembler_x86.cpp >> - src/hotspot/cpu/x86/macroAssembler_x86.hpp >> - src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp >> - src/hotspot/cpu/x86/templateTable_x86.cpp >> - src/hotspot/os_cpu/linux_x86/javaThread_linux_x86.hpp >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits: > > - Merge branch 'master' into 8352251 > - Configuration and test for jdk.SafepointLatency event > - include guards > - push back pd constants into pd code > - Attempt to build Windows-AARCH64 > - No invariants for sender_for_interpreter_frame > - zero > - Merge branch 'master' into 8352251 > - Refine SamplingLatency event description > - Update default.jfc > - ... and 9 more: https://git.openjdk.org/jdk/compare/8511220f...e448090e > > Can we let _last_sender_Java_fp be a state field that can be tested? > > I still couldn't hit any failures or errors with my simple version, but I understand that it might be problematic. > > I have an implementation: [TheRealMDoerr at b2f83fa](https://github.com/TheRealMDoerr/jdk/commit/b2f83fae262f129f864e109d7adce169e28f0c7c) Please take a look. I hope we don't need more ;-) I'm planning to run more tests when I find more time. > > If `_last_sender_Java_fp` is needed on all platforms, shouldn't it be better moved to shared javaFrameAnchor.hpp and javaThread.hpp? That is a good observation. I followed _last_java_FP, which is defined in platform-specific files. Are we not using _last_java_FP on all platforms?
------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2851761291 From mgronlun at openjdk.org Mon May 5 17:41:47 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 5 May 2025 17:41:47 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v16] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: On Mon, 5 May 2025 17:31:37 GMT, Markus Grönlund wrote: > > Can we let _last_sender_Java_fp be a state field that can be tested? > > I still couldn't hit any failures or errors with my simple version, but I understand that it might be problematic. > > I have an implementation: [TheRealMDoerr at b2f83fa](https://github.com/TheRealMDoerr/jdk/commit/b2f83fae262f129f864e109d7adce169e28f0c7c) Please take a look. I hope we don't need more ;-) I'm planning to run more tests when I find more time. > > If `_last_sender_Java_fp` is needed on all platforms, shouldn't it be better moved to shared javaFrameAnchor.hpp and javaThread.hpp? I removed the strongest assertions from jfrThreadSampling.cpp compute_top_frame() and compute_sender_frame(). Still, I would expect you to occasionally crash hard in the JfrVframeStream iterator during stack walking. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2851815322 From asmehra at openjdk.org Mon May 5 17:51:46 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 5 May 2025 17:51:46 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v4] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 16:37:00 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2.
Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Revert the shared printing block Marked as reviewed by asmehra (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24984#pullrequestreview-2815605291 From lmesnik at openjdk.org Mon May 5 17:55:54 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 5 May 2025 17:55:54 GMT Subject: RFR: 8347004: vmTestbase/metaspace/shrink_grow/ShrinkGrowTest/ShrinkGrowTest.java fails with CDS disabled Message-ID: The test fails with an OOME if CDS is disabled. It is not a regression; the test is just rarely executed in this mode. The fix is just to slightly increase Metaspace. Verified that the test now passes with CDS disabled + Xcomp.
(It fails only with Xcomp.) ------------- Commit messages: - 8347004 Changes: https://git.openjdk.org/jdk/pull/25046/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25046&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347004 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25046.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25046/head:pull/25046 PR: https://git.openjdk.org/jdk/pull/25046 From mgronlun at openjdk.org Mon May 5 18:18:48 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 5 May 2025 18:18:48 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v16] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: On Mon, 5 May 2025 17:38:44 GMT, Markus Grönlund wrote: > > Can we let _last_sender_Java_fp be a state field that can be tested? > > I still couldn't hit any failures or errors with my simple version, but I understand that it might be problematic. > > I have an implementation: [TheRealMDoerr at b2f83fa](https://github.com/TheRealMDoerr/jdk/commit/b2f83fae262f129f864e109d7adce169e28f0c7c) Please take a look. I hope we don't need more ;-) I'm planning to run more test when I find more time. > > If `_last_sender_Java_fp` is needed on all platforms, shouldn't it be better moved to shared javaFrameAnchor.hpp and javaThread.hpp? The implementation seems to be heading in the right direction, but it is missing locations in TemplateInterpreterGenerator and TemplateTable. Also, I do not see any pre-emptive move of the fp, which is required for fetching a correct (non-racy) CPU context.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2851919560 From rvansa at openjdk.org Mon May 5 18:39:49 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 5 May 2025 18:39:49 GMT Subject: RFR: 8352075: Perf regression accessing fields [v4] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> Message-ID: On Mon, 5 May 2025 15:00:04 GMT, Coleen Phillimore wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Move constant to static final var > > Also should probably add your test to test/micro/org/openjdk/bench as a JMH @coleenp I wouldn't mind keeping the order in InstanceKlass (or elsewhere), but it would increase memory usage - and through this go against the idea from https://bugs.openjdk.org/browse/JDK-8292818 that tries to reduce the memory footprint. Would that be acceptable? If methods are already sorted alphabetically, it would make sense for fields, too. > I'm wondering if instead this patch could create a cache of the FieldInfoStream in InstanceKlass for certain heuristics. If we'd have something like a complete n-ary tree or hashtable, it would most likely take several bytes per field. I am not sure how much memory can I sacrifice... Even if we can have the structure only for 'big' classes. 
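To make the memory trade-off above concrete, here is a minimal sketch of one possible shape: fields stay in declaration order, and only classes above a size threshold get a side index of 16-bit positions sorted by name, costing two bytes per field. This is not what the PR implements; `FieldEntry`, `FieldTable` and `kIndexThreshold` are hypothetical names, and `std::string` stands in for whatever symbol representation the VM would use.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Stand-in for one field record (not HotSpot's FieldInfo).
struct FieldEntry {
  std::string name;
  uint32_t offset;
};

class FieldTable {
 public:
  // Hypothetical cut-off: smaller classes are searched linearly.
  static constexpr size_t kIndexThreshold = 16;

  explicit FieldTable(std::vector<FieldEntry> fields) : _fields(std::move(fields)) {
    if (_fields.size() >= kIndexThreshold) {
      _sorted.resize(_fields.size());
      for (size_t i = 0; i < _fields.size(); i++) {
        _sorted[i] = static_cast<uint16_t>(i);
      }
      // Sort positions by field name; the field records themselves
      // keep their declaration order.
      std::sort(_sorted.begin(), _sorted.end(), [this](uint16_t a, uint16_t b) {
        return _fields[a].name < _fields[b].name;
      });
    }
  }

  // Returns the declaration-order position of the field, or -1.
  int find(const std::string& name) const {
    if (_sorted.empty()) {
      for (size_t i = 0; i < _fields.size(); i++) {
        if (_fields[i].name == name) return static_cast<int>(i);
      }
      return -1;
    }
    auto it = std::lower_bound(_sorted.begin(), _sorted.end(), name,
                               [this](uint16_t idx, const std::string& n) {
                                 return _fields[idx].name < n;
                               });
    if (it != _sorted.end() && _fields[*it].name == name) return *it;
    return -1;
  }

  // Extra memory spent on the index: two bytes per field, or zero.
  size_t index_bytes() const { return _sorted.size() * sizeof(uint16_t); }

 private:
  std::vector<FieldEntry> _fields;
  std::vector<uint16_t> _sorted;
};
```

A 16-bit position is enough here because a class file cannot declare more than 65535 fields; whether two bytes per field on large classes is an acceptable cost is exactly the open question in the message above.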
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2851974225 From mdoerr at openjdk.org Mon May 5 19:50:49 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 5 May 2025 19:50:49 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v16] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: On Mon, 5 May 2025 17:31:37 GMT, Markus Grönlund wrote: > Are we not using _last_java_FP on all platforms? The field `_last_Java_fp` is only needed on platforms which don't have a Back Chain. > Implementation seems to be in the correct direction, but it's missing locations in TemplateInterpreterGenerator and TemplateTable. Yeah, I'll look into them later when the first part is correct. > Also, I do not see any pre-emptive move of the fp, which is required for fetching a correct (non-racy) CPU context. Isn't this problem already solved by using the `_last_sender_Java_fp`? Why do we need both, the frame pop before the safepoint check and the _last_sender_Java_fp trick? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2852162604 From mgronlun at openjdk.org Mon May 5 20:00:48 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 5 May 2025 20:00:48 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v16] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: On Mon, 5 May 2025 19:48:35 GMT, Martin Doerr wrote: > Isn't this problem already solved by using the `_last_sender_Java_fp`? Why do we need both, the frame pop before the safepoint check and the _last_sender_Java_fp trick? Because the sampler fetches the CPU context for threads running in state "_thread_in_Java". Without it, you can sample after the safepoint poll test is issued, but before the frame is popped.
That sampled frame will then represent something that is being removed and, if the sender issues another call, overwritten. It may help to think about how JIT-compiled methods work: they pop their frames before issuing the method-return safepoint poll test. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2852187118 From mdoerr at openjdk.org Mon May 5 20:22:49 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 5 May 2025 20:22:49 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v16] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: On Mon, 5 May 2025 19:56:43 GMT, Markus Grönlund wrote: > > Isn't this problem already solved by using the `_last_sender_Java_fp`? Why do we need both, the frame pop before the safepoint check and the _last_sender_Java_fp trick? > > Because the sampler fetches the CPU context for threads running in state "_thread_in_Java". Without it, you can sample after the safepoint poll test is issued, but before the frame is popped. That sampled frame will then represent something that is being removed and, if the sender issues another call, overwritten. > > It may help to think about how JIT-compiled methods work: they pop their frames before issuing the method-return safepoint poll test. Ok. This makes sense. Thanks for the explanation! Just trying to understand: What if we have the frame pop before the safepoint check like in JIT compiled code. Do we still need the `_last_sender_Java_fp` trick in this case (assuming we handle it like in the JIT compiled case and use `StackWatermarkSet::after_unwind` in `InterpreterRuntime::at_unwind`)?
------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2852238016 From rkennke at openjdk.org Mon May 5 20:25:27 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 5 May 2025 20:25:27 GMT Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v4] In-Reply-To: References: Message-ID: > In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols. > > Testing: > - [x] extensive testing with https://github.com/oracle/graal/pull/10904 Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Simplify pre-barrier runtime entry ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25001/files - new: https://git.openjdk.org/jdk/pull/25001/files/44344585..41084f3e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=02-03 Stats: 8 lines in 3 files changed: 4 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25001.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25001/head:pull/25001 PR: https://git.openjdk.org/jdk/pull/25001 From mgronlun at openjdk.org Mon May 5 21:00:50 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 5 May 2025 21:00:50 GMT Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v16] In-Reply-To: References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com> Message-ID: On Mon, 5 May 2025 20:19:51 GMT, Martin Doerr wrote: > > > Isn't this problem already solved by using the `_last_sender_Java_fp`? Why do we need both, the frame pop before the safepoint check and the _last_sender_Java_fp trick? > > > > > > Because the sampler fetches the CPU context for threads running in state "_thread_in_Java". Without it, you can sample after the safepoint poll test is issued, but before the frame is popped. 
That sampled frame will represent something now being removed, and if the sender issues another call, also overwritten. > > It may help to think about how the JIT methods works - they pop their frames before issuing the method return safepoint poll test. > > Ok. This makes sense. Thanks for the explanation! Just trying to understand: What if we have the frame pop before the safepoint check like in JIT compiled code. Do we still need the `_last_sender_Java_fp` trick in this case (assuming we handle it like in the JIT compiled case and use `StackWatermarkSet::after_unwind` in `InterpreterRuntime::at_unwind`)? Yes, because StackWatermarkSet unwind needs the "current frame" as a starting point (for Interpreter frames, it's using fp instead of sp). That's why the ljf is still set "normally". But we add the _last_java_sender_fp specifically for the JFR sampler at this and two other sites. The JFR sampler selects the ljf over the CPU context. But in this case, because of StackWatermarkSet::after_unwind, we cannot use the _last_java_fp StackWatermarkSet is using. That would sample the frame that is about to pop. We need to sample the sender that we are returning to. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2852320337 From kvn at openjdk.org Mon May 5 21:15:46 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 5 May 2025 21:15:46 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v4] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 16:37:00 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3.
Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for the `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. The output is a raw number in microseconds. In the Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Revert the shared printing block "W=time spent in waiting to be put on compilation queue" I am not sure about this since it is mostly 0s in normal execution. I understand that you already have the values and that they are easy to calculate, but the output becomes complex and you may forget what the columns mean. I thought about using `-XX:+Verbose` to gate it, but it is a debug flag. Maybe we can print it only for installing AOT code? Changes are fine otherwise. I will test them.
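For readers following the thread, the three timings under discussion (time before queuing, time in the queue, time compiling) can be illustrated with a minimal sketch. This is not the code from the PR: it assumes four hypothetical per-task timestamps in microseconds, and the struct, field, and function names are invented for illustration and do not correspond to HotSpot's `CompileTask`.

```cpp
#include <cstdint>
#include <string>

// Hypothetical per-task timestamps, all in microseconds.
// These names are assumptions for illustration, not HotSpot fields.
struct TaskTimestamps {
  int64_t created_us;   // compilation request created
  int64_t queued_us;    // task placed on the compilation queue
  int64_t started_us;   // compiler thread picked the task up
  int64_t finished_us;  // compilation finished
};

// The three durations named in the PR description.
struct TaskTimings {
  int64_t before_queue_us;  // time spent before queuing
  int64_t in_queue_us;      // time spent in the queue
  int64_t compiling_us;     // time spent actually compiling
};

// Each duration is the difference between two adjacent timestamps.
TaskTimings compute_timings(const TaskTimestamps& t) {
  return TaskTimings{
    t.queued_us - t.created_us,
    t.started_us - t.queued_us,
    t.finished_us - t.started_us
  };
}

// Emit raw microsecond numbers separated by spaces, with no prepended
// characters, so that scripts can split the columns directly.
std::string format_timings(const TaskTimings& d) {
  return std::to_string(d.before_queue_us) + " " +
         std::to_string(d.in_queue_us) + " " +
         std::to_string(d.compiling_us);
}
```

In this sketch a task that is queued immediately has a first column of 0, which matches the observation above that the wait-before-queue value is mostly 0 in normal execution.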
------------- PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2852352326 PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2852353316 From jiangli at openjdk.org Mon May 5 21:36:26 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Mon, 5 May 2025 21:36:26 GMT Subject: RFR: 8356209: Problemlist failed gtests on static-jdk Message-ID: Please review this PR that problemlist's following gtests on static-jdk. These test binaries dynamically link with JDK/VM native libraries and fail on static-jdk as the runtime cannot find the required shared libraries. gtest/GTestWrapper gtest/LargePageGtests#use-large-pages gtest/LargePageGtests#use-large-pages-1G gtest/LockStackGtests gtest/MetaspaceGtests#no-ccs gtest/NMTGtests#nmt-detail gtest/NMTGtests#nmt-off gtest/NMTGtests#nmt-summary ------------- Commit messages: - Problemlist some gtests on static-jdk. Changes: https://git.openjdk.org/jdk/pull/25050/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25050&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356209 Stats: 11 lines in 1 file changed: 11 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25050.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25050/head:pull/25050 PR: https://git.openjdk.org/jdk/pull/25050 From sspitsyn at openjdk.org Mon May 5 23:21:20 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 5 May 2025 23:21:20 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v6] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 14:58:43 GMT, Patricio Chilano Mateo wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: patch from Patricio with alternate approach > > src/hotspot/share/prims/jvmtiEnv.cpp line 1124: > >> 1122: oop carrier_thread = java_lang_VirtualThread::carrier_thread(thread_oop); >> 1123: java_thread = carrier_thread == nullptr ? 
nullptr : java_lang_Thread::thread(carrier_thread); >> 1124: } > > Nit: extra spaces at the end. There are a couple of other instances of this in this file shown by jcheck. Fixed now, thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24269#discussion_r2074359226 From lmesnik at openjdk.org Mon May 5 23:37:23 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 5 May 2025 23:37:23 GMT Subject: RFR: 8356089: java/lang/IO/IO.java fails with -XX:+AOTClassLinking Message-ID: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> The failing test is excluded. No plan to fix, so no bugid is used. ------------- Commit messages: - typo - 8356089 Changes: https://git.openjdk.org/jdk/pull/25052/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25052&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356089 Stats: 33 lines in 1 file changed: 33 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25052.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25052/head:pull/25052 PR: https://git.openjdk.org/jdk/pull/25052 From iklam at openjdk.org Mon May 5 23:44:14 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 5 May 2025 23:44:14 GMT Subject: RFR: 8354890: AOT-initialize j.l.i.MethodHandleImpl and inner classes In-Reply-To: <5vOdChaItphSz0dAvDqdniRjHRAAzeUBu2e7rxMkS54=.05079043-e02a-4853-891e-c7d34919af8d@github.com> References: <5vOdChaItphSz0dAvDqdniRjHRAAzeUBu2e7rxMkS54=.05079043-e02a-4853-891e-c7d34919af8d@github.com> Message-ID: On Thu, 1 May 2025 18:52:22 GMT, Andrew Dinn wrote: > @iklam We have seen this problem with Red Hat deployments in jdk24 as well as jdk25-ea. > > I'm saying that mostly for information. However, I do have to ask: If this is fixed for jdk25 is there any question of also fixing it in jdk24? I would be content to receive a no answer -- a similar issue with patch that could be backported from jdk26 -> jdk25 might be something to think about a bit more? 
Backporting this to jdk 24 would require a lot of effort. I think it might be easier to update CDSHeapVerifier to add more rules to filter out the false positives. CDSHeapVerifier is too conservative and reports error for things that are actually OK. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24956#issuecomment-2852698208 From lmesnik at openjdk.org Mon May 5 23:49:33 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 5 May 2025 23:49:33 GMT Subject: RFR: 8356089: java/lang/IO/IO.java fails with -XX:+AOTClassLinking [v2] In-Reply-To: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> References: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> Message-ID: > The failing test is excluded. > No plan to fix, so no bugid is used. Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: fixed name ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25052/files - new: https://git.openjdk.org/jdk/pull/25052/files/52d1ede8..064369bd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25052&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25052&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25052.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25052/head:pull/25052 PR: https://git.openjdk.org/jdk/pull/25052 From iklam at openjdk.org Mon May 5 23:50:31 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 5 May 2025 23:50:31 GMT Subject: RFR: 8354890: AOT-initialize j.l.i.MethodHandleImpl and inner classes [v2] In-Reply-To: References: Message-ID: > This is a general fix for all the "points to a static field that may hold a different value" failures related to `java/lang/invoke/MethodHandleImpl`. 
E.g., [JDK-8354840](https://bugs.openjdk.org/browse/JDK-8354840), [JDK-8353330](https://bugs.openjdk.org/browse/JDK-8353330). > > AOT-cached method handles quite often refer to the static fields in `MethodHandleImpl` or its inner classes. In the production run, if the value of these static field changes, we may have unexpected behavior related to identity of objects in these static fields. `CDSHeapVerifier` makes a very conservative check for such static fields, but sometimes gives false positives (as in the above two JBS issues) > > In this PR, we AOT-initialize `MethodHandleImpl` and its inner classes. This is a more authentic snapshot of the state of `java.lang.invoke` during the assembly phase. We also avoid the need to add and maintain entries in the `cdsHeapVerifier.cpp` table. > > I also added more code in `MethodHandleTest.java` to simulate potential usage patterns of `MethodHandle` by the Java core libraries. Hopefully this will reduce the likelihood for innocent core lib changes breaking the AOT assembly phase. 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Comments from @liach and @ExE-Boss ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24956/files - new: https://git.openjdk.org/jdk/pull/24956/files/0f6a2e0a..a1e3743b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24956&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24956&range=00-01 Stats: 8 lines in 3 files changed: 5 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24956.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24956/head:pull/24956 PR: https://git.openjdk.org/jdk/pull/24956 From liach at openjdk.org Mon May 5 23:53:17 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 5 May 2025 23:53:17 GMT Subject: RFR: 8354890: AOT-initialize j.l.i.MethodHandleImpl and inner classes [v2] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 23:50:31 GMT, Ioi Lam wrote: >> This is a general fix for all the "points to a static field that may hold a different value" failures related to `java/lang/invoke/MethodHandleImpl`. E.g., [JDK-8354840](https://bugs.openjdk.org/browse/JDK-8354840), [JDK-8353330](https://bugs.openjdk.org/browse/JDK-8353330). >> >> AOT-cached method handles quite often refer to the static fields in `MethodHandleImpl` or its inner classes. In the production run, if the value of these static field changes, we may have unexpected behavior related to identity of objects in these static fields. `CDSHeapVerifier` makes a very conservative check for such static fields, but sometimes gives false positives (as in the above two JBS issues) >> >> In this PR, we AOT-initialize `MethodHandleImpl` and its inner classes. This is a more authentic snapshot of the state of `java.lang.invoke` during the assembly phase. We also avoid the need to add and maintain entries in the `cdsHeapVerifier.cpp` table. 
>> >> I also added more code in `MethodHandleTest.java` to simulate potential usage patterns of `MethodHandle` by the Java core libraries. Hopefully this will reduce the likelihood for innocent core lib changes breaking the AOT assembly phase. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Comments from @liach and @ExE-Boss The Java code change and the BSM coverage looks good to me. Requiring another reviewer for hotspot changes. ------------- Marked as reviewed by liach (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24956#pullrequestreview-2816431978 From sspitsyn at openjdk.org Mon May 5 23:55:21 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 5 May 2025 23:55:21 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v6] In-Reply-To: References: Message-ID: <7kiM8LZK2bDJRk34o6fQHGbf1NZwdHsZTz77dHCt8jo=.9dee85fd-2557-4a6a-baf0-e8df4047c9c2@github.com> On Mon, 5 May 2025 15:06:37 GMT, Patricio Chilano Mateo wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: patch from Patricio with alternate approach > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1798: > >> 1796: return JVMTI_ERROR_THREAD_SUSPENDED; >> 1797: } >> 1798: if (!java_thread->java_suspend(single_suspend)) { > > We could use `is_virtual && single_suspend` (same in resume) and change `_handshakee->is_vthread_mounted()` to be an assert in `HandshakeState::set_suspended()`. Thank you for suggestion. Let me check if I understand you right. We can rename the parameter `update_vthread_list` to `register_vthread_suspend_or_resume` and pass `is_virtual && single_suspend` instead of `single_suspend` to `java_suspend()` and `java_resume()`. 
We also want to change the `HandshakeState::set_suspended()` as below: void HandshakeState::set_suspended(bool is_suspend, bool register_vthread_suspend_or_resume) { #if INCLUDE_JVMTI if (register_vthread_suspend_or_resume) { assert(_handshakee->is_vthread_mounted(), "sanity check"); if (is_suspend) { JvmtiVTSuspender::register_vthread_suspend(_handshakee->vthread()); } else { JvmtiVTSuspender::register_vthread_resume(_handshakee->vthread()); } } #endif Atomic::store(&_suspended, is_suspend); } Is this correct? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24269#discussion_r2074407231 From lmesnik at openjdk.org Mon May 5 23:56:31 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 5 May 2025 23:56:31 GMT Subject: RFR: 8356089: java/lang/IO/IO.java fails with -XX:+AOTClassLinking [v3] In-Reply-To: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> References: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> Message-ID: > The failing test is excluded. > No plan to fix, so no bugid is used. 
Leonid Mesnik has updated the pull request incrementally with two additional commits since the last revision: - year fixed - spaces updated ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25052/files - new: https://git.openjdk.org/jdk/pull/25052/files/064369bd..62096b16 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25052&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25052&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25052.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25052/head:pull/25052 PR: https://git.openjdk.org/jdk/pull/25052 From sspitsyn at openjdk.org Tue May 6 00:02:49 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 May 2025 00:02:49 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v7] In-Reply-To: References: Message-ID: > This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. > > Testing: Ran mach5 tiers 1-6. 
Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: fix trailing spaces ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24269/files - new: https://git.openjdk.org/jdk/pull/24269/files/168c1252..b2c413a7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24269.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24269/head:pull/24269 PR: https://git.openjdk.org/jdk/pull/24269 From lmesnik at openjdk.org Tue May 6 00:12:12 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 6 May 2025 00:12:12 GMT Subject: RFR: 8356089: java/lang/IO/IO.java fails with -XX:+AOTClassLinking [v4] In-Reply-To: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> References: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> Message-ID: > The failing test is excluded. > No plan to fix, so no bugid is used. 
Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25052/files - new: https://git.openjdk.org/jdk/pull/25052/files/62096b16..47606397 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25052&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25052&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25052.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25052/head:pull/25052 PR: https://git.openjdk.org/jdk/pull/25052 From ccheung at openjdk.org Tue May 6 00:42:12 2025 From: ccheung at openjdk.org (Calvin Cheung) Date: Tue, 6 May 2025 00:42:12 GMT Subject: RFR: 8356089: java/lang/IO/IO.java fails with -XX:+AOTClassLinking [v4] In-Reply-To: References: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> Message-ID: On Tue, 6 May 2025 00:12:12 GMT, Leonid Mesnik wrote: >> The failing test is excluded. >> No plan to fix, so no bugid is used. > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > fix Hi Leonid, Can you also problem list one more hotspot test? The test failed due to the same reason. 
diff --git a/test/hotspot/jtreg/ProblemList-AotJdk.txt b/test/hotspot/jtreg/ProblemList-AotJdk.txt index 2528f8d377e..047fc6d33f8 100644 --- a/test/hotspot/jtreg/ProblemList-AotJdk.txt +++ b/test/hotspot/jtreg/ProblemList-AotJdk.txt @@ -3,6 +3,7 @@ runtime/NMT/NMTWithCDS.java 0000000 generic-all runtime/symbols/TestSharedArchiveConfigFile.java 0000000 generic-all gc/arguments/TestSerialHeapSizeFlags.java 0000000 generic-all +gc/arguments/TestCompressedClassFlags.java 0000000 generic-all gc/TestAllocateHeapAtMultiple.java 0000000 generic-all gc/TestAllocateHeapAt.java 0000000 generic-all ------------- PR Comment: https://git.openjdk.org/jdk/pull/25052#issuecomment-2852850728 From zgu at openjdk.org Tue May 6 01:18:23 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Tue, 6 May 2025 01:18:23 GMT Subject: RFR: Implement JEP 509: JFR CPU-Time Profiling [v46] In-Reply-To: <489rqeUgKYHDcb7DGNGs78WshL8rMYkSeNWCnl-Y3lQ=.5e998628-6654-4da5-99eb-e552f225ee7d@github.com> References: <6XiZmQFwOTtDHxKtxkJgSQOvqAh5zkCOG_kvYpHeRxE=.1bd831a2-9976-4983-be54-eacebea48f29@github.com> <489rqeUgKYHDcb7DGNGs78WshL8rMYkSeNWCnl-Y3lQ=.5e998628-6654-4da5-99eb-e552f225ee7d@github.com> Message-ID: On Mon, 5 May 2025 13:37:49 GMT, Johannes Bechberger wrote: >> Hmmm, `has_operation()` was inlined. > > It was, but it cannot anymore, because it depends on the JfrThreadLocal now. Okay. 
I guess that you have to move them to new `handshake.inline.hpp` file ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2074494257 From lmesnik at openjdk.org Tue May 6 01:44:29 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 6 May 2025 01:44:29 GMT Subject: RFR: 8356089: java/lang/IO/IO.java fails with -XX:+AOTClassLinking [v5] In-Reply-To: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> References: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> Message-ID: > The failing test is excluded. > No plan to fix, so no bugid is used. Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: added test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25052/files - new: https://git.openjdk.org/jdk/pull/25052/files/47606397..2b4deef8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25052&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25052&range=03-04 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25052.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25052/head:pull/25052 PR: https://git.openjdk.org/jdk/pull/25052 From epavlova at openjdk.org Tue May 6 02:02:21 2025 From: epavlova at openjdk.org (Ekaterina Pavlova) Date: Tue, 6 May 2025 02:02:21 GMT Subject: RFR: 8356089: java/lang/IO/IO.java fails with -XX:+AOTClassLinking [v5] In-Reply-To: References: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> Message-ID: <8ZXxjVM5_Z8enD3PifiITpHdHtbeBo0PEHLboO4DoXk=.2fe46c82-f4f4-48d7-b68b-b7293aedb75c@github.com> On Tue, 6 May 2025 01:44:29 GMT, Leonid Mesnik wrote: >> The failing test is excluded. >> No plan to fix, so no bugid is used. 
> > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > added test Thanks for integrating the changes. ------------- Marked as reviewed by epavlova (Committer). PR Review: https://git.openjdk.org/jdk/pull/25052#pullrequestreview-2816656366 From iklam at openjdk.org Tue May 6 02:02:21 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 6 May 2025 02:02:21 GMT Subject: RFR: 8356089: java/lang/IO/IO.java fails with -XX:+AOTClassLinking [v5] In-Reply-To: References: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> Message-ID: On Tue, 6 May 2025 01:44:29 GMT, Leonid Mesnik wrote: >> The failing test is excluded. >> No plan to fix, so no bugid is used. > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > added test LGTM and can be considered as trivial change. ------------- Marked as reviewed by iklam (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25052#pullrequestreview-2816657202 From lmesnik at openjdk.org Tue May 6 02:02:22 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 6 May 2025 02:02:22 GMT Subject: Integrated: 8356089: java/lang/IO/IO.java fails with -XX:+AOTClassLinking In-Reply-To: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> References: <2GJNl0usfhrC1avlAfFDump-GdWzUWNMRl971HhUKVA=.2b70302f-e134-4934-9be3-335ee97c00fc@github.com> Message-ID: On Mon, 5 May 2025 23:24:21 GMT, Leonid Mesnik wrote: > The failing test is excluded. > No plan to fix, so no bugid is used. This pull request has now been integrated. 
Changeset: 64b58f6a Author: Leonid Mesnik URL: https://git.openjdk.org/jdk/commit/64b58f6a54c1197002527bdb6ba7b48283dc634e Stats: 34 lines in 2 files changed: 34 ins; 0 del; 0 mod 8356089: java/lang/IO/IO.java fails with -XX:+AOTClassLinking Reviewed-by: epavlova, iklam ------------- PR: https://git.openjdk.org/jdk/pull/25052 From vlivanov at openjdk.org Tue May 6 02:48:16 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Tue, 6 May 2025 02:48:16 GMT Subject: RFR: 8355698: JDK not supporting sleef could cause exception at runtime after JDK-8353786 In-Reply-To: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com> References: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com> Message-ID: <1dojSNtcbxGAxR2KWo_bqJficSV_QsTcWU2xlo3kxE8=.8c9902eb-017b-4aba-a1ae-9b4fb88ba18f@github.com> On Mon, 28 Apr 2025 10:34:49 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > > Before [JDK-8353786](https://bugs.openjdk.org/browse/JDK-8353786), when a released jdk not supportting sleef (for any reason, e.g. low gcc version, intrinsic not supported, rvv not supported, and so on) runs on machine support vector operation (e.g. on riscv, it supports rvv), it can not call into sleef, but will not fail either, it falls back to java scalar version implementation. > But after [JDK-8353786](https://bugs.openjdk.org/browse/JDK-8353786), it will cause an exception thrown at runtime. > > This change the behaviour of existing jdk, and it should not throw exception anyway. > > @iwanowww @RealFYang > > Thanks! Thanks, Li. I think I have a better understanding now how it all works across platforms. I agree with Andrew that it's highly undesireable to have divergence across JDK builds, but I don't have a good understanding how mature toolchain support is when it comes to vector intrinsics SLEEF library relies on. 
As of now, producing empty native library is the only way to communicate that a vector math library isn't properly built. So, 2 problems to address: (1) JDK build may produce an empty vector math library (and no build failures); (2) vector math library is built unconditionally (there's no way to configure JDK build to skip it). Speaking of SVML, it requires GCC 4.9+ and MSVC 2017. According to JDK build documentation [1], newer toolchain versions are required for JDK build. So, it should be fine to remove GCC version checks. (JDK-8355656 is reported for a JDK build using Clang. I have no idea whether Clang can build SVML stubs or not.) So, the stop-the-gap fix for JDK-8355698 is to disable SLEEF build when toolchain doesn't have proper support. Alternatively, VectorMathLibrary can probe the library to ensure it is not empty [2]. [1] https://github.com/openjdk/jdk/blob/master/doc/building.md [2] diff --git a/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMathLibrary.java b/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMathLibrary.java index 4729235e2d9..9f0abb9afa1 100644 --- a/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMathLibrary.java +++ b/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMathLibrary.java @@ -189,6 +189,16 @@ public boolean isSupported(Operator op, VectorSpecies vspecies) { private static class SLEEF implements Library { static { VectorSupport.loadNativeLibrary("sleef"); + ensureValidLibrary(); + } + + private static void ensureValidLibrary() { + // Probe an arbitrary entry point to ensure the library is not empty. + // JDK build of SLEEF-based vector math library may produce an empty library when + // proper toolchain support is absent. Until it is fixed, ensure the corresponding + // native library is not empty. + // Throws an exception on lookup failure which triggers a switch to Java-based implementation. 
+ LOOKUP.findOrThrow(LIBRARY.symbolName(VectorOperators.SIN, FloatVector.SPECIES_64)); } private static String suffix(VectorShape vshape, boolean isShapeAgnostic) { ------------- PR Comment: https://git.openjdk.org/jdk/pull/24914#issuecomment-2853122819 From amitkumar at openjdk.org Tue May 6 04:34:24 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Tue, 6 May 2025 04:34:24 GMT Subject: RFR: 8350308: [s390x] Relativize last_sp (and top_frame_sp) in interpreter frames Message-ID: s390 Port for [JDK-8308984](https://bugs.openjdk.org/browse/JDK-8308984). This PR depends on https://github.com/openjdk/jdk/pull/23660. Tier1 tests with fastdebug-vm show no regression. ------------- Commit messages: - Merge branch 'master' into rel_sp - sign extension - 23660 - relativize sp Changes: https://git.openjdk.org/jdk/pull/23690/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23690&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8350308 Stats: 11 lines in 3 files changed: 8 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/23690.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23690/head:pull/23690 PR: https://git.openjdk.org/jdk/pull/23690 From sspitsyn at openjdk.org Tue May 6 04:38:14 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 May 2025 04:38:14 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v6] In-Reply-To: <7kiM8LZK2bDJRk34o6fQHGbf1NZwdHsZTz77dHCt8jo=.9dee85fd-2557-4a6a-baf0-e8df4047c9c2@github.com> References: <7kiM8LZK2bDJRk34o6fQHGbf1NZwdHsZTz77dHCt8jo=.9dee85fd-2557-4a6a-baf0-e8df4047c9c2@github.com> Message-ID: On Mon, 5 May 2025 23:53:02 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1798: >> >>> 1796: return JVMTI_ERROR_THREAD_SUSPENDED; >>> 1797: } >>> 1798: if (!java_thread->java_suspend(single_suspend)) { >> >> We could use `is_virtual && single_suspend` (same in resume) and change `_handshakee->is_vthread_mounted()` to be an assert in 
`HandshakeState::set_suspended()`. > > Thank you for the suggestion. Let me check if I understand you right. > We can rename the parameter `update_vthread_list` to `register_vthread_SR` and pass `is_virtual && single_suspend` instead of `single_suspend` to `java_suspend()` and `java_resume()`. > We also want to change the `HandshakeState::set_suspended()` as below: > > void HandshakeState::set_suspended(bool is_suspend, bool register_vthread_SR) { > #if INCLUDE_JVMTI > if (register_vthread_SR) { > assert(_handshakee->is_vthread_mounted(), "sanity check"); > if (is_suspend) { > JvmtiVTSuspender::register_vthread_suspend(_handshakee->vthread()); > } else { > JvmtiVTSuspender::register_vthread_resume(_handshakee->vthread()); > } > } > #endif > Atomic::store(&_suspended, is_suspend); > } > > > Is this correct? If so, then I think it is a good suggestion. It feels like all the `HandshakeState` SR code can be moved from `handshake.?pp` to `jvmtiEnvBase.?pp` as it seems to be a little bit unnatural for `HandshakeState`. The `JvmtiThreadState_lock` or some other lock can be used for waiting in the suspended state. Then some attempts to simplify this code could be made. But it does not look very important at this point in time. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24269#discussion_r2074721380 From iveresov at openjdk.org Tue May 6 06:31:43 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Tue, 6 May 2025 06:31:43 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v13] In-Reply-To: References: Message-ID: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs.
> > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 43 commits: - Merge branch 'master' into pp2 - Fix compile - Fix additional issues - Make sure command line flags that affect MDO layout are consistent - Fix semantics change from the previous commit - Port 8355915: [leyden] Crash in MDO clearing the unloaded array type - Fix flag behavior - Fix log tags - Remove the proxy class counter - Address review comments part 2 - ... and 33 more: https://git.openjdk.org/jdk/compare/e09d2e27...7d22a42a ------------- Changes: https://git.openjdk.org/jdk/pull/24886/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=12 Stats: 3231 lines in 60 files changed: 3011 ins; 103 del; 117 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From lucy at openjdk.org Tue May 6 07:24:13 2025 From: lucy at openjdk.org (Lutz Schmidt) Date: Tue, 6 May 2025 07:24:13 GMT Subject: RFR: 8350308: [s390x] Relativize last_sp (and top_frame_sp) in interpreter frames In-Reply-To: References: Message-ID: On Wed, 19 Feb 2025 09:29:03 GMT, Amit Kumar wrote: > s390 Port for [JDK-8308984](https://bugs.openjdk.org/browse/JDK-8308984). > > This PR depends on https://github.com/openjdk/jdk/pull/23660. > > > Tier1 tests with fastdebug-vm show no regression. LGTM. ------------- Marked as reviewed by lucy (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23690#pullrequestreview-2817145618

From mgronlun at openjdk.org Tue May 6 07:43:21 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Tue, 6 May 2025 07:43:21 GMT
Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v16]
In-Reply-To:
References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com>
Message-ID:

On Mon, 5 May 2025 20:57:52 GMT, Markus Grönlund wrote:

> > > Isn't this problem already solved by using the `_last_sender_Java_fp`? Why do we need both, the frame pop before the safepoint check and the _last_sender_Java_fp trick?
> > > > > > Because the sampler fetches the CPU context for threads running in state "_thread_in_Java". Without it, you can sample after the safepoint poll test is issued, but before the frame is popped. That sampled frame will represent something now being removed, and if the sender issues another call, also overwritten.
> > > > > > It may help to think about how the JIT methods works - they pop their frames before issuing the method return safepoint poll test.
> > > > Ok. This makes sense. Thanks for the explanation!
> > Just trying to understand: What if we have the frame pop before the safepoint check like in JIT compiled code. Do we still need the `_last_sender_Java_fp` trick in this case (assuming we handle it like in the JIT compiled case and use `StackWatermarkSet::after_unwind` in `InterpreterRuntime::at_unwind`)?

Sorry Martin, I did not read your reply in enough detail. That is surely an interesting idea - I did not want to shake things too much, but now I will attempt to try it.

Could solve the trick quite naturally, as you say.
-------------

PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2853568317

From shade at openjdk.org Tue May 6 08:15:16 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 6 May 2025 08:15:16 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v4]
In-Reply-To:
References:
Message-ID:

On Mon, 5 May 2025 20:25:27 GMT, Roman Kennke wrote:

>> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
>>
>> Testing:
>>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Simplify pre-barrier runtime entry

All right, this works, thanks!

-------------

Marked as reviewed by shade (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25001#pullrequestreview-2817308091

From amitkumar at openjdk.org Tue May 6 08:24:12 2025
From: amitkumar at openjdk.org (Amit Kumar)
Date: Tue, 6 May 2025 08:24:12 GMT
Subject: RFR: 8350308: [s390x] Relativize last_sp (and top_frame_sp) in interpreter frames
In-Reply-To:
References:
Message-ID:

On Wed, 19 Feb 2025 09:29:03 GMT, Amit Kumar wrote:

> s390 Port for [JDK-8308984](https://bugs.openjdk.org/browse/JDK-8308984).
>
> This PR depends on https://github.com/openjdk/jdk/pull/23660.
>
>
> Tier1 tests with fastdebug-vm show no regression.

Hi @reinrich, would it be possible for you to review this short change?
-------------

PR Comment: https://git.openjdk.org/jdk/pull/23690#issuecomment-2853677106

From sspitsyn at openjdk.org Tue May 6 08:36:24 2025
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Tue, 6 May 2025 08:36:24 GMT
Subject: RFR: 8356251: Need minor cleanup for interp_only_mode
Message-ID:

This is a minor cleanup for the JVMTI `interp_only_mode` implementation which includes the following changes:
 - The `interp_only_mode` in `JavaThread` is represented with a counter which is incremented and decremented. This is confusing because this value should only take values `0` and `1`. Asserts are placed to make sure it never goes out of bounds. The `interp_only_mode` in a `JavaThread` is checked by the interpreter chunks which expect it to be an `integer`. This cleanup has no intention to make it a boolean.
 - The function `JvmtiThreadState::process_pending_interp_only()` does a sync on the `JvmtiThreadState_lock` which is not really needed and is being removed. It is called in a `VTMS` transition and so cannot clash with the `SetEventNotificationMode` because it sets a `JvmtiVTMSTransitionDisabler`.
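
The first bullet can be illustrated with a minimal standalone sketch. It is not the HotSpot code: the class name `InterpOnlyModel` and the methods `enter_interp_only_mode()` / `leave_interp_only_mode()` are hypothetical names chosen for illustration, and it assumes only what the description states, namely that the field stays an integer and that asserts keep it within `0` and `1`.

```cpp
#include <cassert>

// Hypothetical model of a thread's interp_only_mode field; not HotSpot code.
class InterpOnlyModel {
  // Kept as an int (not a bool) because, per the description above,
  // interpreter code reads the field as an integer.
  int _interp_only_mode = 0;

 public:
  int interp_only_mode() const { return _interp_only_mode; }
  bool is_interp_only_mode() const { return _interp_only_mode != 0; }

  // The value may only go 0 -> 1 here; anything else trips the assert.
  void enter_interp_only_mode() {
    assert(_interp_only_mode == 0 && "interp_only_mode must be 0 before entering");
    ++_interp_only_mode;
  }

  // The value may only go 1 -> 0 here; anything else trips the assert.
  void leave_interp_only_mode() {
    assert(_interp_only_mode == 1 && "interp_only_mode must be 1 before leaving");
    --_interp_only_mode;
  }
};
```

With this shape a double enter or a leave without a matching enter fails fast in debug builds instead of silently leaving the counter at `2` or `-1`.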
Testing:
 - TBD: Mach5 tiers 1-6

-------------

Commit messages:
 - 8356251: Need minor cleanup for interp_only_mode

Changes: https://git.openjdk.org/jdk/pull/25060/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25060&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8356251
  Stats: 11 lines in 3 files changed: 1 ins; 4 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/25060.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25060/head:pull/25060

PR: https://git.openjdk.org/jdk/pull/25060

From sspitsyn at openjdk.org Tue May 6 08:42:29 2025
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Tue, 6 May 2025 08:42:29 GMT
Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v8]
In-Reply-To:
References:
Message-ID:

> This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment.
>
> Testing: Ran mach5 tiers 1-6.

Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision:

  review: some minor refactoring for parameter update_vthread_list

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24269/files
  - new: https://git.openjdk.org/jdk/pull/24269/files/b2c413a7..0a29d0b5

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=07
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=06-07

  Stats: 27 lines in 5 files changed: 1 ins; 0 del; 26 mod
  Patch: https://git.openjdk.org/jdk/pull/24269.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24269/head:pull/24269

PR: https://git.openjdk.org/jdk/pull/24269

From sspitsyn at openjdk.org Tue May 6 08:42:29 2025
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Tue, 6 May 2025 08:42:29 GMT
Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v6]
In-Reply-To:
References:
 <7kiM8LZK2bDJRk34o6fQHGbf1NZwdHsZTz77dHCt8jo=.9dee85fd-2557-4a6a-baf0-e8df4047c9c2@github.com>
Message-ID: <5cS67w1ai1kTFNpr9XWnFewGFhsLSSBxhOMFdrAUT1Y=.78ff4f27-e34a-4485-8f27-9b599c88f764@github.com>

On Tue, 6 May 2025 04:35:27 GMT, Serguei Spitsyn wrote:

>> Thank you for suggestion. Let me check if I understand you right.
>> We can rename the parameter `update_vthread_list` to `register_vthread_SR` and pass `is_virtual && single_suspend` instead of `single_suspend` to `java_suspend()` and `java_resume()`.
>> We also want to change the `HandshakeState::set_suspended()` as below:
>>
>>     void HandshakeState::set_suspended(bool is_suspend, bool register_vthread_SR) {
>>     #if INCLUDE_JVMTI
>>       if (register_vthread_SR) {
>>         assert(_handshakee->is_vthread_mounted(), "sanity check");
>>         if (is_suspend) {
>>           JvmtiVTSuspender::register_vthread_suspend(_handshakee->vthread());
>>         } else {
>>           JvmtiVTSuspender::register_vthread_resume(_handshakee->vthread());
>>         }
>>       }
>>     #endif
>>       Atomic::store(&_suspended, is_suspend);
>>     }
>>
>>
>> Is this correct? If so, then I think it is a good suggestion.
>
> It feels like all the `HandshakeState` SR code can be moved from `handshake.?pp` to `jvmtiEnvBase.?pp` as it seems to be a little bit unnatural for `HandshakeState`. The `JvmtiThreadState_lock` or some other lock can be used for waiting in the suspended state. Then some attempts to simplify this code could be made. But it does not look as very important at this point in time.

I've pushed the fix suggested above. Please, let me know if it looks right or not.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24269#discussion_r2074995937 From shade at openjdk.org Tue May 6 09:20:29 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 6 May 2025 09:20:29 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v5] In-Reply-To: References: Message-ID: > In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: > 1. Time spent before queuing: shows the compilation queue bottlenecks > 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load > 3. Time spent actually compiling: shows the per-method compilation costs > > We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). > > The difference from the output format we ship in Leyden: > 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. > 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. > > See the sample `-XX:+PrintCompilation` output in the comments. 
> > Additional testing: > - [x] Linux x86_64 server fastdebug, `compiler` > - [x] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: - Only record non-empty inlining messages - Merge W+Q => Q ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24984/files - new: https://git.openjdk.org/jdk/pull/24984/files/750ac377..18da4d7c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=03-04 Stats: 23 lines in 4 files changed: 2 ins; 12 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/24984.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24984/head:pull/24984 PR: https://git.openjdk.org/jdk/pull/24984 From shade at openjdk.org Tue May 6 09:31:31 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 6 May 2025 09:31:31 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v6] In-Reply-To: References: Message-ID: > In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: > 1. Time spent before queuing: shows the compilation queue bottlenecks > 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load > 3. Time spent actually compiling: shows the per-method compilation costs > > We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). > > The difference from the output format we ship in Leyden: > 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. > 2. 
The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. > > See the sample `-XX:+PrintCompilation` output in the comments. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `compiler` > - [x] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Also handle UL printing ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24984/files - new: https://git.openjdk.org/jdk/pull/24984/files/18da4d7c..90462389 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=04-05 Stats: 26 lines in 3 files changed: 15 ins; 4 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24984.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24984/head:pull/24984 PR: https://git.openjdk.org/jdk/pull/24984 From shade at openjdk.org Tue May 6 09:31:32 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 6 May 2025 09:31:32 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v4] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 21:13:08 GMT, Vladimir Kozlov wrote: > "W=time spent in waiting to be put on compilation queue" > > I am not sure about this since it is mostly 0s in normal execution. I understand that you already have values and easy to calculate but output become complex and you may forgot what the columns mean. Yeah, `W` is not always `0`, but most of the time it is. I agree it makes the logging noisier. We can merge `W` and `Q` into `Q`, i.e. saying that the time spent waiting for queue insert is the time spend in queuing. See new commit. > I thought about using `-XX:+Verbose` to gate it but it is debug flag. 
May be we can print it only for installing AOT code?

I think the queuing delays affect normal JIT compilations as well, so it would be nice to surface them for all compilations. I would also like to make this info universally available without any other magic incantations. So I would prefer not to gate this by `Verbose` flag.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2853874789

From mgronlun at openjdk.org Tue May 6 09:44:16 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Tue, 6 May 2025 09:44:16 GMT
Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v16]
In-Reply-To:
References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com>
Message-ID:

On Tue, 6 May 2025 07:40:25 GMT, Markus Grönlund wrote:

> > > > Isn't this problem already solved by using the `_last_sender_Java_fp`? Why do we need both, the frame pop before the safepoint check and the _last_sender_Java_fp trick?
> > > > > > > > > > > > > > Because the sampler fetches the CPU context for threads running in state "_thread_in_Java". Without it, you can sample after the safepoint poll test is issued, but before the frame is popped. That sampled frame will represent something now being removed, and if the sender issues another call, also overwritten.
> > > > > > > > > > > > > > It may help to think about how the JIT methods works - they pop their frames before issuing the method return safepoint poll test.
> > > > > > Ok. This makes sense. Thanks for the explanation!
> > Just trying to understand: What if we have the frame pop before the safepoint check like in JIT compiled code. Do we still need the `_last_sender_Java_fp` trick in this case (assuming we handle it like in the JIT compiled case and use `StackWatermarkSet::after_unwind` in `InterpreterRuntime::at_unwind`)?
> > Sorry Martin, I did not read your reply detailed enough.
That is surely an interesting idea - I did not want to shake things too much, but now I will attempt to try it.
> > Could solve the trick quite naturally, as you say.

I remember now why I designed it this way. The reason is the other part of the solution, the hook to process enqueued sample requests.

As you can see, in Interpreter::unwind(), there is a check for processing sample requests, like:

    JFR_ONLY(Jfr::check_and_process_sample_request(current);)

The invariant here is that the frame about to be popped could have been sampled; therefore, an ljf at this point must "cover" it for stackwalking to locate it (it must be above or equal). If we pop before testing the safepoint poll, that frame is gone (now below the saved ljf).

Comparing again with a JIT frame, to be more exact, it is not true that the compiled frame is popped before the method return poll check. Specifically, only the explicit frame size is popped before issuing the return poll test. That is, the frame's return address is still on the stack. The reason this works is that the JIT frame unwind tests the sp. Therefore, it is possible to capture the top frame even though it has been (partially) popped. We reconstruct it in the safepoint handler using the java_thread->saved_exception_pc().
------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2853907459 From jbhateja at openjdk.org Tue May 6 09:55:21 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 6 May 2025 09:55:21 GMT Subject: RFR: 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 12:28:55 GMT, Jatin Bhateja wrote: > This is a follow-up PR that fixes the crashes seen after the integration of PR #24664 > > ZGC bookkeeps multiple place holders in barrier code snippets through relocations, these are later used to patch appropriate contents (mostly immediate values) in instruction encoding to save costly comparisons against global state [1]. While most of the relocation records the patching offsets from the end of the instruction, SHL/R instructions used for pointer coloring/uncoloring, compute the patching offset from the starting address of the instruction. This was done to prevent accidental sharing of relocation information with subsequent relocatable instructions, e.g., static call. [2] > > In case the destination register operand of SHL/R instruction is an extended GPR register, we miss accounting additional REX2 prefix byte in the patch offset, thereby corrupting the encoding since runtime patches the primary opcode byte, resulting in an ILLEGAL instruction exception. > > This patch fixes reported failures by computing the relocation offset of the SHL/R instruction from the end of the instruction, thereby making the patch offset agnostic to the REX/REX2 prefix. To be safe, we emit a NOP instruction between the SHL/R and the subsequent relocatable instruction. > > Please review and share your feedback. 
>
> Best Regards,
> Jatin
>
> [1] https://openjdk.org/jeps/439#:~:text=we%20reduce%20this,changes%20phase%3B
> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1873
>
>
> PS: Validations were performed using the latest Intel Software Development Emulator after modifying the static register allocation order in x86_64.ad file giving preference to EGPRs.
> Member

Hi @xmas92, Your suggestion looks good to me for this bugfix. I think we can improve upon the existing implementation as part of JDK-8355341 since it's a bigger change and also include Graal buy-in.

There is still a possibility of incorrect relocation sharing with subsequent relocatable instructions in other cases, e.g. OR instruction for which we bookkeep the relocation address from the end of the instruction, and it's the last instruction in the pointer coloring primitive. For this bug fix, your suggestion looks fine to me.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24919#issuecomment-2853945841

From jbhateja at openjdk.org Tue May 6 10:21:54 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Tue, 6 May 2025 10:21:54 GMT
Subject: RFR: 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding [v2]
In-Reply-To:
References:
Message-ID:

> This is a follow-up PR that fixes the crashes seen after the integration of PR #24664
>
> ZGC bookkeeps multiple place holders in barrier code snippets through relocations, these are later used to patch appropriate contents (mostly immediate values) in instruction encoding to save costly comparisons against global state [1]. While most of the relocation records the patching offsets from the end of the instruction, SHL/R instructions used for pointer coloring/uncoloring, compute the patching offset from the starting address of the instruction. This was done to prevent accidental sharing of relocation information with subsequent relocatable instructions, e.g., static call.
[2] > > In case the destination register operand of SHL/R instruction is an extended GPR register, we miss accounting additional REX2 prefix byte in the patch offset, thereby corrupting the encoding since runtime patches the primary opcode byte, resulting in an ILLEGAL instruction exception. > > This patch fixes reported failures by computing the relocation offset of the SHL/R instruction from the end of the instruction, thereby making the patch offset agnostic to the REX/REX2 prefix. To be safe, we emit a NOP instruction between the SHL/R and the subsequent relocatable instruction. > > Please review and share your feedback. > > Best Regards, > Jatin > > [1] https://openjdk.org/jeps/439#:~:text=we%20reduce%20this,changes%20phase%3B > [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1873 > > > PS: Validations were performed using the latest Intel Software Development Emulator after modifying the static register allocation order in x86_64.ad file giving preference to EGPRs. Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision:

  8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24919/files
  - new: https://git.openjdk.org/jdk/pull/24919/files/1f9c84c8..fc3b61e7

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24919&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24919&range=00-01

  Stats: 25 lines in 4 files changed: 11 ins; 7 del; 7 mod
  Patch: https://git.openjdk.org/jdk/pull/24919.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24919/head:pull/24919

PR: https://git.openjdk.org/jdk/pull/24919

From rrich at openjdk.org Tue May 6 11:07:17 2025
From: rrich at openjdk.org (Richard Reingruber)
Date: Tue, 6 May 2025 11:07:17 GMT
Subject: RFR: 8350308: [s390x] Relativize last_sp (and top_frame_sp) in interpreter frames
In-Reply-To:
References:
Message-ID:

On Wed, 19 Feb 2025 09:29:03 GMT, Amit Kumar wrote:

> s390 Port for [JDK-8308984](https://bugs.openjdk.org/browse/JDK-8308984).
>
> This PR depends on https://github.com/openjdk/jdk/pull/23660.
>
>
> Tier1 tests with fastdebug-vm show no regression.

Looks good.
Richard.

-------------

Marked as reviewed by rrich (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/23690#pullrequestreview-2817854548

From rkennke at openjdk.org Tue May 6 11:11:19 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 6 May 2025 11:11:19 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v4]
In-Reply-To:
References:
Message-ID:

On Mon, 5 May 2025 20:25:27 GMT, Roman Kennke wrote:

>> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
>>
>> Testing:
>>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Simplify pre-barrier runtime entry

Thanks!
-------------

PR Comment: https://git.openjdk.org/jdk/pull/25001#issuecomment-2854170217

From rkennke at openjdk.org Tue May 6 11:11:19 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 6 May 2025 11:11:19 GMT
Subject: Integrated: 8356075: Support Shenandoah GC in JVMCI
In-Reply-To:
References:
Message-ID:

On Fri, 2 May 2025 10:35:03 GMT, Roman Kennke wrote:

> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
>
> Testing:
>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904

This pull request has now been integrated.

Changeset: 614ba9fc
Author: Roman Kennke
URL: https://git.openjdk.org/jdk/commit/614ba9fc41a0274a31f0e8eff8a598a7c5afe164
Stats: 62 lines in 7 files changed: 61 ins; 0 del; 1 mod

8356075: Support Shenandoah GC in JVMCI

Reviewed-by: shade, dnsimon, cslucas

-------------

PR: https://git.openjdk.org/jdk/pull/25001

From coleenp at openjdk.org Tue May 6 11:44:16 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Tue, 6 May 2025 11:44:16 GMT
Subject: RFR: 8347004: vmTestbase/metaspace/shrink_grow/ShrinkGrowTest/ShrinkGrowTest.java fails with CDS disabled
In-Reply-To:
References:
Message-ID:

On Mon, 5 May 2025 17:51:13 GMT, Leonid Mesnik wrote:

> Test fails with OOME if CDS is disabled. It is not a regression, it just rarely executed in this mode.
> The fix is just to slightly increase Metaspace.
> Verified that test now pass with CDS disabled + Xcomp. (It fails with Xcomp only)

So this still passes with CDS on though? I shrunk this one with ad104932e6c26806c353ad048ce5cff7d2b4c29a to 10. There's another test that I increased MaxMetaspaceSize for valhalla but I can't find it right now.

-------------

Marked as reviewed by coleenp (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25046#pullrequestreview-2817944178

From dnsimon at openjdk.org Tue May 6 11:56:23 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Tue, 6 May 2025 11:56:23 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v4]
In-Reply-To:
References:
Message-ID:

On Mon, 5 May 2025 20:25:27 GMT, Roman Kennke wrote:

>> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
>>
>> Testing:
>>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Simplify pre-barrier runtime entry

src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp line 239:

> 237:     cardtable_start_address = base;
> 238:     cardtable_shift = CardTable::card_shift();
> 239:   } else if (bs->is_a(BarrierSet::ShenandoahBarrierSet)) {

This change is causing a failure in mach5 tier 1:

    [2025-05-06T11:34:44,742Z] /workspace/open/src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp:239:35: error: no member named 'ShenandoahBarrierSet' in 'BarrierSet'
    [2025-05-06T11:34:44,742Z]   } else if (bs->is_a(BarrierSet::ShenandoahBarrierSet)) {
    [2025-05-06T11:34:44,742Z]                       ~~~~~~~~~~~~^
    [2025-05-06T11:34:45,729Z] 1 error generated.

I assume it's missing `#if INCLUDE_SHENANDOAHGC`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2075304100

From stuefe at openjdk.org Tue May 6 12:27:23 2025
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Tue, 6 May 2025 12:27:23 GMT
Subject: RFR: 8321266: Add diagnostic RSS threshold [v3]
In-Reply-To:
References:
Message-ID: <5jsZCyw6wJRG5qhYcV9VqgFPFyEdoi8SY6k75YYPbSM=.1c307466-964b-4f5c-a1ab-a57cbafbbabc@github.com>

On Wed, 6 Dec 2023 08:13:55 GMT, Thomas Stuefe wrote:

>> We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes
This PR proposes >> a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. >> >> --- >> >> Motivation: >> >> The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. >> >> One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. >> >> Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. >> >> Letting the JVM handle this Limit has many advantages: >> >> - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. >> >> - Re-using the normal error reporting mechanism is powerful since: >> - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. >> - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. >> - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. >> >> ---- >> >> Usage: >> >> Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. 
>> `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. >> >> If given as percent, JVM will also react to container limit updates. >> >> Example: we run the JVM inside a container as the sole payload process. Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: >> >> `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` >> >> ---- >> >> Patch: >> >> Im... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Add specific percentage switch Not now, bot ------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-2854383878 From kvn at openjdk.org Tue May 6 14:24:16 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 6 May 2025 14:24:16 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v6] In-Reply-To: References: Message-ID: On Tue, 6 May 2025 09:31:31 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. The output is raw number in microseconds. 
In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Also handle UL printing This looks good. I need to re-test it with your latest changes. ------------- PR Review: https://git.openjdk.org/jdk/pull/24984#pullrequestreview-2816099434 From kvn at openjdk.org Tue May 6 14:24:18 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 6 May 2025 14:24:18 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v4] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 16:37:00 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. 
The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Revert the shared printing block src/hotspot/share/compiler/compileTask.cpp line 235: > 233: } else { > 234: st->print("%7s ", ""); > 235: } Can this be a function since you repeat the code 3 times ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24984#discussion_r2074196818 From kvn at openjdk.org Tue May 6 14:24:18 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 6 May 2025 14:24:18 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v4] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 21:01:10 GMT, Vladimir Kozlov wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert the shared printing block > > src/hotspot/share/compiler/compileTask.cpp line 235: > >> 233: } else { >> 234: st->print("%7s ", ""); >> 235: } > > Can this be a function since you repeat the code 3 times No need new function since it is only 2 times now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24984#discussion_r2075593405 From shade at openjdk.org Tue May 6 14:50:14 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 6 May 2025 14:50:14 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v6] In-Reply-To: References: Message-ID: On Tue, 6 May 2025 09:31:31 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. 
>> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Also handle UL printing Linux AArch64 server fastdebug, `make test TEST=all` is green for me here. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2854852406 From lmesnik at openjdk.org Tue May 6 15:20:14 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 6 May 2025 15:20:14 GMT Subject: RFR: 8347004: vmTestbase/metaspace/shrink_grow/ShrinkGrowTest/ShrinkGrowTest.java fails with CDS disabled In-Reply-To: References: Message-ID: On Mon, 5 May 2025 17:51:13 GMT, Leonid Mesnik wrote: > Test fails with OOME if CDS is disabled. It is not a regression, it is just rarely executed in this mode. > The fix is just to slightly increase Metaspace. > Verified that the test now passes with CDS disabled + Xcomp. (It fails with Xcomp only) Yes, it still passes; it takes 3 seconds instead of 2 on my laptop. But the non-CDS configuration is worth testing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25046#issuecomment-2854971703 From pchilanomate at openjdk.org Tue May 6 16:29:17 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 6 May 2025 16:29:17 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v8] In-Reply-To: References: Message-ID: <1yOSNB0lT2rNZkbYEnLvH7LwM_DeWSE78M90txGAzck=.6c8a700f-aef1-4d0c-9cee-f8d02ffc1f7c@github.com> On Tue, 6 May 2025 08:42:29 GMT, Serguei Spitsyn wrote: >> This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. >> >> Testing: Ran mach5 tiers 1-6.
> > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: some minor refactoring for parameter update_vthread_list I only see some extra trailing spaces in file `src/hotspot/share/prims/jvmtiEnvBase.cpp` but otherwise changes look good! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24269#issuecomment-2855207210 From pchilanomate at openjdk.org Tue May 6 16:29:18 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 6 May 2025 16:29:18 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v6] In-Reply-To: <5cS67w1ai1kTFNpr9XWnFewGFhsLSSBxhOMFdrAUT1Y=.78ff4f27-e34a-4485-8f27-9b599c88f764@github.com> References: <7kiM8LZK2bDJRk34o6fQHGbf1NZwdHsZTz77dHCt8jo=.9dee85fd-2557-4a6a-baf0-e8df4047c9c2@github.com> <5cS67w1ai1kTFNpr9XWnFewGFhsLSSBxhOMFdrAUT1Y=.78ff4f27-e34a-4485-8f27-9b599c88f764@github.com> Message-ID: On Tue, 6 May 2025 08:36:44 GMT, Serguei Spitsyn wrote: >> It feels like all the `HandshakeState` SR code can be moved from `handshake.?pp` to` jvmtiEnvBase.?pp` as it seems to be a little bit unnatural for `HandshakeState`. The `JvmtiThreadState_lock` or some other lock can be used for waiting in the suspended state. Then some attempts to simplify this code could be made. But it does not look as very important at this point in time. > > I've pushed the fix suggested above. Please, let me know if it looks right or not. Great. Yes, that was the suggestion. I didn't think about renaming the parameter but I like the new name. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24269#discussion_r2075844072 From sspitsyn at openjdk.org Tue May 6 17:04:58 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 May 2025 17:04:58 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v9] In-Reply-To: References: Message-ID: > This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. > > Testing: Ran mach5 tiers 1-6. Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: reivew: fix more trailing spaces ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24269/files - new: https://git.openjdk.org/jdk/pull/24269/files/0a29d0b5..4023a19c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=07-08 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/24269.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24269/head:pull/24269 PR: https://git.openjdk.org/jdk/pull/24269 From sspitsyn at openjdk.org Tue May 6 17:04:58 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 May 2025 17:04:58 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v8] In-Reply-To: <1yOSNB0lT2rNZkbYEnLvH7LwM_DeWSE78M90txGAzck=.6c8a700f-aef1-4d0c-9cee-f8d02ffc1f7c@github.com> References: <1yOSNB0lT2rNZkbYEnLvH7LwM_DeWSE78M90txGAzck=.6c8a700f-aef1-4d0c-9cee-f8d02ffc1f7c@github.com> Message-ID: <4nriveMHOhyJpwbJh8iSmGwNJSxsEynlcTDZE9_bsxw=.6dead211-af73-4db3-bbcd-0d41bbf54e70@github.com> On Tue, 6 May 2025 16:26:19 GMT, Patricio Chilano Mateo wrote: > I only see some extra trailing spaces in file src/hotspot/share/prims/jvmtiEnvBase.cpp but otherwise changes look good! 
Thanks, I've fixed trailing spaces now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24269#issuecomment-2855303446 From pchilanomate at openjdk.org Tue May 6 17:40:17 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 6 May 2025 17:40:17 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v9] In-Reply-To: References: Message-ID: On Tue, 6 May 2025 17:04:58 GMT, Serguei Spitsyn wrote: >> This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. >> >> Testing: Ran mach5 tiers 1-6. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > reivew: fix more trailing spaces Thanks Serguei! ------------- Marked as reviewed by pchilanomate (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24269#pullrequestreview-2819126419 From ihse at openjdk.org Tue May 6 17:50:14 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 6 May 2025 17:50:14 GMT Subject: RFR: 8356209: Problemlist failed gtests on static-jdk In-Reply-To: References: Message-ID: On Mon, 5 May 2025 20:53:15 GMT, Jiangli Zhou wrote: > Please review this PR that problemlist's following gtests on static-jdk. These test binaries dynamically link with JDK/VM native libraries and fail on static-jdk as the runtime cannot find the required shared libraries. > > gtest/GTestWrapper > gtest/LargePageGtests#use-large-pages > gtest/LargePageGtests#use-large-pages-1G > gtest/LockStackGtests > gtest/MetaspaceGtests#no-ccs > gtest/NMTGtests#nmt-detail > gtest/NMTGtests#nmt-off > gtest/NMTGtests#nmt-summary This PR failed on GHA. Is that unrelated? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25050#issuecomment-2855428600 From lmesnik at openjdk.org Tue May 6 17:50:26 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 6 May 2025 17:50:26 GMT Subject: RFR: 8344270: Update tier1_common and hotspot_misc groups to better organize hotspot non-component tests Message-ID: Can you please review the following PR that improves the test groups. The bug was originally filed to eliminate duplication between the tier1_common and hotspot_misc test groups. However, while looking at the test content of these groups, I realized that there are some other issues. 1) The hotspot_resourcehogs group should always be executed separately from other tests so that it doesn't cause intermittent failures. 2) It makes sense to run all gtest tests in tier1 but not in any other tiers (with any VM flags). 3) testlibrary_tests and sources should be separate groups that don't need to be executed with any VM flags, or even with all builds. So tier1_common includes non-component tests that should be executed in tier1: **all** sanity tests **all** gtest tests (previously not all of them were included) testlibrary_tests (might be os/cpu specific, so they need to run with all builds) source code checking tests (no need to run them on all builds, but they take only a few seconds) And it doesn't make any sense to execute tier1_common with any external VM flags. hotspot_misc now includes only 2 sanity tests. That doesn't look useful, but the main purpose of this group is to catch all tests that are somehow missing from other groups, so let's keep it. The new test groups were added mostly to carry comments explaining their specifics.
------------- Commit messages: - fixed comment - changed groups Changes: https://git.openjdk.org/jdk/pull/25070/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25070&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8344270 Stats: 31 lines in 1 file changed: 21 ins; 4 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/25070.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25070/head:pull/25070 PR: https://git.openjdk.org/jdk/pull/25070 From lmesnik at openjdk.org Tue May 6 17:50:26 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 6 May 2025 17:50:26 GMT Subject: RFR: 8344270: Update tier1_common and hotspot_misc groups to better organize hotspot non-component tests In-Reply-To: References: Message-ID: On Tue, 6 May 2025 17:43:51 GMT, Leonid Mesnik wrote: > Can you please review the following PR that improves the test groups. > The bug was originally filed to eliminate duplication between the tier1_common and hotspot_misc test groups. However, while looking at the test content of these groups, I realized that there are some other issues. > 1) The hotspot_resourcehogs group should always be executed separately from other tests so that it doesn't cause intermittent failures. > 2) It makes sense to run all gtest tests in tier1 but not in any other tiers (with any VM flags). > 3) testlibrary_tests and sources should be separate groups that don't need to be executed with any VM flags, or even with all builds. > > So tier1_common includes non-component tests that should be executed in tier1: > **all** sanity tests > **all** gtest tests (previously not all of them were included) > testlibrary_tests (might be os/cpu specific, so they need to run with all builds) > source code checking tests (no need to run them on all builds, but they take only a few seconds) > > And it doesn't make any sense to execute tier1_common with any external VM flags. > > hotspot_misc now includes only 2 sanity tests.
That doesn't look useful, but the main purpose of this group is to catch all tests that are somehow missing from other groups, so let's keep it. > > The new test groups were added mostly to carry comments explaining their specifics. The tier1 test results should be part of this PR, but the run is still in progress. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25070#issuecomment-2855421723 From jiangli at openjdk.org Tue May 6 17:56:14 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 6 May 2025 17:56:14 GMT Subject: RFR: 8356209: Problemlist failed gtests on static-jdk In-Reply-To: References: Message-ID: On Tue, 6 May 2025 17:47:44 GMT, Magnus Ihse Bursie wrote: > This PR failed on GHA. Is that unrelated? Error: Missing download info for actions/checkout@v4 Looks unrelated. I just started rerunning the failed job. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25050#issuecomment-2855442625 From sspitsyn at openjdk.org Tue May 6 18:12:14 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 May 2025 18:12:14 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v9] In-Reply-To: References: Message-ID: <7wrFLK505ojLkD9Qaeo52Zf_XTa28riurIR2bKFnYwM=.7ca1c72c-1bd7-4850-bad2-eb40993e0ea3@github.com> On Tue, 6 May 2025 17:04:58 GMT, Serguei Spitsyn wrote: >> This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. >> >> Testing: Ran mach5 tiers 1-6. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > reivew: fix more trailing spaces Thanks a lot for the review and suggestions, Patricio!
------------- PR Comment: https://git.openjdk.org/jdk/pull/24269#issuecomment-2855488652 From lmesnik at openjdk.org Tue May 6 18:32:14 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 6 May 2025 18:32:14 GMT Subject: RFR: 8356251: Need minor cleanup for interp_only_mode In-Reply-To: References: Message-ID: <8-jSItxSwhqU969dSkiClU-YDK-nUULgh1bcnS9HMdg=.6e6912e4-45fb-48ea-b62f-8d0286088283@github.com> On Tue, 6 May 2025 08:29:36 GMT, Serguei Spitsyn wrote: > This is a minor cleanup for the JVMTI `interp_only_mode` implementation which includes the following changes: > - The `interp_only_mode` in `JavaThread` is represented with a counter which is incremented and decremented. This is confusing because this value should only take values `0` and `1`. Asserts are placed to make sure it is never going out of bounds. The `interp_only_mode` in a `JavaThread` is checked by the interpreter chunks which expect it to be an `integer`. This cleanup has no intention to make it a boolean. > - The function `JvmtiThreadState::process_pending_interp_only()` does a sync on the `JvmtiThreadState_lock` which is not really needed and is being removed. It is called in a `VTMS` transition and so, can not clash with the `SetEventNotificationMode` because it sets a `JvmtiVTMSTransitionDisabler`. > > Testing: > - TBD: Mach5 tiers 1-6 Changes requested by lmesnik (Reviewer). src/hotspot/share/runtime/javaThread.hpp line 1177: > 1175: bool is_interp_only_mode() { return (_interp_only_mode != 0); } > 1176: int get_interp_only_mode() { return _interp_only_mode; } > 1177: int set_interp_only_mode(int val) { return _interp_only_mode = val; } The `get_interp_only_mode()`/`set_interp_only_mode(int val)` accessors might also be eliminated and replaced by set/clear operations instead.
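A minimal sketch of what the suggested set/clear pair could look like, assuming the field keeps its `int` type as the PR description requires. `InterpOnlyState`, the argument-less `set_interp_only_mode()` and `clear_interp_only_mode()` are hypothetical names used only for illustration; this is not the actual `JavaThread` code, and the standard `assert` stands in for HotSpot's own assert macro:

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant part of JavaThread (illustration only).
class InterpOnlyState {
  // Kept as an int because, per the PR description, the interpreter
  // expects an integer; only the values 0 and 1 are ever stored.
  int _interp_only_mode = 0;

 public:
  bool is_interp_only_mode() const { return _interp_only_mode != 0; }

  // Replaces set_interp_only_mode(int val): callers can only switch the mode on ...
  void set_interp_only_mode() {
    assert(_interp_only_mode == 0 && "interp_only_mode is already set");
    _interp_only_mode = 1;
  }

  // ... or off, so the value cannot leave the {0, 1} range.
  void clear_interp_only_mode() {
    assert(_interp_only_mode == 1 && "interp_only_mode is not set");
    _interp_only_mode = 0;
  }
};
```

With this shape, callers can no longer store an arbitrary integer, and in builds with asserts enabled a double set or double clear is caught at the call site.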
------------- PR Review: https://git.openjdk.org/jdk/pull/25060#pullrequestreview-2819254877 PR Review Comment: https://git.openjdk.org/jdk/pull/25060#discussion_r2076039987 From jiangli at openjdk.org Tue May 6 18:37:13 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 6 May 2025 18:37:13 GMT Subject: RFR: 8356209: Problemlist failed gtests on static-jdk In-Reply-To: References: Message-ID: On Mon, 5 May 2025 20:53:15 GMT, Jiangli Zhou wrote: > Please review this PR that problemlist's following gtests on static-jdk. These test binaries dynamically link with JDK/VM native libraries and fail on static-jdk as the runtime cannot find the required shared libraries. > > gtest/GTestWrapper > gtest/LargePageGtests#use-large-pages > gtest/LargePageGtests#use-large-pages-1G > gtest/LockStackGtests > gtest/MetaspaceGtests#no-ccs > gtest/NMTGtests#nmt-detail > gtest/NMTGtests#nmt-off > gtest/NMTGtests#nmt-summary Rerun passed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25050#issuecomment-2855551279 From cjplummer at openjdk.org Tue May 6 18:51:18 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 6 May 2025 18:51:18 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v13] In-Reply-To: References: Message-ID: On Tue, 6 May 2025 06:31:43 GMT, Igor Veresov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > Igor Veresov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 43 commits: > > - Merge branch 'master' into pp2 > - Fix compile > - Fix additional issues > - Make sure command line flags that affect MDO layout are consistent > - Fix semantics change from the previous commit > - Port 8355915: [leyden] Crash in MDO clearing the unloaded array type > - Fix flag behavior > - Fix log tags > - Remove the proxy class counter > - Address review comments part 2 > - ... and 33 more: https://git.openjdk.org/jdk/compare/e09d2e27...7d22a42a src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/memory/FileMapInfo.java line 129: > 127: metadataTypeArray[5] = db.lookupType("InstanceStackChunkKlass"); > 128: metadataTypeArray[6] = db.lookupType("Method"); > 129: metadataTypeArray[9] = db.lookupType("MethodData"); It looks like MethodData inheriting from Metadata is not a new change, but has always been the case. I'm surprised this didn't cause any test failures before your changes. Did you end up with test failures after your changes? src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java line 154: > 152: if (!VM.getVM().isCore()) { > 153: virtualConstructor.addMapping("CompilerThread", CompilerThread.class); > 154: virtualConstructor.addMapping("TrainingReplayThread", TrainingReplayThread.class); The new SA TrainingReplayThread class is not needed since it only overrides isHiddenFromExternalView() to return true. You can instead use HiddenJavaThread.class here. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2076064357 PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2076058595 From shade at openjdk.org Tue May 6 19:31:17 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 6 May 2025 19:31:17 GMT Subject: RFR: 8356125: Interned strings are omitted from AOT cache In-Reply-To: References: Message-ID: On Mon, 5 May 2025 00:10:38 GMT, Ioi Lam wrote: > When dumping the interned string table in the AOT cache, we try to include only the strings that are inside ConstantPool::reference_array(). The hope is to limit the size of the AOT cache by omitting interned strings that are not used by objects inside the AOT cache. > > However, we have found two cases when the above scheme doesn't work. Please see the new test cases. > > The fix is to always include all interned strings managed by stringTable.cpp. We might try to omit the truly unused strings in a separate RFE. I think there is a generic problem in storing the entirety of `StringTable`: we may end up storing _a lot_ of excess `String`-s, and thus blow up the CDS archive size. I've known libraries that (ab)used `String.intern` for string deduplication, storing millions of `String`-s there. Do these bugs only reasonably affect potentially interned `String`-s that are reachable from `static` fields of initialized classes? Can we somehow only store those? src/hotspot/share/cds/heapShared.cpp line 609: > 607: > 608: void HeapShared::archive_strings() { > 609: oop shared_strings_array = StringTable::init_shared_strings_array(); I see the old comment here that we always succeed, because `StringTable::init_shared_table` does not create any large arrays. Is this still true? 
I see this in `StringTable::allocate_shared_strings_array`: if (ArchiveHeapWriter::is_too_large_to_archive(secondary_array_size)) { // This can only happen if you have an extremely large number of classes that // refer to more than 16384 * 16384 = 26M interned strings! Not a practical concern // but bail out for safety. log_error(cds)("Too many strings to be archived: %zu", _items_count); MetaspaceShared::unrecoverable_writing_error(); } If we archive the _entirety_ of `StringTable` now, then it is plausible we could archive > 26M Strings now? Maybe write a stress test to see that we are properly failing out of that? Can be (should be?) a follow-up. test/hotspot/jtreg/runtime/cds/appcds/aotClassLinking/NonFinalStaticWithInitVal.java line 64: > 62: > 63: class MyTestApp { > 64: volatile static int x = 0; Seems unused? ------------- PR Review: https://git.openjdk.org/jdk/pull/25026#pullrequestreview-2814332092 PR Review Comment: https://git.openjdk.org/jdk/pull/25026#discussion_r2075923092 PR Review Comment: https://git.openjdk.org/jdk/pull/25026#discussion_r2073165457 From alanb at openjdk.org Tue May 6 19:32:13 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 6 May 2025 19:32:13 GMT Subject: RFR: 8356209: Problemlist failed gtests on static-jdk In-Reply-To: References: Message-ID: On Mon, 5 May 2025 20:53:15 GMT, Jiangli Zhou wrote: > Please review this PR that problemlist's following gtests on static-jdk. These test binaries dynamically link with JDK/VM native libraries and fail on static-jdk as the runtime cannot find the required shared libraries. > > gtest/GTestWrapper > gtest/LargePageGtests#use-large-pages > gtest/LargePageGtests#use-large-pages-1G > gtest/LockStackGtests > gtest/MetaspaceGtests#no-ccs > gtest/NMTGtests#nmt-detail > gtest/NMTGtests#nmt-off > gtest/NMTGtests#nmt-summary Marked as reviewed by alanb (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25050#pullrequestreview-2819395158 From ihse at openjdk.org Tue May 6 19:32:13 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 6 May 2025 19:32:13 GMT Subject: RFR: 8356209: Problemlist failed gtests on static-jdk In-Reply-To: References: Message-ID: <1W70x8kjBmlHDYafChY55bBxPeSN-HcFfbvYYGswXEE=.64a5f2e4-febf-4c99-9abd-a9393d431f31@github.com> On Mon, 5 May 2025 20:53:15 GMT, Jiangli Zhou wrote: > Please review this PR that problemlist's following gtests on static-jdk. These test binaries dynamically link with JDK/VM native libraries and fail on static-jdk as the runtime cannot find the required shared libraries. > > gtest/GTestWrapper > gtest/LargePageGtests#use-large-pages > gtest/LargePageGtests#use-large-pages-1G > gtest/LockStackGtests > gtest/MetaspaceGtests#no-ccs > gtest/NMTGtests#nmt-detail > gtest/NMTGtests#nmt-off > gtest/NMTGtests#nmt-summary Is the plan to fix these tests to work properly on static-jdk, or to permanently leave them on the problemlist? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25050#issuecomment-2855695539 From lmesnik at openjdk.org Tue May 6 19:33:14 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 6 May 2025 19:33:14 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v9] In-Reply-To: References: Message-ID: On Tue, 6 May 2025 17:04:58 GMT, Serguei Spitsyn wrote: >> This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. >> >> Testing: Ran mach5 tiers 1-6. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > reivew: fix more trailing spaces Marked as reviewed by lmesnik (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24269#pullrequestreview-2819396994 From jiangli at openjdk.org Tue May 6 19:41:19 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 6 May 2025 19:41:19 GMT Subject: RFR: 8356209: Problemlist failed gtests on static-jdk In-Reply-To: <1W70x8kjBmlHDYafChY55bBxPeSN-HcFfbvYYGswXEE=.64a5f2e4-febf-4c99-9abd-a9393d431f31@github.com> References: <1W70x8kjBmlHDYafChY55bBxPeSN-HcFfbvYYGswXEE=.64a5f2e4-febf-4c99-9abd-a9393d431f31@github.com> Message-ID: On Tue, 6 May 2025 19:28:15 GMT, Magnus Ihse Bursie wrote: > Is the plan to fix these tests to work properly on static-jdk, or to permanently leave them on the problemlist? Let's follow up on that in https://bugs.openjdk.org/browse/JDK-8356201. I think this is similar to the test launcher executables (i.e. custom launcher executables), which require `libjvm` and other JDK native library dependencies. Build system changes are needed to build the test executables with the needed JDK/VM native libraries statically linked. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25050#issuecomment-2855735943 From jiangli at openjdk.org Tue May 6 19:41:20 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 6 May 2025 19:41:20 GMT Subject: RFR: 8356209: Problemlist failed gtests on static-jdk In-Reply-To: References: Message-ID: On Tue, 6 May 2025 19:29:34 GMT, Alan Bateman wrote: >> Please review this PR that problemlist's following gtests on static-jdk. These test binaries dynamically link with JDK/VM native libraries and fail on static-jdk as the runtime cannot find the required shared libraries. >> >> gtest/GTestWrapper >> gtest/LargePageGtests#use-large-pages >> gtest/LargePageGtests#use-large-pages-1G >> gtest/LockStackGtests >> gtest/MetaspaceGtests#no-ccs >> gtest/NMTGtests#nmt-detail >> gtest/NMTGtests#nmt-off >> gtest/NMTGtests#nmt-summary > > Marked as reviewed by alanb (Reviewer). Thanks for the review and approval, @AlanBateman!
------------- PR Comment: https://git.openjdk.org/jdk/pull/25050#issuecomment-2855737565 From jiangli at openjdk.org Tue May 6 19:41:20 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 6 May 2025 19:41:20 GMT Subject: Integrated: 8356209: Problemlist failed gtests on static-jdk In-Reply-To: References: Message-ID: <1lx_0osKEU3aQHQF_UckjzwyGZ1hndpwiNLPKRru-90=.3e588276-9784-47cf-990c-d3eaaad3f7bc@github.com> On Mon, 5 May 2025 20:53:15 GMT, Jiangli Zhou wrote: > Please review this PR that problemlist's following gtests on static-jdk. These test binaries dynamically link with JDK/VM native libraries and fail on static-jdk as the runtime cannot find the required shared libraries. > > gtest/GTestWrapper > gtest/LargePageGtests#use-large-pages > gtest/LargePageGtests#use-large-pages-1G > gtest/LockStackGtests > gtest/MetaspaceGtests#no-ccs > gtest/NMTGtests#nmt-detail > gtest/NMTGtests#nmt-off > gtest/NMTGtests#nmt-summary This pull request has now been integrated. Changeset: bed5114e Author: Jiangli Zhou URL: https://git.openjdk.org/jdk/commit/bed5114e3a061d13bbc2031334d73f4527309f90 Stats: 11 lines in 1 file changed: 11 ins; 0 del; 0 mod 8356209: Problemlist failed gtests on static-jdk Reviewed-by: alanb ------------- PR: https://git.openjdk.org/jdk/pull/25050 From zgu at openjdk.org Tue May 6 19:53:31 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Tue, 6 May 2025 19:53:31 GMT Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47] In-Reply-To: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> Message-ID: On Mon, 5 May 2025 14:22:07 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP draft: CPU Time based profiling for JFR]. >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). 
>>
>> A version based on the cooperative sampling JEP can be found [here](https://github.com/parttimenerd/jdk/tree/parttimenerd_cooperative_cpu_time_sampler).
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>
> Simplify local trace stack

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 331:

> 329: u4 state = Atomic::load_acquire(&e->_state);
> 330: if (state == state_empty(tail)) {
> 331: if (Atomic::cmpxchg(&_tail, tail, tail + 1, memory_order_seq_cst) == tail) {

I think it still has a race.

For example, _tail = 0, _capacity = 2
T1 enqueue: claim position 0, _tail = 1 after CAS
T2 enqueue: claim position 1, _tail = 2 after CAS
T2 enqueue: store value at position 1
T2 enqueue: claim position 0, _tail = 3 after CAS
T2 enqueue: store value at position 0
T1 enqueue: store value at position 0

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2076150740

From jbechberger at openjdk.org Tue May 6 20:30:34 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Tue, 6 May 2025 20:30:34 GMT
Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47]
In-Reply-To:
References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com>
Message-ID:

On Tue, 6 May 2025 19:50:19 GMT, Zhengyu Gu wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Simplify local trace stack
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 331:
>
>> 329: u4 state = Atomic::load_acquire(&e->_state);
>> 330: if (state == state_empty(tail)) {
>> 331: if (Atomic::cmpxchg(&_tail, tail, tail + 1, memory_order_seq_cst) == tail) {
>
> I think it still has a race.
>
> For example, _tail = 0, _capacity = 2 and queue is empty
> T1 enqueue: claim position 0, _tail = 1 after CAS
> T2 enqueue: claim position 1, _tail = 2 after CAS
> T2 enqueue: store value at position 1
> T2 enqueue: claim position 0, _tail = 3 after CAS
> T2 enqueue: store value at position 0
> T1 enqueue: store value at position 0

Could you help with solving this issue?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2076229576

From zgu at openjdk.org Tue May 6 20:53:25 2025
From: zgu at openjdk.org (Zhengyu Gu)
Date: Tue, 6 May 2025 20:53:25 GMT
Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47]
In-Reply-To:
References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com>
Message-ID: <_Ex-_ZtwkvwTQWEL-0XRvcHpgahuPAFg8qL8txW6Mfo=.eb1b5906-8007-4142-adb4-f0f90a2c25cb@github.com>

On Tue, 6 May 2025 20:27:28 GMT, Johannes Bechberger wrote:

> Could you help with solving this issue?

I can try.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2076259736

From mdoerr at openjdk.org Tue May 6 21:16:17 2025
From: mdoerr at openjdk.org (Martin Doerr)
Date: Tue, 6 May 2025 21:16:17 GMT
Subject: RFR: 8352251: Implement Cooperative JFR Sampling [v16]
In-Reply-To:
References: <2FEvWJYZrD5yRsmTCqrgR9Lit84szuFJxqwdpjghVog=.7deabc88-039f-423e-a4bd-e36399870273@github.com>
Message-ID: <27ga1Z__QxsqWxH7c0G7UfqiaPYJAP17thdyn_Mj6j8=.374f6033-4a81-4f32-a9b1-ebf9e72c3fbb@github.com>

On Mon, 5 May 2025 08:50:35 GMT, Markus Grönlund wrote:

>> Greetings,
>>
>> This is the implementation of JEP [JDK-8350338 Cooperative JFR Sampling](https://bugs.openjdk.org/browse/JDK-8350338).
>>
>> Implementations in this change set are provided and have been tested on the following platforms:
>>
>> - windows-x64
>> - windows-x64-debug
>> - linux-x64
>> - linux-x64-debug
>> - macosx-x64
>> - macosx-x64-debug
>> - linux-aarch64
>> - linux-aarch64-debug
>> - macosx-aarch64
>> - macosx-aarch64-debug
>>
>> Testing: tier1-6, jdk_jfr, stress testing.
>>
>> Platform porters note:
>> Some platform-specific code needs to be provided, mainly in the interpreter. Take a look at the following files for changes:
>>
>> - src/hotspot/cpu/x86/frame_x86.cpp
>> - src/hotspot/cpu/x86/interp_masm_x86.cpp
>> - src/hotspot/cpu/x86/interp_masm_x86.hpp
>> - src/hotspot/cpu/x86/javaFrameAnchor_x86.hpp
>> - src/hotspot/cpu/x86/macroAssembler_x86.cpp
>> - src/hotspot/cpu/x86/macroAssembler_x86.hpp
>> - src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp
>> - src/hotspot/cpu/x86/templateTable_x86.cpp
>> - src/hotspot/os_cpu/linux_x86/javaThread_linux_x86.hpp
>>
>> Thanks
>> Markus
>
> Markus Grönlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits:
>
> - Merge branch 'master' into 8352251
> - Configuration and test for jdk.SafepointLatency event
> - include guards
> - push back pd constants into pd code
> - Attempt to build Windows-AARCH64
> - No invariants for sender_for_interpreter_frame
> - zero
> - Merge branch 'master' into 8352251
> - Refine SamplingLatency event description
> - Update default.jfc
> - ... and 9 more: https://git.openjdk.org/jdk/compare/8511220f...e448090e

Ok, seems like the issue is that you still need some fields of the old top frame's interpreter state. It is still usable on PPC64 after the top frame has been popped off because the ABI allows using some space below the SP which will still contain the required state fields. Seems like other platforms don't guarantee to preserve space below SP (could probably get overwritten by a signal handler).
I think your version is not nicely implementable on platforms which don't have an FP register, but I'll take a look and think about it.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2856002242

From duke at openjdk.org Tue May 6 21:45:34 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Tue, 6 May 2025 21:45:34 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v3]
In-Reply-To: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
Message-ID:

> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm.
>
> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>
> For performance data collected with the built in **cbrt** micro-benchmark, see the table below. Each result is the mean of 8 individual runs. Overall, the intrinsic provides a performance uplift of 37%.
>
> | Benchmark | Throughput with baseline (op/s) | Throughput with intrinsic (op/s) | Speedup |
> | :----------------: | :----------------------------------: | :----------------------------------: | :---------: |
> | MathBench.cbrt | 152465 | 208537 | 1.37x |
>
> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision: Add new set of cbrt micro-benchmarks ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24470/files - new: https://git.openjdk.org/jdk/pull/24470/files/3212c669..57412f0d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24470&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24470&range=01-02 Stats: 148 lines in 1 file changed: 148 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24470.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24470/head:pull/24470 PR: https://git.openjdk.org/jdk/pull/24470 From sspitsyn at openjdk.org Tue May 6 21:46:21 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 May 2025 21:46:21 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v9] In-Reply-To: References: Message-ID: On Tue, 6 May 2025 17:04:58 GMT, Serguei Spitsyn wrote: >> This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. >> >> Testing: Ran mach5 tiers 1-6. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > reivew: fix more trailing spaces Thank you for review, Leonid! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24269#issuecomment-2856131702 From iveresov at openjdk.org Tue May 6 21:50:34 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Tue, 6 May 2025 21:50:34 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v14] In-Reply-To: References: Message-ID: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. 
Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. > > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24886/files - new: https://git.openjdk.org/jdk/pull/24886/files/7d22a42a..11e3c398 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=12-13 Stats: 36 lines in 2 files changed: 0 ins; 35 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From iveresov at openjdk.org Tue May 6 21:50:36 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Tue, 6 May 2025 21:50:36 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v13] In-Reply-To: References: Message-ID: On Tue, 6 May 2025 18:48:03 GMT, Chris Plummer wrote: >> Igor Veresov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 43 commits: >> >> - Merge branch 'master' into pp2 >> - Fix compile >> - Fix additional issues >> - Make sure command line flags that affect MDO layout are consistent >> - Fix semantics change from the previous commit >> - Port 8355915: [leyden] Crash in MDO clearing the unloaded array type >> - Fix flag behavior >> - Fix log tags >> - Remove the proxy class counter >> - Address review comments part 2 >> - ... 
and 33 more: https://git.openjdk.org/jdk/compare/e09d2e27...7d22a42a > > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/memory/FileMapInfo.java line 129: > >> 127: metadataTypeArray[5] = db.lookupType("InstanceStackChunkKlass"); >> 128: metadataTypeArray[6] = db.lookupType("Method"); >> 129: metadataTypeArray[9] = db.lookupType("MethodData"); > > It looks like MethodData inheriting from Metadata is not a new change, but has always been the case. I'm surprised this didn't cause any test failures before your changes. Did you end up with test failures after your changes? Honestly I don't remember, I think @iklam did these changes. > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java line 154: > >> 152: if (!VM.getVM().isCore()) { >> 153: virtualConstructor.addMapping("CompilerThread", CompilerThread.class); >> 154: virtualConstructor.addMapping("TrainingReplayThread", TrainingReplayThread.class); > > The new SA TrainingReplayThread class is not needed since it only overrides isHiddenFromExternalView() to return true. You can instead use HiddenJavaThread.class here. Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2076373507 PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2076369998 From sspitsyn at openjdk.org Tue May 6 22:12:24 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 May 2025 22:12:24 GMT Subject: Integrated: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out In-Reply-To: References: Message-ID: On Thu, 27 Mar 2025 01:10:54 GMT, Serguei Spitsyn wrote: > This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. > > Testing: Ran mach5 tiers 1-6. This pull request has now been integrated. 
Changeset: 9a23f721 Author: Patricio Chilano Mateo Committer: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/9a23f721c7bcbfdb2fcf5b2bd145d6967e000dc4 Stats: 204 lines in 13 files changed: 82 ins; 57 del; 65 mod 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out Reviewed-by: lmesnik, pchilanomate ------------- PR: https://git.openjdk.org/jdk/pull/24269 From iklam at openjdk.org Tue May 6 23:05:13 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 6 May 2025 23:05:13 GMT Subject: RFR: 8356125: Interned strings are omitted from AOT cache In-Reply-To: References: Message-ID: <2hoX2cMtO_T27H7LC2oJBIBHf8RC_2JPj7fB0JzvUmE=.83b2fdfa-97cd-4ed7-b1fc-6889d78f84f2@github.com> On Tue, 6 May 2025 17:08:36 GMT, Aleksey Shipilev wrote: >> When dumping the interned string table in the AOT cache, we try to include only the strings that are inside ConstantPool::reference_array(). The hope is to limit the size of the AOT cache by omitting interned strings that are not used by objects inside the AOT cache. >> >> However, we have found two cases when the above scheme doesn't work. Please see the new test cases. >> >> The fix is to always include all interned strings managed by stringTable.cpp. We might try to omit the truly unused strings in a separate RFE. > > src/hotspot/share/cds/heapShared.cpp line 609: > >> 607: >> 608: void HeapShared::archive_strings() { >> 609: oop shared_strings_array = StringTable::init_shared_strings_array(); > > I see the old comment here that we always succeed, because `StringTable::init_shared_table` does not create any large arrays. Is this still true? I see this in `StringTable::allocate_shared_strings_array`: > > > if (ArchiveHeapWriter::is_too_large_to_archive(secondary_array_size)) { > // This can only happen if you have an extremely large number of classes that > // refer to more than 16384 * 16384 = 26M interned strings! Not a practical concern > // but bail out for safety. 
>   log_error(cds)("Too many strings to be archived: %zu", _items_count);
>   MetaspaceShared::unrecoverable_writing_error();
> }
>
>
> If we archive the _entirety_ of `StringTable` now, then it is plausible we could archive > 26M Strings now? Maybe write a stress test to see that we are properly failing out of that? Can be (should be?) a follow-up.

Yes, the object returned by `StringTable::init_shared_strings_array()` will never reach any object that's too large to archive.

We don't allow arbitrary user code to execute when we dump the heap (either `java -Xshare:dump` or `java -XX:AOTMode=create`). So the number of interned strings will basically be limited to how many string literals you can fit into classfiles.

There will be some wastage, but this is mostly from string literals from classes that have been excluded. It should be fairly straightforward to eliminate the unreferenced interned strings -- modify AOTArtifactFinder to walk everything else except the interned string table. Afterwards, scan the interned string table and omit the strings that have not been picked up yet. This doesn't seem to be a big problem so we will probably do it after 25.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25026#discussion_r2076483766

From iklam at openjdk.org Tue May 6 23:11:58 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Tue, 6 May 2025 23:11:58 GMT
Subject: RFR: 8356125: Interned strings are omitted from AOT cache [v2]
In-Reply-To:
References:
Message-ID: <_eowrULNjIHYoi4xiTHyowVM-iuxJBIaBNqVLJaXRJI=.2f407b58-8076-4c6f-8fd2-c797266f1ddb@github.com>

> When dumping the interned string table in the AOT cache, we try to include only the strings that are inside ConstantPool::reference_array(). The hope is to limit the size of the AOT cache by omitting interned strings that are not used by objects inside the AOT cache.
>
> However, we have found two cases when the above scheme doesn't work. Please see the new test cases.
> > The fix is to always include all interned strings managed by stringTable.cpp. We might try to omit the truly unused strings in a separate RFE. Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Improved test case ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25026/files - new: https://git.openjdk.org/jdk/pull/25026/files/96bda5fd..c238aeaf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25026&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25026&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25026.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25026/head:pull/25026 PR: https://git.openjdk.org/jdk/pull/25026 From iklam at openjdk.org Tue May 6 23:11:58 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 6 May 2025 23:11:58 GMT Subject: RFR: 8356125: Interned strings are omitted from AOT cache [v2] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 09:56:35 GMT, Aleksey Shipilev wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Improved test case > > test/hotspot/jtreg/runtime/cds/appcds/aotClassLinking/NonFinalStaticWithInitVal.java line 64: > >> 62: >> 63: class MyTestApp { >> 64: volatile static int x = 0; > > Seems unused? I changed the code to read from this variable to prevent any sort of (future) inlining that javac may do. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25026#discussion_r2076497676

From sspitsyn at openjdk.org Wed May 7 00:28:18 2025
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Wed, 7 May 2025 00:28:18 GMT
Subject: RFR: 8356251: Need minor cleanup for interp_only_mode
In-Reply-To: <8-jSItxSwhqU969dSkiClU-YDK-nUULgh1bcnS9HMdg=.6e6912e4-45fb-48ea-b62f-8d0286088283@github.com>
References: <8-jSItxSwhqU969dSkiClU-YDK-nUULgh1bcnS9HMdg=.6e6912e4-45fb-48ea-b62f-8d0286088283@github.com>
Message-ID:

On Tue, 6 May 2025 18:29:22 GMT, Leonid Mesnik wrote:

>> This is a minor cleanup for the JVMTI `interp_only_mode` implementation which includes the following changes:
>> - The `interp_only_mode` in `JavaThread` is represented with a counter which is incremented and decremented. This is confusing because this value should only take values `0` or `1`. Asserts are placed to make sure it is never going out of bounds. The `interp_only_mode` in a `JavaThread` is checked by the interpreter chunks which expect it to be an `integer`. This cleanup has no intention to make it a boolean.
>> - The function `JvmtiThreadState::process_pending_interp_only()` does a sync on the `JvmtiThreadState_lock` which is not really needed and is being removed. It is called in a `VTMS` transition and so, can not clash with the `SetEventNotificationMode` because it sets a `JvmtiVTMSTransitionDisabler`.
>>
>> Testing:
>> - TBD: Mach5 tiers 1-6
>
> src/hotspot/share/runtime/javaThread.hpp line 1177:
>
>> 1175: bool is_interp_only_mode() { return (_interp_only_mode != 0); }
>> 1176: int get_interp_only_mode() { return _interp_only_mode; }
>> 1177: int set_interp_only_mode(int val) { return _interp_only_mode = val; }
>
> The get_interp_only_mode()/set_interp_only_mode(int val) also might be eliminated and replaced by set/clear instead.

Good suggestion, thanks. Updated, it is being tested now.
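For readers following the set/clear suggestion above, here is a minimal sketch of what such an accessor pair could look like. It is not the code from the PR: `ThreadModeFlag` and its method bodies are hypothetical, and the only assumptions taken from the thread are that the field stays an `int`, that it should only hold `0` or `1`, and that asserts guard the transitions.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant part of JavaThread; not HotSpot code.
class ThreadModeFlag {
  // Kept as an int because, per the PR description, the interpreter reads it as an integer.
  int _interp_only_mode = 0;

 public:
  bool is_interp_only_mode() const { return _interp_only_mode != 0; }

  // Replaces a generic setter: the only legal transition here is 0 -> 1.
  void set_interp_only_mode() {
    assert(_interp_only_mode == 0 && "interp_only_mode already set");
    _interp_only_mode = 1;
  }

  // The only legal transition here is 1 -> 0.
  void clear_interp_only_mode() {
    assert(_interp_only_mode == 1 && "interp_only_mode not set");
    _interp_only_mode = 0;
  }
};
```

With a set/clear pair, callers cannot store an out-of-range value, which is the property the asserts in the cleanup are meant to enforce.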
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25060#discussion_r2076576748 From sspitsyn at openjdk.org Wed May 7 00:36:34 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 May 2025 00:36:34 GMT Subject: RFR: 8356251: Need minor cleanup for interp_only_mode [v2] In-Reply-To: References: Message-ID: > This is a minor cleanup for the JVMTI `interp_only_mode` implementation which includes the following changes: > - The `interp_only_mode` in `JavaThread` is represented with a counter which is incremented and decremented. This is confusing because this value should only take values `0` or `1`. Asserts are placed to make sure it is never going out of bounds. The `interp_only_mode` in a `JavaThread` is checked by the interpreter chunks which expect it to be an `integer`. This cleanup has no intention to make it a boolean. > - The function `JvmtiThreadState::process_pending_interp_only()` does a sync on the `JvmtiThreadState_lock` which is not really needed and is being removed. It is called in a `VTMS` transition and so, can not clash with the `SetEventNotificationMode` because it sets a `JvmtiVTMSTransitionDisabler`. 
> > Testing: > - TBD: Mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: remove get_interp_only_mode(), set_interp_only_mode() and clear_interp_only_mode() ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25060/files - new: https://git.openjdk.org/jdk/pull/25060/files/d1ac6b5f..c4d167c4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25060&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25060&range=00-01 Stats: 12 lines in 4 files changed: 1 ins; 4 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/25060.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25060/head:pull/25060 PR: https://git.openjdk.org/jdk/pull/25060 From zgu at openjdk.org Wed May 7 01:27:24 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 7 May 2025 01:27:24 GMT Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47] In-Reply-To: <_Ex-_ZtwkvwTQWEL-0XRvcHpgahuPAFg8qL8txW6Mfo=.eb1b5906-8007-4142-adb4-f0f90a2c25cb@github.com> References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> <_Ex-_ZtwkvwTQWEL-0XRvcHpgahuPAFg8qL8txW6Mfo=.eb1b5906-8007-4142-adb4-f0f90a2c25cb@github.com> Message-ID: On Tue, 6 May 2025 20:50:17 GMT, Zhengyu Gu wrote: >> Could you help with solving this issue? > >> Could you help with solving this issue? > > I can try. `JfrTraceQueue` is not used in signal handlers, maybe you can use existing `LockfreeStack` instead? 
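The interleaving described earlier in this thread, where two enqueuers end up writing to the same ring-buffer slot, is the hazard that bounded lock-free queues commonly avoid by stamping every slot with a sequence number. The following is a minimal sketch of that general technique (a Vyukov-style bounded queue), assuming a power-of-two capacity of at least 2 and a default-constructible element type. `BoundedQueue` and all of its members are hypothetical names; this is not the `JfrTraceQueue` from the PR and not a proposed fix for it.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical illustration only; not HotSpot code.
template <typename T>
class BoundedQueue {
  struct Slot {
    std::atomic<std::size_t> seq;  // which position may use this slot next
    T value;
  };

  const std::size_t _mask;
  std::unique_ptr<Slot[]> _slots;
  std::atomic<std::size_t> _head{0};
  std::atomic<std::size_t> _tail{0};

 public:
  explicit BoundedQueue(std::size_t capacity)
      : _mask(capacity - 1), _slots(new Slot[capacity]) {
    assert(capacity >= 2 && (capacity & (capacity - 1)) == 0);
    for (std::size_t i = 0; i < capacity; i++) {
      _slots[i].seq.store(i, std::memory_order_relaxed);
    }
  }

  // Returns false if the slot for the next position is not free yet.
  bool enqueue(const T& v) {
    std::size_t pos = _tail.load(std::memory_order_relaxed);
    for (;;) {
      Slot& s = _slots[pos & _mask];
      std::size_t seq = s.seq.load(std::memory_order_acquire);
      std::ptrdiff_t diff =
          static_cast<std::ptrdiff_t>(seq) - static_cast<std::ptrdiff_t>(pos);
      if (diff == 0) {
        // Slot is free for exactly this position; try to claim it.
        if (_tail.compare_exchange_weak(pos, pos + 1, std::memory_order_relaxed)) {
          s.value = v;
          s.seq.store(pos + 1, std::memory_order_release);  // publish to consumers
          return true;
        }
        // CAS failure reloaded pos; retry with the new value.
      } else if (diff < 0) {
        return false;  // previous lap's element is still there or still being written
      } else {
        pos = _tail.load(std::memory_order_relaxed);
      }
    }
  }

  // Returns false if no fully published element is available.
  bool dequeue(T& out) {
    std::size_t pos = _head.load(std::memory_order_relaxed);
    for (;;) {
      Slot& s = _slots[pos & _mask];
      std::size_t seq = s.seq.load(std::memory_order_acquire);
      std::ptrdiff_t diff =
          static_cast<std::ptrdiff_t>(seq) - static_cast<std::ptrdiff_t>(pos + 1);
      if (diff == 0) {
        if (_head.compare_exchange_weak(pos, pos + 1, std::memory_order_relaxed)) {
          out = s.value;
          // Hand the slot to the producer of the next lap.
          s.seq.store(pos + _mask + 1, std::memory_order_release);
          return true;
        }
      } else if (diff < 0) {
        return false;
      } else {
        pos = _head.load(std::memory_order_relaxed);
      }
    }
  }
};
```

In the T1/T2 scenario with capacity 2, T2's third enqueue would target position 2 (slot 0), find that slot's sequence number still at its lap-0 value because T1 has not published yet, and return false instead of claiming the slot a second time. Whether this shape fits the sampler's constraints is a separate question that the thread leaves open.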
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2076618189 From kvn at openjdk.org Wed May 7 02:08:18 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 7 May 2025 02:08:18 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v6] In-Reply-To: References: Message-ID: On Tue, 6 May 2025 09:31:31 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Also handle UL printing My testing for v05 passed. 
Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24984#pullrequestreview-2820134519 From cjplummer at openjdk.org Wed May 7 02:32:18 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 7 May 2025 02:32:18 GMT Subject: RFR: 8356251: Need minor cleanup for interp_only_mode [v2] In-Reply-To: References: Message-ID: On Wed, 7 May 2025 00:36:34 GMT, Serguei Spitsyn wrote: >> This is a minor cleanup for the JVMTI `interp_only_mode` implementation which includes the following changes: >> - The `interp_only_mode` in `JavaThread` is represented with a counter which is incremented and decremented. This is confusing because this value should only take values `0` or `1`. Asserts are placed to make sure it is never going out of bounds. The `interp_only_mode` in a `JavaThread` is checked by the interpreter chunks which expect it to be an `integer`. This cleanup has no intention to make it a boolean. >> - The function `JvmtiThreadState::process_pending_interp_only()` does a sync on the `JvmtiThreadState_lock` which is not really needed and is being removed. It is called in a `VTMS` transition and so, can not clash with the `SetEventNotificationMode` because it sets a `JvmtiVTMSTransitionDisabler`. >> >> Testing: >> - TBD: Mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: remove get_interp_only_mode(), set_interp_only_mode() and clear_interp_only_mode() > The interp_only_mode in a JavaThread is checked by the interpreter chunks which expect it to be an integer. This cleanup has no intention to make it a boolean. I think this should be documented. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25060#issuecomment-2856839161 From amitkumar at openjdk.org Wed May 7 04:16:17 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 7 May 2025 04:16:17 GMT Subject: RFR: 8350308: [s390x] Relativize last_sp (and top_frame_sp) in interpreter frames In-Reply-To: References: Message-ID: On Wed, 19 Feb 2025 09:29:03 GMT, Amit Kumar wrote: > s390 Port for [JDK-8308984](https://bugs.openjdk.org/browse/JDK-8308984). > > This PR depends on https://github.com/openjdk/jdk/pull/23660. > > > Tier1 tests with fastdebug-vm show no regression. Thanks Lutz, Richard for the approval :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/23690#issuecomment-2856973385 From amitkumar at openjdk.org Wed May 7 04:16:17 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 7 May 2025 04:16:17 GMT Subject: Integrated: 8350308: [s390x] Relativize last_sp (and top_frame_sp) in interpreter frames In-Reply-To: References: Message-ID: On Wed, 19 Feb 2025 09:29:03 GMT, Amit Kumar wrote: > s390 Port for [JDK-8308984](https://bugs.openjdk.org/browse/JDK-8308984). > > This PR depends on https://github.com/openjdk/jdk/pull/23660. > > > Tier1 tests with fastdebug-vm show no regression. This pull request has now been integrated. 
Changeset: 0eb680ca Author: Amit Kumar URL: https://git.openjdk.org/jdk/commit/0eb680ca463e8df20f058d2c0a09ed7006faa353 Stats: 11 lines in 3 files changed: 8 ins; 0 del; 3 mod 8350308: [s390x] Relativize last_sp (and top_frame_sp) in interpreter frames Reviewed-by: lucy, rrich ------------- PR: https://git.openjdk.org/jdk/pull/23690 From dholmes at openjdk.org Wed May 7 05:19:17 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 7 May 2025 05:19:17 GMT Subject: RFR: 8355648: Thread.SpinAcquire()'s lock name parameter is not used In-Reply-To: References: Message-ID: On Sun, 27 Apr 2025 00:41:45 GMT, Zhengyu Gu wrote: > Please review this trivial change that removes unused lock name parameter. I'm pretty sure the point of the name was to give something to inspect whilst debugging a hang. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24901#issuecomment-2857081556 From aboldtch at openjdk.org Wed May 7 06:15:14 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 7 May 2025 06:15:14 GMT Subject: RFR: 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding [v2] In-Reply-To: References: Message-ID: On Tue, 6 May 2025 10:21:54 GMT, Jatin Bhateja wrote: >> This is a follow-up PR that fixes the crashes seen after the integration of PR #24664 >> >> ZGC bookkeeps multiple place holders in barrier code snippets through relocations, these are later used to patch appropriate contents (mostly immediate values) in instruction encoding to save costly comparisons against global state [1]. While most of the relocation records the patching offsets from the end of the instruction, SHL/R instructions used for pointer coloring/uncoloring, compute the patching offset from the starting address of the instruction. This was done to prevent accidental sharing of relocation information with subsequent relocatable instructions, e.g., static call. 
[2]
>>
>> In case the destination register operand of SHL/R instruction is an extended GPR register, we miss accounting additional REX2 prefix byte in the patch offset, thereby corrupting the encoding since runtime patches the primary opcode byte, resulting in an ILLEGAL instruction exception.
>>
>> This patch fixes reported failures by computing the relocation offset of the SHL/R instruction from the end of the instruction, thereby making the patch offset agnostic to the REX/REX2 prefix. To be safe, we emit a NOP instruction between the SHL/R and the subsequent relocatable instruction.
>>
>> Please review and share your feedback.
>>
>> Best Regards,
>> Jatin
>>
>> [1] https://openjdk.org/jeps/439#:~:text=we%20reduce%20this,changes%20phase%3B
>> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1873
>>
>>
>> PS: Validations were performed using the latest Intel Software Development Emulator after modifying the static register allocation order in x86_64.ad file giving preference to EGPRs.
>
> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>
> 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding

As I cannot test this on APX-enabled hardware, I will leave the testing and verifying that this approach works up to you. But the change looks good, and it maintains the original behaviour for non-APX-enabled hardware.

-------------

Marked as reviewed by aboldtch (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24919#pullrequestreview-2820461864 From jbhateja at openjdk.org Wed May 7 06:19:17 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 7 May 2025 06:19:17 GMT Subject: RFR: 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding [v2] In-Reply-To: References: Message-ID: On Tue, 6 May 2025 10:21:54 GMT, Jatin Bhateja wrote: >> This is a follow-up PR that fixes the crashes seen after the integration of PR #24664 >> >> ZGC bookkeeps multiple place holders in barrier code snippets through relocations, these are later used to patch appropriate contents (mostly immediate values) in instruction encoding to save costly comparisons against global state [1]. While most of the relocation records the patching offsets from the end of the instruction, SHL/R instructions used for pointer coloring/uncoloring, compute the patching offset from the starting address of the instruction. This was done to prevent accidental sharing of relocation information with subsequent relocatable instructions, e.g., static call. [2] >> >> In case the destination register operand of SHL/R instruction is an extended GPR register, we miss accounting additional REX2 prefix byte in the patch offset, thereby corrupting the encoding since runtime patches the primary opcode byte, resulting in an ILLEGAL instruction exception. >> >> This patch fixes reported failures by computing the relocation offset of the SHL/R instruction from the end of the instruction, thereby making the patch offset agnostic to the REX/REX2 prefix. To be safe, we emit a NOP instruction between the SHL/R and the subsequent relocatable instruction. >> >> Please review and share your feedback. 
>> >> Best Regards, >> Jatin >> >> [1] https://openjdk.org/jeps/439#:~:text=we%20reduce%20this,changes%20phase%3B >> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1873 >> >> >> PS: Validations were performed using the latest Intel Software Development Emulator after modifying the static register allocation order in x86_64.ad file giving preference to EGPRs. > > Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding Hi @TobiHartmann , @eme64 , can you kindly run this version through your test infra. This is an APX-specific issue. I have verified its correctness using SDE, both following tests are now passing. https://github.com/openjdk/jdk/tree/master/test/hotspot/jtreg/compiler/c2/irTests/gc ------------- PR Comment: https://git.openjdk.org/jdk/pull/24919#issuecomment-2857197887 From jbechberger at openjdk.org Wed May 7 07:04:26 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 7 May 2025 07:04:26 GMT Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47] In-Reply-To: References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> <_Ex-_ZtwkvwTQWEL-0XRvcHpgahuPAFg8qL8txW6Mfo=.eb1b5906-8007-4142-adb4-f0f90a2c25cb@github.com> Message-ID: <-xKnbNHWbQHXX3G3rmQLzEF8UaRYiTK0kfj0G5HYIlI=.e2db70a8-128e-426f-80b3-8535c4c32767@github.com> On Wed, 7 May 2025 01:24:27 GMT, Zhengyu Gu wrote: >>> Could you help with solving this issue? >> >> I can try. > > `JfrTraceQueue` is not used inside of signal handlers, maybe you can use existing `LockfreeStack` or just go with locked implementation instead? You mean the stack? 
The trace queue is used in every signal handler to obtain a new trace. But with the stack I have the problem as explained before: I would need to store the elements somewhere and therefore need a wrapper class that handles this. This offers not much benefit in my opinion. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2076943221 From duke at openjdk.org Wed May 7 07:40:25 2025 From: duke at openjdk.org (duke) Date: Wed, 7 May 2025 07:40:25 GMT Subject: Withdrawn: 8321529: log_on_large_pages_failure reports log_debug(gc, heap, coops) for ReservedCodeSpace failures In-Reply-To: References: Message-ID: On Fri, 24 Jan 2025 11:29:43 GMT, Stefan Karlsson wrote: > The code path that we use to reserve memory is generic and used by various paths in the JVM, but we log messages about failures to reserve large pages on the 'gc, heap, coops' tag set. This is confusing, so I propose to log this on 'os, map' instead. We already use that tag set to log memory reservation, so I think that's a decent tag set to use. > > While doing this change I also added some extra info about the area that we tried to reserve and commit. > > A couple of G1 tests had to be tweaked. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/jdk/pull/23297 From thartmann at openjdk.org Wed May 7 07:48:16 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 7 May 2025 07:48:16 GMT Subject: RFR: 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding [v2] In-Reply-To: References: Message-ID: <1gGtDEUALoWyrLQwwRD9bo2wb55O5Lh2DTnWTXQ8Oe8=.45ef5737-2ea6-4179-a998-79d8d51aca13@github.com> On Tue, 6 May 2025 10:21:54 GMT, Jatin Bhateja wrote: >> This is a follow-up PR that fixes the crashes seen after the integration of PR #24664 >> >> ZGC bookkeeps multiple place holders in barrier code snippets through relocations, these are later used to patch appropriate contents (mostly immediate values) in instruction encoding to save costly comparisons against global state [1]. While most of the relocation records the patching offsets from the end of the instruction, SHL/R instructions used for pointer coloring/uncoloring, compute the patching offset from the starting address of the instruction. This was done to prevent accidental sharing of relocation information with subsequent relocatable instructions, e.g., static call. [2] >> >> In case the destination register operand of SHL/R instruction is an extended GPR register, we miss accounting additional REX2 prefix byte in the patch offset, thereby corrupting the encoding since runtime patches the primary opcode byte, resulting in an ILLEGAL instruction exception. >> >> This patch fixes reported failures by computing the relocation offset of the SHL/R instruction from the end of the instruction, thereby making the patch offset agnostic to the REX/REX2 prefix. To be safe, we emit a NOP instruction between the SHL/R and the subsequent relocatable instruction. >> >> Please review and share your feedback. 
>> >> Best Regards, >> Jatin >> >> [1] https://openjdk.org/jeps/439#:~:text=we%20reduce%20this,changes%20phase%3B >> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1873 >> >> >> PS: Validations were performed using the latest Intel Software Development Emulator after modifying the static register allocation order in x86_64.ad file giving preference to EGPRs. > > Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding Sure, I'll run it through testing and report back. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24919#issuecomment-2857462391 From shade at openjdk.org Wed May 7 08:03:29 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 7 May 2025 08:03:29 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v7] In-Reply-To: References: Message-ID: > In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: > 1. Time spent before queuing: shows the compilation queue bottlenecks > 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load > 3. Time spent actually compiling: shows the per-method compilation costs > > We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). > > The difference from the output format we ship in Leyden: > 1. This output prints before/after the compilation to maintain old behavior partially. 
The "before" printout is now prepended with `started` to clearly mark it as such. > 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. > > See the sample `-XX:+PrintCompilation` output in the comments. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `compiler` > - [x] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: - Merge branch 'master' into JDK-8356027-print-compilation-timings - Also handle UL printing - Only record non-empty inlining messages - Merge W+Q => Q - Revert the shared printing block - Add legend - Merge branch 'master' into JDK-8356027-print-compilation-timings - Test TestDuplicatedLateInliningOutput.java - More touchups - Fix TypeProfileFinalMethod as well - ... and 3 more: https://git.openjdk.org/jdk/compare/50895835...2095d592 ------------- Changes: https://git.openjdk.org/jdk/pull/24984/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=06 Stats: 115 lines in 9 files changed: 83 ins; 7 del; 25 mod Patch: https://git.openjdk.org/jdk/pull/24984.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24984/head:pull/24984 PR: https://git.openjdk.org/jdk/pull/24984 From shade at openjdk.org Wed May 7 08:08:15 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 7 May 2025 08:08:15 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v7] In-Reply-To: References: Message-ID: On Wed, 7 May 2025 08:03:29 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. 
Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: > > - Merge branch 'master' into JDK-8356027-print-compilation-timings > - Also handle UL printing > - Only record non-empty inlining messages > - Merge W+Q => Q > - Revert the shared printing block > - Add legend > - Merge branch 'master' into JDK-8356027-print-compilation-timings > - Test TestDuplicatedLateInliningOutput.java > - More touchups > - Fix TypeProfileFinalMethod as well > - ... and 3 more: https://git.openjdk.org/jdk/compare/50895835...2095d592 Fixed a merge conflict. 
Lifted new UL messages to `Info` to match [JDK-8356259](https://bugs.openjdk.org/browse/JDK-8356259). Checked that the logs still make sense. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2857513729 From mli at openjdk.org Wed May 7 08:36:15 2025 From: mli at openjdk.org (Hamlin Li) Date: Wed, 7 May 2025 08:36:15 GMT Subject: RFR: 8355698: JDK not supporting sleef could cause exception at runtime after JDK-8353786 In-Reply-To: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com> References: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com> Message-ID: On Mon, 28 Apr 2025 10:34:49 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > > Before [JDK-8353786](https://bugs.openjdk.org/browse/JDK-8353786), when a released JDK built without sleef support (for any reason, e.g. low gcc version, intrinsic not supported, rvv not supported, and so on) ran on a machine that supports vector operations (e.g. a riscv machine with rvv), it could not call into sleef, but it did not fail either: it fell back to the Java scalar implementation. > After [JDK-8353786](https://bugs.openjdk.org/browse/JDK-8353786), the same setup throws an exception at runtime. > > This changes the behaviour of existing JDKs, and it should not throw an exception in any case. > > @iwanowww @RealFYang > > Thanks! Thanks, I'll check it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24914#issuecomment-2857667370 From amitkumar at openjdk.org Wed May 7 08:45:56 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 7 May 2025 08:45:56 GMT Subject: RFR: 8350398: [s390x] Relativize initial_sp/monitors in interpreter frames Message-ID: s390x port for [JDK-8315966](https://bugs.openjdk.org/browse/JDK-8315966). This PR depends on https://github.com/openjdk/jdk/pull/23660 because the index calculation macro used by this PR is added by that one.
// Frame slot index relative to fp
#define _z_ijava_idx(_component) \
  (_z_ijava_state_neg(_component) >> LogBytesPerWord)

-------------

Commit messages:
 - Z_R0 will work
 - Merge branch 'master' into monitor_rel
 - fix sign extension
 - relativization for monitors

Changes: https://git.openjdk.org/jdk/pull/23708/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23708&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8350398
  Stats: 50 lines in 5 files changed: 34 ins; 7 del; 9 mod
  Patch: https://git.openjdk.org/jdk/pull/23708.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23708/head:pull/23708

PR: https://git.openjdk.org/jdk/pull/23708

From amitkumar at openjdk.org Wed May 7 08:45:57 2025
From: amitkumar at openjdk.org (Amit Kumar)
Date: Wed, 7 May 2025 08:45:57 GMT
Subject: RFR: 8350398: [s390x] Relativize initial_sp/monitors in interpreter frames
In-Reply-To:
References:
Message-ID:

On Thu, 20 Feb 2025 06:40:20 GMT, Amit Kumar wrote:

> s390x port for [JDK-8315966](https://bugs.openjdk.org/browse/JDK-8315966).
>
> This PR depends on https://github.com/openjdk/jdk/pull/23660 because the index calculation macro used by this PR is added by that one.
>
>
> // Frame slot index relative to fp
> #define _z_ijava_idx(_component) \
>   (_z_ijava_state_neg(_component) >> LogBytesPerWord)

src/hotspot/cpu/s390/interp_masm_s390.cpp line 689:

> 687: stop("Z_fp is corrupted");
> 688: bind(ok);
> 689: #endif // ASSERT

Hi @RealLucy, should I remove or keep this check here? It wouldn't do any harm in a release build anyway, but in a debug build the size of the `BTB` table will increase. See the `TemplateTable_s390.cpp` changes.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23708#discussion_r2072838473 From jbechberger at openjdk.org Wed May 7 09:27:33 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 7 May 2025 09:27:33 GMT Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47] In-Reply-To: <-xKnbNHWbQHXX3G3rmQLzEF8UaRYiTK0kfj0G5HYIlI=.e2db70a8-128e-426f-80b3-8535c4c32767@github.com> References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> <_Ex-_ZtwkvwTQWEL-0XRvcHpgahuPAFg8qL8txW6Mfo=.eb1b5906-8007-4142-adb4-f0f90a2c25cb@github.com> <-xKnbNHWbQHXX3G3rmQLzEF8UaRYiTK0kfj0G5HYIlI=.e2db70a8-128e-426f-80b3-8535c4c32767@github.com> Message-ID: On Wed, 7 May 2025 07:01:54 GMT, Johannes Bechberger wrote: >> `JfrTraceQueue` is not used inside of signal handlers, maybe you can use existing `LockfreeStack` or just go with locked implementation instead? > > You mean the stack? The trace queue is used in every signal handler to obtain a new trace. > But with the stack I have the problem as explained before: I would need to store the elements somewhere and therefore need a wrapper class that handles this. This offers not much benefit in my opinion. @apangin could you chime in? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2077213561 From aph at openjdk.org Wed May 7 09:28:19 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 7 May 2025 09:28:19 GMT Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v3] In-Reply-To: References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com> Message-ID: On Tue, 6 May 2025 21:45:34 GMT, Mohamed Issa wrote: >> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. There is a new set of micro-benchmarks are included to check the performance of specific input value ranges to help prevent regressions in the future. 
>>
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>>
>> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>>
>> | Input range(s) | Throughput with baseline (op/s) | Throughput with intrinsic (op/s) | Speedup |
>> | :-------------------------------------: | :----------------------------------: | :----------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)] | 6568 | 17678 | 2.69x |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932 | 200897 | 1.45x |
>>
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision:
>
> Add new set of cbrt micro-benchmarks

src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 62:

> 60: {
> 61: 0, 3220193280
> 62: };

What is this constant? Its value is 0xbff0400000000000, which has the sign bit set and the bias (top bit of the exponent) clear, but one of the bits in the fraction is set. So its value is -0x1.04p+0. In addition to the exponent, it also sets a single bit just below the 5 most significant bits of the fraction. I guess this in effect rounds up the value that is added in the final rounding. Is that right?
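
The arithmetic in the question above can be checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and it assumes the two table entries are the low and high 32-bit words of one IEEE 754 double, low word first, which is the ordering that yields the 0xbff0400000000000 pattern quoted in the review.

```java
// Hypothetical helper, not part of the PR: decodes the two 32-bit table words
// under the assumption that the first entry is the low word of a double.
final class CbrtConstantDecoder {
    // Join a low and a high 32-bit word into one 64-bit bit pattern.
    static long toBits(long lowWord, long highWord) {
        return (highWord << 32) | (lowWord & 0xFFFFFFFFL);
    }

    // Extract the 52-bit fraction field of an IEEE 754 double bit pattern.
    static long fraction(long bits) {
        return bits & 0x000FFFFFFFFFFFFFL;
    }

    public static void main(String[] args) {
        long bits = toBits(0L, 3220193280L);
        double value = Double.longBitsToDouble(bits);
        System.out.println(Long.toHexString(bits));           // bff0400000000000
        System.out.println(Double.toHexString(value));        // -0x1.04p0
        System.out.println(value);                            // -1.015625
        // The only fraction bit set is bit 46, i.e. the sixth bit from the top (2^-6).
        System.out.println(Long.numberOfTrailingZeros(fraction(bits))); // 46
    }
}
```

The decoded value, -(1 + 2^-6), matches the -0x1.04p+0 reading in the comment; whether that bit serves the final rounding step is the open question for the PR author, and the sketch does not answer it.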
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2077214995 From rkennke at openjdk.org Wed May 7 10:11:52 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 7 May 2025 10:11:52 GMT Subject: RFR: 8356329: Report compact object headers in hs_err Message-ID: We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. ------------- Commit messages: - Add test case - Add missing format string - 8356329: Report compact object headers in hs_err Changes: https://git.openjdk.org/jdk/pull/25080/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25080&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356329 Stats: 77 lines in 2 files changed: 76 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25080.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25080/head:pull/25080 PR: https://git.openjdk.org/jdk/pull/25080 From shade at openjdk.org Wed May 7 11:51:04 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 7 May 2025 11:51:04 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v8] In-Reply-To: References: Message-ID: > In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: > 1. Time spent before queuing: shows the compilation queue bottlenecks > 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load > 3. Time spent actually compiling: shows the per-method compilation costs > > We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). > > The difference from the output format we ship in Leyden: > 1. 
This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. > 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. > > See the sample `-XX:+PrintCompilation` output in the comments. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `compiler` > - [x] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Do microseconds for timings ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24984/files - new: https://git.openjdk.org/jdk/pull/24984/files/2095d592..47019e07 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24984&range=06-07 Stats: 10 lines in 3 files changed: 7 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/24984.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24984/head:pull/24984 PR: https://git.openjdk.org/jdk/pull/24984 From shade at openjdk.org Wed May 7 11:51:06 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 7 May 2025 11:51:06 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v7] In-Reply-To: References: Message-ID: On Wed, 7 May 2025 08:03:29 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. 
Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: > > - Merge branch 'master' into JDK-8356027-print-compilation-timings > - Also handle UL printing > - Only record non-empty inlining messages > - Merge W+Q => Q > - Revert the shared printing block > - Add legend > - Merge branch 'master' into JDK-8356027-print-compilation-timings > - Test TestDuplicatedLateInliningOutput.java > - More touchups > - Fix TypeProfileFinalMethod as well > - ... and 3 more: https://git.openjdk.org/jdk/compare/50895835...2095d592 I started building the visualizer for the new output ([JDK-8356383](https://bugs.openjdk.org/browse/JDK-8356383)), and realized the millisecond timings for compiler events are a bit too coarse to be useful. 
With queue and compilation times in microseconds and below 1ms, the math to compute when the task was created accumulates visible errors. So I suggest we switch the first column to microseconds as well. This also makes output consistently in microseconds in all columns. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2858278528 From rkennke at openjdk.org Wed May 7 11:59:34 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 7 May 2025 11:59:34 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v2] In-Reply-To: References: Message-ID: <8cudUmWHhJqYa1gdkFa_LprekP95yXEvWN34Xp3Y470=.8371776c-0f41-49fc-9ac6-38fdadce6aca@github.com> > We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Remove unnecessary patterns ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25080/files - new: https://git.openjdk.org/jdk/pull/25080/files/75e71d85..ff2d421f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25080&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25080&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25080.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25080/head:pull/25080 PR: https://git.openjdk.org/jdk/pull/25080 From shade at openjdk.org Wed May 7 12:58:15 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 7 May 2025 12:58:15 GMT Subject: RFR: 8356125: Interned strings are omitted from AOT cache [v2] In-Reply-To: <_eowrULNjIHYoi4xiTHyowVM-iuxJBIaBNqVLJaXRJI=.2f407b58-8076-4c6f-8fd2-c797266f1ddb@github.com> References: <_eowrULNjIHYoi4xiTHyowVM-iuxJBIaBNqVLJaXRJI=.2f407b58-8076-4c6f-8fd2-c797266f1ddb@github.com> Message-ID: 
<81tQFSBJ4rvDOiu_OBRXhP25Y6Q7fiYaSNYmao_RXDU=.4e9b9058-1e93-49ce-98d3-b238557fde94@github.com> On Tue, 6 May 2025 23:11:58 GMT, Ioi Lam wrote: >> When dumping the interned string table in the AOT cache, we try to include only the strings that are inside ConstantPool::reference_array(). The hope is to limit the size of the AOT cache by omitting interned strings that are not used by objects inside the AOT cache. >> >> However, we have found two cases when the above scheme doesn't work. Please see the new test cases. >> >> The fix is to always include all interned strings managed by stringTable.cpp. We might try to omit the truly unused strings in a separate RFE. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Improved test case Looks fine to me, thanks. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25026#pullrequestreview-2821645368 From shade at openjdk.org Wed May 7 12:58:17 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 7 May 2025 12:58:17 GMT Subject: RFR: 8356125: Interned strings are omitted from AOT cache [v2] In-Reply-To: <2hoX2cMtO_T27H7LC2oJBIBHf8RC_2JPj7fB0JzvUmE=.83b2fdfa-97cd-4ed7-b1fc-6889d78f84f2@github.com> References: <2hoX2cMtO_T27H7LC2oJBIBHf8RC_2JPj7fB0JzvUmE=.83b2fdfa-97cd-4ed7-b1fc-6889d78f84f2@github.com> Message-ID: On Tue, 6 May 2025 23:02:59 GMT, Ioi Lam wrote: > We don't allow arbitrary user code to execute when we dump the heap (either java -Xshare:dump or java -XX:AOTMode=create). So the number of interned strings will be basically be limited to how many string literals you can fit into classfiles Oh! This resolves my concern, thanks. I thought we end up saving the intern table that application can fill up to the brim, but that is apparently not the case. 
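
For readers less familiar with the distinction being discussed, the following sketch shows why string literals end up in the intern table without any user code running, while strings built at runtime only get there through an explicit `intern()` call. The class name and the literal are illustrative only and have nothing to do with the AOT cache code itself.

```java
// Illustrative only: string literals are interned by the JVM, whereas a string
// constructed at runtime is a distinct object until intern() is called on it.
final class InternDemo {
    static boolean runtimeStringIsDistinctUntilInterned() {
        String literal = "aot-demo";                       // comes from the class file's constant pool
        String built = new String(new char[] {'a', 'o', 't', '-', 'd', 'e', 'm', 'o'});
        boolean distinctBefore = built != literal;         // different objects, same contents
        boolean sameAfter = built.intern() == literal;     // intern() returns the canonical instance
        return distinctBefore && sameAfter;
    }

    public static void main(String[] args) {
        System.out.println(runtimeStringIsDistinctUntilInterned()); // true
    }
}
```

Since arbitrary application code does not run during the dump, only the first kind of string is expected in the table at that point, which is the bound mentioned in the quoted reply.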
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25026#discussion_r2077566367 From stuefe at openjdk.org Wed May 7 13:47:15 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 7 May 2025 13:47:15 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v2] In-Reply-To: <8cudUmWHhJqYa1gdkFa_LprekP95yXEvWN34Xp3Y470=.8371776c-0f41-49fc-9ac6-38fdadce6aca@github.com> References: <8cudUmWHhJqYa1gdkFa_LprekP95yXEvWN34Xp3Y470=.8371776c-0f41-49fc-9ac6-38fdadce6aca@github.com> Message-ID: On Wed, 7 May 2025 11:59:34 GMT, Roman Kennke wrote: >> We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Remove unnecessary patterns Good. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25080#pullrequestreview-2821829111 From jsjolen at openjdk.org Wed May 7 13:51:28 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Wed, 7 May 2025 13:51:28 GMT Subject: RFR: 8356390: Rename ResolvedIndyEntry::set_flags to set_has_appendix Message-ID: The `set_flags` function really only sets whether it has an appendix or not, and there's a separate `set_resolution_failed` method just below that also alters the flag.
Just rename this to `set_has_appendix` ------------- Commit messages: - set_flags -> set_has_appendix Changes: https://git.openjdk.org/jdk/pull/25092/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25092&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356390 Stats: 7 lines in 1 file changed: 1 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/25092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25092/head:pull/25092 PR: https://git.openjdk.org/jdk/pull/25092 From kvn at openjdk.org Wed May 7 14:08:18 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 7 May 2025 14:08:18 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v8] In-Reply-To: References: Message-ID: On Wed, 7 May 2025 11:51:04 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. 
This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Do microseconds for timings I agree with using microseconds uniformly for this output. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24984#pullrequestreview-2821903863 From lucy at openjdk.org Wed May 7 14:24:25 2025 From: lucy at openjdk.org (Lutz Schmidt) Date: Wed, 7 May 2025 14:24:25 GMT Subject: RFR: 8350398: [s390x] Relativize initial_sp/monitors in interpreter frames In-Reply-To: References: Message-ID: On Thu, 20 Feb 2025 06:40:20 GMT, Amit Kumar wrote: > s390x port for [JDK-8315966](https://bugs.openjdk.org/browse/JDK-8315966). > > This PR depends on https://github.com/openjdk/jdk/pull/23660 because index calculation macro will be added by that PR, which is being used by this one. > > > // Frame slot index relative to fp > #define _z_ijava_idx(_component) \ > (_z_ijava_state_neg(_component) >> LogBytesPerWord) LGTM. ------------- Marked as reviewed by lucy (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23708#pullrequestreview-2821957849 From gziemski at openjdk.org Wed May 7 15:15:58 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Wed, 7 May 2025 15:15:58 GMT Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v11] In-Reply-To: References: Message-ID: <0JVuywneELFIxqS0W9y1QerS-kI99Unm2xjQ3WYh4dU=.ca2cd87d-1f94-4cce-a27e-195f525fde02@github.com> > Please review this addition of an internal benchmark, mostly of interest to those working with NMT. > > This benchmark allows us to record a pattern of memory allocation operations (i.e. 
`malloc`, `realloc` and `free`) as well as the virtual memory allocations (i.e. `VirtualMemoryTracker::add_reserved_region`, etc.) and record those into files. > > Later we can use that recording to _play back_ the pattern with different code or settings to compare the performance (i.e. memory usage as well as time). > > The goal of this benchmark is for anyone working on NMT to be able to measure and prove whether their improvement helps or regresses the performance. > > ### To use it: > > To record pattern of allocations of memory calls: > > `NMTRecordMemoryAllocations=0x7FFFFFFF ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -jar build/macosx-aarch64-server-release/images/jdk/demo/jfc/J2Ddemo/J2Ddemo.jar` > > OR to record pattern of allocations of virtual memory calls: > > `NMTRecordVirtualMemoryAllocations=0x7FFFFFFF ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -jar build/macosx-aarch64-server-release/images/jdk/demo/jfc/J2Ddemo/J2Ddemo.jar` > > This will result in the file: > - hs_nmt_pid22770_allocs_record.log (is the chronological record of the the desired operations) > OR > - hs_nmt_pid22770_virtual_allocs_record.log (is the chronological record of the desired operations) > > And 2 additional files: > - hs_nmt_pid22770_info_record.log (is the record of default NMT memory overhead and the NMT state) > - hs_nmt_pid22770_threads_record.log (is the record of thread names that can be retrieved later when processing) > > > then to actually run the benchmark: > > NMTBenchmarkRecordedPID=22770 ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary > > ### Usage: > > See the issue for more details and the design document. 
Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:

  fprintf -> tty->print, cleanup

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23786/files
  - new: https://git.openjdk.org/jdk/pull/23786/files/c98123fb..05e49984

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=10
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=09-10

  Stats: 73 lines in 3 files changed: 0 ins; 10 del; 63 mod
  Patch: https://git.openjdk.org/jdk/pull/23786.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23786/head:pull/23786

PR: https://git.openjdk.org/jdk/pull/23786

From gziemski at openjdk.org  Wed May  7 15:15:58 2025
From: gziemski at openjdk.org (Gerard Ziemski)
Date: Wed, 7 May 2025 15:15:58 GMT
Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v10]
In-Reply-To:
References:
Message-ID:

On Mon, 5 May 2025 13:03:59 GMT, Johan Sjölen wrote:

>> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   use permit_forbidden_function for realloc
>
> src/hotspot/share/nmt/mallocTracker.cpp line 170:
>
>> 168: // Record a malloc memory allocation
>> 169: void* MallocTracker::record_malloc(void* malloc_base, size_t size, MemTag mem_tag,
>> 170: const NativeCallStack& stack, void* old_base)
>
> Unused

We are using `old`, you can see it here:

    void NMT_MemoryLogRecorder::record_alloc(MemTag mem_tag, size_t requested, void* ptr, const NativeCallStack *stack, void* old) {
      NMT_MemoryLogRecorder *recorder = NMT_MemoryLogRecorder::instance();
      if (!recorder->done()) {
        address old_resolved_ptr = (address)old;
        if (old != nullptr) {
          if (MemTracker::enabled()) {
            old_resolved_ptr = (address)old - NMT_HEADER_SIZE;
          }
        }
        NMT_MemoryLogRecorder::_record(mem_tag, requested, (address)ptr, old_resolved_ptr, stack);
      }
    }

It's how we can tell malloc vs realloc.
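The malloc-versus-realloc distinction in the reply above can be shown in isolation. The following is only a sketch under stated assumptions: `kHeaderSize`, `LogEntry`, and `make_entry` are hypothetical names, and the fixed 16-byte header merely stands in for `NMT_HEADER_SIZE`, whose real value is defined by HotSpot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for NMT_HEADER_SIZE.
constexpr std::size_t kHeaderSize = 16;

enum class Op { Malloc, Realloc };

struct LogEntry {
  Op op;
  std::uintptr_t ptr;      // address of the new block
  std::uintptr_t old_ptr;  // 0 for malloc; resolved old address for realloc
  std::size_t size;
};

// A null `old` means a fresh allocation; a non-null `old` means realloc.
// When tracking is enabled the caller holds the inner (user-visible) pointer,
// so the header size is subtracted to recover the address the allocator
// originally returned.
LogEntry make_entry(void* ptr, void* old, std::size_t size, bool tracking_enabled) {
  std::uintptr_t old_resolved = 0;
  if (old != nullptr) {
    old_resolved = reinterpret_cast<std::uintptr_t>(old);
    if (tracking_enabled) {
      old_resolved -= kHeaderSize;
    }
  }
  return LogEntry{old == nullptr ? Op::Malloc : Op::Realloc,
                  reinterpret_cast<std::uintptr_t>(ptr), old_resolved, size};
}
```

Carrying the old address in the same record lets a later playback step pair each realloc with the block it resized, without needing a separate entry type.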
> src/hotspot/share/nmt/memLogRecorder.hpp line 185:
>
>> 183:
>> 184: class NMT_MemoryLogRecorder : public NMT_LogRecorder {
>> 185: public:
>
> TODOs?

There are a number of TODOs for the Windows implementation. I am hoping we can check this in without Windows support, then add that later?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2077887507
PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2077891580

From gziemski at openjdk.org  Wed May  7 15:23:20 2025
From: gziemski at openjdk.org (Gerard Ziemski)
Date: Wed, 7 May 2025 15:23:20 GMT
Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v10]
In-Reply-To:
References:
Message-ID:

On Mon, 5 May 2025 13:06:08 GMT, Johan Sjölen wrote:

>> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   use permit_forbidden_function for realloc
>
> src/hotspot/share/nmt/memLogRecorder.hpp line 174:
>
>> 172: #else // defined(LINUX) || defined(__APPLE__)
>> 173:
>> 174: class NMT_LogRecorder : public StackObj {
>
> What's the idea behind having two different subclasses for the log recorder? Like, why is it important that two different objects record the two sequences of events?

Sorry, I am not clear on what you are asking. We have:

1. NMT_MemoryLogRecorder
2. NMT_VirtualMemoryLogRecorder

With more coming (e.g. NMT_ArenasLogRecorder, see [NMT: add Arenas to NMTBenchmark](https://bugs.openjdk.org/browse/JDK-8353855)).

They share some common APIs, like init() and thread names, so it made sense to me to have both of them extend NMT_LogRecorder, which implements the common functionality.
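The layering described in the reply (one base class for what every recorder shares, one subclass per event stream) could look roughly like this. It is a sketch only: `LogRecorderBase`, `MemoryRecorder`, and `VirtualMemoryRecorder` are hypothetical names, and the shared state shown here (an entry limit and a thread-name table) is an assumption about what "common functionality" might cover.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical base class: holds what every recorder needs.
class LogRecorderBase {
 public:
  explicit LogRecorderBase(std::size_t limit) : _limit(limit) {}

  bool done() const { return _count >= _limit; }

  // Remember the first name seen for a thread id; later names are ignored.
  void remember_thread(long tid, const std::string& name) {
    _thread_names.emplace(tid, name);  // emplace keeps an existing entry
  }

  // Returns the remembered name, or an empty string if none is known.
  std::string thread_name(long tid) const {
    auto it = _thread_names.find(tid);
    return it == _thread_names.end() ? std::string() : it->second;
  }

 protected:
  // Returns true if the caller may record one more entry.
  bool claim_slot() {
    if (done()) return false;
    ++_count;
    return true;
  }

 private:
  std::size_t _limit;
  std::size_t _count = 0;
  std::unordered_map<long, std::string> _thread_names;
};

// One subclass per event stream; each adds only its own entry type.
class MemoryRecorder : public LogRecorderBase {
 public:
  using LogRecorderBase::LogRecorderBase;
  void record_alloc(std::size_t size) {
    if (claim_slot()) _sizes.push_back(size);
  }
  std::size_t recorded() const { return _sizes.size(); }

 private:
  std::vector<std::size_t> _sizes;
};

class VirtualMemoryRecorder : public LogRecorderBase {
 public:
  using LogRecorderBase::LogRecorderBase;
  void record_reserve(std::size_t base, std::size_t size) {
    if (claim_slot()) _regions.push_back(Region{base, size});
  }
  std::size_t recorded() const { return _regions.size(); }

 private:
  struct Region { std::size_t base; std::size_t size; };
  std::vector<Region> _regions;
};
```

With this shape, adding a third recorder (for arenas, say) means writing one more small subclass rather than duplicating the limit and thread-name handling.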
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2077908919

From gziemski at openjdk.org  Wed May  7 15:27:17 2025
From: gziemski at openjdk.org (Gerard Ziemski)
Date: Wed, 7 May 2025 15:27:17 GMT
Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v10]
In-Reply-To:
References:
Message-ID:

On Mon, 5 May 2025 13:06:37 GMT, Johan Sjölen wrote:

>> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   use permit_forbidden_function for realloc
>
> src/hotspot/share/nmt/memLogRecorder.hpp line 136:
>
>> 134: address stack[NMT_TrackingStackDepth];
>> 135: long int mem_tag;
>> 136: long int mem_tag_split;
>
> Use MemTag? Why `long int`?

I think I was more comfortable with explicit types. Since these fields belong to the struct that gets written to disk, I wanted to know exactly how they are going to be laid out on disk, to make it easier to parse the data while debugging. Can this stay as is?

> src/hotspot/share/nmt/memLogRecorder.hpp line 139:
>
>> 137: size_t size;
>> 138: size_t size_split;
>> 139: int type;
>
> Why isn't this a `Type`?

Same answer as above.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2077914949
PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2077915542

From lmesnik at openjdk.org  Wed May  7 15:45:25 2025
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Wed, 7 May 2025 15:45:25 GMT
Subject: Integrated: 8347004: vmTestbase/metaspace/shrink_grow/ShrinkGrowTest/ShrinkGrowTest.java fails with CDS disabled
In-Reply-To:
References:
Message-ID: <4zQ8ZudBToUnJymsj1nlCYji4tdh5BfOPIWNZHLew2o=.0f999186-d3d9-4e17-9b09-47d3e95dc9f7@github.com>

On Mon, 5 May 2025 17:51:13 GMT, Leonid Mesnik wrote:

> Test fails with OOME if CDS is disabled. It is not a regression, it is just rarely executed in this mode.
> The fix is just to slightly increase Metaspace.
> Verified that test now pass with CDS disabled + Xcomp. (It fails with Xcomp only) This pull request has now been integrated. Changeset: c8a30c2a Author: Leonid Mesnik URL: https://git.openjdk.org/jdk/commit/c8a30c2aaba04c11b70a4f74ee74452250be6e59 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8347004: vmTestbase/metaspace/shrink_grow/ShrinkGrowTest/ShrinkGrowTest.java fails with CDS disabled Reviewed-by: coleenp ------------- PR: https://git.openjdk.org/jdk/pull/25046 From gziemski at openjdk.org Wed May 7 15:51:34 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Wed, 7 May 2025 15:51:34 GMT Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v12] In-Reply-To: References: Message-ID: > Please review this addition of an internal benchmark, mostly of interest to those working with NMT. > > This benchmark allows us to record a pattern of memory allocation operations (i.e. `malloc`, `realloc` and `free`) as well as the virtual memory allocations (i.e. `VirtualMemoryTracker::add_reserved_region`, etc.) and record those into files. > > Later we can use that recording to _play back_ the pattern with different code or settings to compare the performance (i.e. memory usage as well as time). > > The goal of this benchmark is for anyone working on NMT to be able to measure and prove whether their improvement helps or regresses the performance. 
Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:

  rename Type -> MemoryOperation

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23786/files
  - new: https://git.openjdk.org/jdk/pull/23786/files/05e49984..710f5c61

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=11
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=10-11

  Stats: 34 lines in 2 files changed: 0 ins; 13 del; 21 mod
  Patch: https://git.openjdk.org/jdk/pull/23786.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23786/head:pull/23786

PR: https://git.openjdk.org/jdk/pull/23786

From gziemski at openjdk.org  Wed May  7 15:51:35 2025
From: gziemski at openjdk.org (Gerard Ziemski)
Date: Wed, 7 May 2025 15:51:35 GMT
Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v10]
In-Reply-To:
References:
Message-ID:

On Mon, 5 May 2025 13:07:58 GMT, Johan Sjölen wrote:

>> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   use permit_forbidden_function for realloc
>
> src/hotspot/share/nmt/memLogRecorder.hpp line 151:
>
>> 149: SPLIT_RESERVED,
>> 150: TAG
>> 151: };
>
> Better name than `Type`, like `MemoryOperation`? No need for `ALL_CAPS` names if you don't want to, you can use `ThisTypeOfName` instead. That's a style choice you get to make, though.

I like your suggestion, done. I renamed Type -> MemoryOperation, but I kind of like the constants CAPITALIZED.

> src/hotspot/share/runtime/os.cpp line 739:
>
>> 737: // After a successful realloc(3), we account the resized block with its new size
>> 738: // to NMT.
>> 739: void* const new_inner_ptr = MemTracker::record_malloc(new_outer_ptr, size, mem_tag, stack, memblock);
>
> Unused extra argument

We do use it:

```
static inline void* record_malloc(void* mem_base, size_t size, MemTag mem_tag,
                                  const NativeCallStack& stack, void* old_base = nullptr) {
  assert(mem_base != nullptr, "caller should handle null");
  void* ptr = mem_base;
  if (enabled()) {
    ptr = MallocTracker::record_malloc(mem_base, size, mem_tag, stack, old_base);
  }
  NMT_MemoryLogRecorder::record_alloc(mem_tag, size, mem_base, &stack, old_base);
  return ptr;
}
```

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2077958015
PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2077966039

From gziemski at openjdk.org  Wed May  7 16:10:44 2025
From: gziemski at openjdk.org (Gerard Ziemski)
Date: Wed, 7 May 2025 16:10:44 GMT
Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v13]
In-Reply-To:
References:
Message-ID:

> Please review this addition of an internal benchmark, mostly of interest to those working with NMT.
>
> This benchmark allows us to record a pattern of memory allocation operations (i.e. `malloc`, `realloc` and `free`) as well as the virtual memory allocations (i.e. `VirtualMemoryTracker::add_reserved_region`, etc.) and record those into files.
>
> Later we can use that recording to _play back_ the pattern with different code or settings to compare the performance (i.e. memory usage as well as time).
>
> The goal of this benchmark is for anyone working on NMT to be able to measure and prove whether their improvement helps or regresses the performance.
Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:

  fix Win build break

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23786/files
  - new: https://git.openjdk.org/jdk/pull/23786/files/710f5c61..0c79addc

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=12
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=11-12

  Stats: 9 lines in 2 files changed: 7 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/23786.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23786/head:pull/23786

PR: https://git.openjdk.org/jdk/pull/23786

From gziemski at openjdk.org  Wed May  7 16:10:45 2025
From: gziemski at openjdk.org (Gerard Ziemski)
Date: Wed, 7 May 2025 16:10:45 GMT
Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v10]
In-Reply-To:
References:
Message-ID:

On Mon, 5 May 2025 13:11:57 GMT, Johan Sjölen wrote:

>> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   use permit_forbidden_function for realloc
>
> src/hotspot/share/nmt/memLogRecorder.cpp line 155:
>
>> 153: // TODO: NMT_LogRecorder::thread_name
>> 154: #endif
>> 155: }
>
> `Thread::current()->name()`

`Thread::current()->name()` has different semantics. It returns temporary names for partially constructed threads, like `` and `Unknown thread`, and since we save the thread name the first time we see it, whatever name we get at that point is the one we are stuck with forever. On the other hand, when `pthread_getname_np()` returns a string, we get a real, final value.

To go with `Thread::current()->name()` I need to check for those special temporary names and skip them, but I agree, it is worth using a platform-independent API here. I will make the change.
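The check mentioned in the reply (skip temporary names, keep the first real one) can be sketched as follows. `ThreadNameCache` and `is_placeholder_name` are hypothetical names, and the placeholder set is an assumption: only `Unknown thread` is taken from the message above, and treating an empty name as a placeholder is a guess made for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Assumption: the set of temporary names. Only "Unknown thread" comes from
// the message above; the empty-name case is an illustrative guess.
inline bool is_placeholder_name(const std::string& name) {
  return name.empty() || name == "Unknown thread";
}

class ThreadNameCache {
 public:
  // Stores the name the first time a non-placeholder value is seen.
  // Later calls for the same thread id leave the stored name untouched.
  void observe(long tid, const std::string& current_name) {
    if (is_placeholder_name(current_name)) return;
    _names.emplace(tid, current_name);  // emplace keeps an existing entry
  }

  // Returns the cached name, or an empty string if none has been stored yet.
  std::string lookup(long tid) const {
    auto it = _names.find(tid);
    return it == _names.end() ? std::string() : it->second;
  }

 private:
  std::unordered_map<long, std::string> _names;
};
```

The point of the design is that a placeholder observed early never blocks the real name from being cached later, which is the "stuck with it forever" problem described above.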
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2077998763

From duke at openjdk.org  Wed May  7 16:43:21 2025
From: duke at openjdk.org (Lutz Schmidt)
Date: Wed, 7 May 2025 16:43:21 GMT
Subject: RFR: 8350398: [s390x] Relativize initial_sp/monitors in interpreter frames
In-Reply-To:
References:
Message-ID:

On Thu, 20 Feb 2025 06:40:20 GMT, Amit Kumar wrote:

> s390x port for [JDK-8315966](https://bugs.openjdk.org/browse/JDK-8315966).
>
> This PR depends on https://github.com/openjdk/jdk/pull/23660 because index calculation macro will be added by that PR, which is being used by this one.
>
>
> // Frame slot index relative to fp
> #define _z_ijava_idx(_component) \
> (_z_ijava_state_neg(_component) >> LogBytesPerWord)

LGTM.

-------------

Marked as reviewed by MainframeLucy at github.com (no known OpenJDK username).

PR Review: https://git.openjdk.org/jdk/pull/23708#pullrequestreview-2821764629

From duke at openjdk.org  Wed May  7 16:43:23 2025
From: duke at openjdk.org (Lutz Schmidt)
Date: Wed, 7 May 2025 16:43:23 GMT
Subject: RFR: 8350398: [s390x] Relativize initial_sp/monitors in interpreter frames
In-Reply-To:
References:
Message-ID:

On Mon, 5 May 2025 04:47:56 GMT, Amit Kumar wrote:

>> s390x port for [JDK-8315966](https://bugs.openjdk.org/browse/JDK-8315966).
>>
>> This PR depends on https://github.com/openjdk/jdk/pull/23660 because index calculation macro will be added by that PR, which is being used by this one.
>>
>>
>> // Frame slot index relative to fp
>> #define _z_ijava_idx(_component) \
>> (_z_ijava_state_neg(_component) >> LogBytesPerWord)
>
> src/hotspot/cpu/s390/interp_masm_s390.cpp line 689:
>
>> 687: stop("Z_fp is corrupted");
>> 688: bind(ok);
>> 689: #endif // ASSERT
>
> Hi @RealLucy,
> should I remove or keep this check here? It wouldn't do any harm in a release build anyway, but for a debug build the size of the `BTB` table will increase. Look at the `TemplateTable_s390.cpp` changes.

Well,
it's a tradeoff between space consumption and code safety. I personally would leave the check in as long as we do not run into space limitations. If you would like to keep the code more compact, and remove the #ASSERT blocks, that's ok for me too.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23708#discussion_r2077636409

From lucy at openjdk.org  Wed May  7 16:43:24 2025
From: lucy at openjdk.org (Lutz Schmidt)
Date: Wed, 7 May 2025 16:43:24 GMT
Subject: RFR: 8350398: [s390x] Relativize initial_sp/monitors in interpreter frames
In-Reply-To:
References:
Message-ID:

On Wed, 7 May 2025 13:29:05 GMT, Lutz Schmidt wrote:

>> src/hotspot/cpu/s390/interp_masm_s390.cpp line 689:
>>
>>> 687: stop("Z_fp is corrupted");
>>> 688: bind(ok);
>>> 689: #endif // ASSERT
>>
>> Hi @RealLucy,
>> should I remove or keep this check here? It wouldn't do any harm in a release build anyway, but for a debug build the size of the `BTB` table will increase. Look at the `TemplateTable_s390.cpp` changes.
>
> Well,
> it's a tradeoff between space consumption and code safety. I personally would leave the check in as long as we do not run into space limitations.
If you would like to keep the code more compact, and remove the #ASSERT blocks, that's ok for me too. > > Sorry, > the above comment was made while I was signed in with my alter ego. Bottom line was: I would keep the code, but if you don't like the code bloat, remove. I'd also say keep it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23708#discussion_r2078075331 From cjplummer at openjdk.org Wed May 7 17:03:22 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 7 May 2025 17:03:22 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v14] In-Reply-To: References: Message-ID: On Tue, 6 May 2025 21:50:34 GMT, Igor Veresov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments SA changes look good. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24886#pullrequestreview-2822536705 From coleenp at openjdk.org Wed May 7 17:22:16 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 7 May 2025 17:22:16 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v2] In-Reply-To: <8cudUmWHhJqYa1gdkFa_LprekP95yXEvWN34Xp3Y470=.8371776c-0f41-49fc-9ac6-38fdadce6aca@github.com> References: <8cudUmWHhJqYa1gdkFa_LprekP95yXEvWN34Xp3Y470=.8371776c-0f41-49fc-9ac6-38fdadce6aca@github.com> Message-ID: On Wed, 7 May 2025 11:59:34 GMT, Roman Kennke wrote: >> We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Remove unnecessary patterns I wonder if this could say compact obj headers instead of compressed class pointers? ------------- PR Review: https://git.openjdk.org/jdk/pull/25080#pullrequestreview-2822590169 From gziemski at openjdk.org Wed May 7 17:29:03 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Wed, 7 May 2025 17:29:03 GMT Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v14] In-Reply-To: References: Message-ID: > Please review this addition of an internal benchmark, mostly of interest to those working with NMT. > > This benchmark allows us to record a pattern of memory allocation operations (i.e. `malloc`, `realloc` and `free`) as well as the virtual memory allocations (i.e. `VirtualMemoryTracker::add_reserved_region`, etc.) and record those into files. > > Later we can use that recording to _play back_ the pattern with different code or settings to compare the performance (i.e. memory usage as well as time). > > The goal of this benchmark is for anyone working on NMT to be able to measure and prove whether their improvement helps or regresses the performance. 
Gerard Ziemski has updated the pull request incrementally with two additional commits since the last revision: - fix Win build break - use Thread::current()->name() to retrieve thread's name, instead of pthread_getname_np ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23786/files - new: https://git.openjdk.org/jdk/pull/23786/files/0c79addc..ab0f96e4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=12-13 Stats: 12 lines in 2 files changed: 4 ins; 3 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/23786.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23786/head:pull/23786 PR: https://git.openjdk.org/jdk/pull/23786 From gziemski at openjdk.org Wed May 7 17:34:09 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Wed, 7 May 2025 17:34:09 GMT Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v15] In-Reply-To: References: Message-ID: > Please review this addition of an internal benchmark, mostly of interest to those working with NMT. > > This benchmark allows us to record a pattern of memory allocation operations (i.e. `malloc`, `realloc` and `free`) as well as the virtual memory allocations (i.e. `VirtualMemoryTracker::add_reserved_region`, etc.) and record those into files. > > Later we can use that recording to _play back_ the pattern with different code or settings to compare the performance (i.e. memory usage as well as time). > > The goal of this benchmark is for anyone working on NMT to be able to measure and prove whether their improvement helps or regresses the performance. 
Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:

  Johan feedback, remove unused parameter

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23786/files
  - new: https://git.openjdk.org/jdk/pull/23786/files/ab0f96e4..b1596cdf

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=14
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=13-14

  Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/23786.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23786/head:pull/23786

PR: https://git.openjdk.org/jdk/pull/23786

From gziemski at openjdk.org  Wed May  7 17:34:10 2025
From: gziemski at openjdk.org (Gerard Ziemski)
Date: Wed, 7 May 2025 17:34:10 GMT
Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v10]
In-Reply-To:
References:
Message-ID:

On Mon, 5 May 2025 13:04:05 GMT, Johan Sjölen wrote:

>> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   use permit_forbidden_function for realloc
>
> src/hotspot/share/nmt/mallocTracker.hpp line 284:
>
>> 282: // Record malloc on specified memory block
>> 283: static void* record_malloc(void* malloc_base, size_t size, MemTag mem_tag,
>> 284: const NativeCallStack& stack, void* old_base = nullptr);
>
> Unused

Thanks, good catch, fixed.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2078134672 From zgu at openjdk.org Wed May 7 17:34:27 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 7 May 2025 17:34:27 GMT Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47] In-Reply-To: References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> <_Ex-_ZtwkvwTQWEL-0XRvcHpgahuPAFg8qL8txW6Mfo=.eb1b5906-8007-4142-adb4-f0f90a2c25cb@github.com> <-xKnbNHWbQHXX3G3rmQLzEF8UaRYiTK0kfj0G5HYIlI=.e2db70a8-128e-426f-80b3-8535c4c32767@github.com> Message-ID: On Wed, 7 May 2025 09:24:50 GMT, Johannes Bechberger wrote: >> You mean the stack? The trace queue is used in every signal handler to obtain a new trace. >> But with the stack I have the problem as explained before: I would need to store the elements somewhere and therefore need a wrapper class that handles this. This offers not much benefit in my opinion. > > @apangin could you chime in? > You mean the stack? The trace queue is used in every signal handler to obtain a new trace. But with the stack I have the problem as explained before: I would need to store the elements somewhere and therefore need a wrapper class that handles this. This offers not much benefit in my opinion. Can you explain why it needs to be multi-thread safe inside a signal handler? What do I miss? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2078136469 From ccheung at openjdk.org Wed May 7 18:30:59 2025 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 7 May 2025 18:30:59 GMT Subject: RFR: 8356125: Interned strings are omitted from AOT cache [v2] In-Reply-To: <_eowrULNjIHYoi4xiTHyowVM-iuxJBIaBNqVLJaXRJI=.2f407b58-8076-4c6f-8fd2-c797266f1ddb@github.com> References: <_eowrULNjIHYoi4xiTHyowVM-iuxJBIaBNqVLJaXRJI=.2f407b58-8076-4c6f-8fd2-c797266f1ddb@github.com> Message-ID: On Tue, 6 May 2025 23:11:58 GMT, Ioi Lam wrote: >> When dumping the interned string table in the AOT cache, we try to include only the strings that are inside ConstantPool::reference_array(). The hope is to limit the size of the AOT cache by omitting interned strings that are not used by objects inside the AOT cache. >> >> However, we have found two cases when the above scheme doesn't work. Please see the new test cases. >> >> The fix is to always include all interned strings managed by stringTable.cpp. We might try to omit the truly unused strings in a separate RFE. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Improved test case Looks good. Spotted two minor issues in a test case. test/hotspot/jtreg/runtime/cds/appcds/aotClassLinking/GeneratedInternedString.java line 39: > 37: */ > 38: > 39: import java.lang.invoke.MethodType; This seems unused. test/hotspot/jtreg/runtime/cds/appcds/aotClassLinking/GeneratedInternedString.java line 41: > 39: import java.lang.invoke.MethodType; > 40: import jdk.test.lib.cds.SimpleCDSAppTester; > 41: import jdk.test.lib.helpers.ClassFileInstaller; This seems unused. ------------- Marked as reviewed by ccheung (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25026#pullrequestreview-2822769475
PR Review Comment: https://git.openjdk.org/jdk/pull/25026#discussion_r2078223927
PR Review Comment: https://git.openjdk.org/jdk/pull/25026#discussion_r2078224483

From kbarrett at openjdk.org  Wed May  7 18:33:08 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Wed, 7 May 2025 18:33:08 GMT
Subject: RFR: 8352565: Add native method implementation of Reference.get() [v5]
In-Reply-To:
References:
Message-ID:

> Please review this change which adds a native method providing the
> implementation of Reference::get. Reference::get is an intrinsic candidate, so
> this native method implementation is only used when the intrinsic is not.
>
> Currently there is intrinsic support by the interpreter, C1, C2, and Graal,
> which are always used. With this change we can later remove all the
> per-platform interpreter intrinsic implementations, and might also remove the
> C1 intrinsic implementation.
>
> Testing:
> (1) mach5 tier1-6 normal (so using all the existing intrinsics).
> (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled.

Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains nine additional commits since the last revision: - use new waitForRefProc, some tidying - Merge branch 'master' into native-reference-get - remove timeout by using waitForReferenceProcessing - make ill-timed gc in non-concurrent case less likely - fix test package use - add package decl to test - parameterized return type of native get0 - test native method - native Reference.get helper ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24315/files - new: https://git.openjdk.org/jdk/pull/24315/files/234465f4..48b7960c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24315&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24315&range=03-04 Stats: 334519 lines in 3461 files changed: 113842 ins; 207130 del; 13547 mod Patch: https://git.openjdk.org/jdk/pull/24315.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24315/head:pull/24315 PR: https://git.openjdk.org/jdk/pull/24315 From coleenp at openjdk.org Wed May 7 19:02:58 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 7 May 2025 19:02:58 GMT Subject: RFR: 8356173: Remove ThreadCritical Message-ID: Updated the description in the bug. This removes the last use of ThreadCritical and replaces it with a global PlatformMutex lock. Tested with tier1-4, and tier1 on all Oracle-supported OSs. ------------- Commit messages: - Rename ChunkPoolLocker - Fix comments - Add an initialization call. - Remove includes - Remove ThreadCritical - Remove ThreadCritical from NMT. 
Changes: https://git.openjdk.org/jdk/pull/25072/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25072&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356173 Stats: 266 lines in 23 files changed: 32 ins; 224 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/25072.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25072/head:pull/25072 PR: https://git.openjdk.org/jdk/pull/25072 From rkennke at openjdk.org Wed May 7 19:04:51 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 7 May 2025 19:04:51 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v3] In-Reply-To: References: Message-ID: > We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Replace 'compressed class ptrs' with 'compact obj headers' when running with +UCOH ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25080/files - new: https://git.openjdk.org/jdk/pull/25080/files/ff2d421f..3252031c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25080&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25080&range=01-02 Stats: 45 lines in 3 files changed: 37 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/25080.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25080/head:pull/25080 PR: https://git.openjdk.org/jdk/pull/25080 From rkennke at openjdk.org Wed May 7 19:04:52 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 7 May 2025 19:04:52 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v2] In-Reply-To: References: <8cudUmWHhJqYa1gdkFa_LprekP95yXEvWN34Xp3Y470=.8371776c-0f41-49fc-9ac6-38fdadce6aca@github.com> Message-ID: On Wed, 7 May 2025 17:19:51 GMT, Coleen Phillimore wrote: > I wonder if this could say compact obj headers instead of compressed class pointers? 
Good idea! Done. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25080#issuecomment-2859907078 From coleenp at openjdk.org Wed May 7 19:21:53 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 7 May 2025 19:21:53 GMT Subject: RFR: 8352075: Perf regression accessing fields [v4] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> Message-ID: On Mon, 5 May 2025 18:36:43 GMT, Radim Vansa wrote: > If methods are already sorted alphabetically, it would make sense for fields, too. Yeah, you'd want to use the same mechanism. So each InstanceKlass will have an additional 64 bit pointer, and this mapping array if JVMTI is on. For methods, there's a capability that you can test `can_maintain_original_method_order` but I don't see the same capability for fields. I assume it's because fields have never previously been reordered (?) from declaration order. Maybe the compression is still a win in footprint size with this extra pointer and mapping array. It would be a savings with your class with your huge number of fields. The other reason to compress the field stream was to avoid null bytes for fields where the attributes didn't apply (init, generic signature, etc). Compressing the fields into unsigned5 and decoding them into streams was quite a complicated change but manageable because the interface to decode them is all one has write the FieldStream iterator. This is hard to review. I'm wondering how much of a problem this is in real code, other than the case with 21k fields and if there's a way to programmatically work around this case, like decompress the fields into a hashtable or something (?) It would be interesting to see some histograms of some corpus Java code (maybe put this info in the associated bug). 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2859962231 From coleenp at openjdk.org Wed May 7 19:29:52 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 7 May 2025 19:29:52 GMT Subject: RFR: 8352075: Perf regression accessing fields [v4] In-Reply-To: <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> Message-ID: <3V_WQOwuNCXdGsueCdoHb70KNE8HdkspVDPDh4rRkAQ=.13bbd49b-26ff-4e87-ab30-80650e311a20@github.com> On Mon, 5 May 2025 06:51:31 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Move constant to static final var Fred tells me that we already store the original field index so maybe above is moot.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2859987801 From jbechberger at openjdk.org Wed May 7 20:51:02 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 7 May 2025 20:51:02 GMT Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47] In-Reply-To: References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> <_Ex-_ZtwkvwTQWEL-0XRvcHpgahuPAFg8qL8txW6Mfo=.eb1b5906-8007-4142-adb4-f0f90a2c25cb@github.com> <-xKnbNHWbQHXX3G3rmQLzEF8UaRYiTK0kfj0G5HYIlI=.e2db70a8-128e-426f-80b3-8535c4c32767@github.com> Message-ID: On Wed, 7 May 2025 17:31:07 GMT, Zhengyu Gu wrote: >> @apangin could you chime in? > >> You mean the stack? The trace queue is used in every signal handler to obtain a new trace. But with the stack I have the problem as explained before: I would need to store the elements somewhere and therefore need a wrapper class that handles this. This offers not much benefit in my opinion. > > Can you explain why it needs to be multi-thread safe inside a signal handler? What do I miss? It's used in the signal handlers of multiple threads (per thread signals) and at safepoints of all threads in parallel. Safepoints per thread can happen in parallel. Hope this clarifies it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2078469843 From lmesnik at openjdk.org Wed May 7 21:07:55 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 7 May 2025 21:07:55 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v3] In-Reply-To: References: Message-ID: On Wed, 7 May 2025 19:04:51 GMT, Roman Kennke wrote: >> We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. 
> > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Replace 'compressed class ptrs' with 'compact obj headers' when running with +UCOH Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25080#pullrequestreview-2823206342 From zgu at openjdk.org Wed May 7 21:32:50 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 7 May 2025 21:32:50 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v3] In-Reply-To: References: Message-ID: On Wed, 7 May 2025 19:04:51 GMT, Roman Kennke wrote: >> We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Replace 'compressed class ptrs' with 'compact obj headers' when running with +UCOH LGTM ------------- Marked as reviewed by zgu (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25080#pullrequestreview-2823258335 From zgu at openjdk.org Wed May 7 21:34:03 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Wed, 7 May 2025 21:34:03 GMT Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47] In-Reply-To: References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> <_Ex-_ZtwkvwTQWEL-0XRvcHpgahuPAFg8qL8txW6Mfo=.eb1b5906-8007-4142-adb4-f0f90a2c25cb@github.com> <-xKnbNHWbQHXX3G3rmQLzEF8UaRYiTK0kfj0G5HYIlI=.e2db70a8-128e-426f-80b3-8535c4c32767@github.com> Message-ID: On Wed, 7 May 2025 20:47:55 GMT, Johannes Bechberger wrote: >>> You mean the stack? The trace queue is used in every signal handler to obtain a new trace. But with the stack I have the problem as explained before: I would need to store the elements somewhere and therefore need a wrapper class that handles this. This offers not much benefit in my opinion. 
>> >> Can you explain why it needs to be multi-thread safe inside a signal handler? What do I miss? > > It's used in the signal handlers of multiple threads (per thread signals) and at safepoints of all threads in parallel. > > Safepoints per thread can happen in parallel. Hope this clarifies it. Thank you for the explanation. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2078524696 From apangin at openjdk.org Wed May 7 22:54:02 2025 From: apangin at openjdk.org (Andrei Pangin) Date: Wed, 7 May 2025 22:54:02 GMT Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47] In-Reply-To: References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> <_Ex-_ZtwkvwTQWEL-0XRvcHpgahuPAFg8qL8txW6Mfo=.eb1b5906-8007-4142-adb4-f0f90a2c25cb@github.com> <-xKnbNHWbQHXX3G3rmQLzEF8UaRYiTK0kfj0G5HYIlI=.e2db70a8-128e-426f-80b3-8535c4c32767@github.com> Message-ID: On Wed, 7 May 2025 21:31:22 GMT, Zhengyu Gu wrote: >> It's used in the signal handlers of multiple threads (per thread signals) and at safepoints of all threads in parallel. >> >> Safepoints per thread can happen in parallel. Hope this clarifies it. > > Thank you for the explanation. @zhengyu123 I don't see how a race you've described can happen. T2 will not reach `cmpxchg` to claim position 0, since an earlier check `if (state == state_empty(tail))` will fail in this case. Note that `state_empty(tail)` will be true for `tail == 1` but not for `tail == 3`. Or am I missing something? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2078597131 From zgu at openjdk.org Thu May 8 01:28:00 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Thu, 8 May 2025 01:28:00 GMT Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47] In-Reply-To: References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> <_Ex-_ZtwkvwTQWEL-0XRvcHpgahuPAFg8qL8txW6Mfo=.eb1b5906-8007-4142-adb4-f0f90a2c25cb@github.com> <-xKnbNHWbQHXX3G3rmQLzEF8UaRYiTK0kfj0G5HYIlI=.e2db70a8-128e-426f-80b3-8535c4c32767@github.com> Message-ID: <7mXUuMJnifcxkd2ZB1Sc6XWkmlOSJ-84Wp-GaMduvao=.b8d77584-f745-4c0c-bcfc-fae513388664@github.com> On Wed, 7 May 2025 21:31:22 GMT, Zhengyu Gu wrote: >> It's used in the signal handlers of multiple threads (per thread signals) and at safepoints of all threads in parallel. >> >> Safepoints per thread can happen in parallel. Hope this clarifies it. > > Thank you for the explanation. > @zhengyu123 I don't see how a race you've described can happen. T2 will not reach `cmpxchg` to claim position 0, since an earlier check `if (state == state_empty(tail))` will fail in this case. Note that `state_empty(tail)` will be true for `tail == 1` but not for `tail == 3`. Or am I missing something? T1: can be scheduled out of CPU right after CAS, before ` Atomic::release_store(&e->_state, state_full(tail));`, so position 0 is still empty. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2078744427 From apangin at openjdk.org Thu May 8 01:40:01 2025 From: apangin at openjdk.org (Andrei Pangin) Date: Thu, 8 May 2025 01:40:01 GMT Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47] In-Reply-To: <7mXUuMJnifcxkd2ZB1Sc6XWkmlOSJ-84Wp-GaMduvao=.b8d77584-f745-4c0c-bcfc-fae513388664@github.com> References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> <_Ex-_ZtwkvwTQWEL-0XRvcHpgahuPAFg8qL8txW6Mfo=.eb1b5906-8007-4142-adb4-f0f90a2c25cb@github.com> <-xKnbNHWbQHXX3G3rmQLzEF8UaRYiTK0kfj0G5HYIlI=.e2db70a8-128e-426f-80b3-8535c4c32767@github.com> <7mXUuMJnifcxkd2ZB1Sc6XWkmlOSJ-84Wp-GaMduvao=.b8d77584-f745-4c0c-bcfc-fae513388664@github.com> Message-ID: On Thu, 8 May 2025 01:25:16 GMT, Zhengyu Gu wrote: >> Thank you for the explanation. > >> @zhengyu123 I don't see how a race you've described can happen. T2 will not reach `cmpxchg` to claim position 0, since an earlier check `if (state == state_empty(tail))` will fail in this case. Note that `state_empty(tail)` will be true for `tail == 1` but not for `tail == 3`. Or am I missing something? > > T1: can be scheduled out of CPU right after CAS, before ` Atomic::release_store(&e->_state, state_full(tail));`, so position 0 is still empty. Note that "is-empty" is not a binary flag in this algorithm. Empty state is assigned to a specific position of tail. Position 0 is marked empty for tail == 0, but not for tail == 2. 
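To make the point about per-position empty states concrete, here is a minimal, self-contained sketch of a bounded queue whose slot state is tied to a specific tail position. It is not the code from the PR: it uses `std::atomic` rather than HotSpot's `Atomic`, and the class name `BoundedQueueSketch`, the `int` payload, and the encoding `state_empty(pos) == 2 * pos` / `state_full(pos) == 2 * pos + 1` are assumptions made for illustration. Only the names `state_empty` and `state_full` come from the discussion above.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Illustrative sketch only; names and encoding are hypothetical, not the PR's.
template <std::size_t N>
class BoundedQueueSketch {
  struct Slot {
    std::atomic<std::size_t> state{0};
    int value = 0;
  };

  Slot _slots[N];
  std::atomic<std::size_t> _head{0};
  std::atomic<std::size_t> _tail{0};

 public:
  // "Empty" and "full" are not binary flags: each value names one position.
  static std::size_t state_empty(std::size_t pos) { return 2 * pos; }
  static std::size_t state_full(std::size_t pos) { return 2 * pos + 1; }

  BoundedQueueSketch() {
    // Slot i is initially empty for position i only.
    for (std::size_t i = 0; i < N; i++) {
      _slots[i].state.store(state_empty(i), std::memory_order_relaxed);
    }
  }

  bool enqueue(int value) {
    std::size_t tail = _tail.load(std::memory_order_relaxed);
    for (;;) {
      Slot& e = _slots[tail % N];
      std::size_t state = e.state.load(std::memory_order_acquire);
      if (state == state_empty(tail)) {
        // The slot is empty for exactly this tail, so try to claim it.
        if (_tail.compare_exchange_weak(tail, tail + 1,
                                        std::memory_order_relaxed)) {
          e.value = value;
          e.state.store(state_full(tail), std::memory_order_release);
          return true;
        }
        // CAS failure reloaded `tail`; loop and re-check the new slot.
      } else if (state < state_empty(tail)) {
        // Slot still belongs to an earlier lap (unconsumed, or claimed but
        // not yet published): treat the queue as full, never claim it.
        return false;
      } else {
        // Our view of the tail is stale.
        tail = _tail.load(std::memory_order_relaxed);
      }
    }
  }

  bool dequeue(int& out) {
    std::size_t head = _head.load(std::memory_order_relaxed);
    for (;;) {
      Slot& e = _slots[head % N];
      std::size_t state = e.state.load(std::memory_order_acquire);
      if (state == state_full(head)) {
        if (_head.compare_exchange_weak(head, head + 1,
                                        std::memory_order_relaxed)) {
          out = e.value;
          // Mark the slot empty for the next lap's position.
          e.state.store(state_empty(head + N), std::memory_order_release);
          return true;
        }
      } else if (state < state_full(head)) {
        return false;  // nothing published at this position yet
      } else {
        head = _head.load(std::memory_order_relaxed);
      }
    }
  }
};

// Small single-threaded walk-through with capacity 2.
inline void queue_sketch_demo() {
  BoundedQueueSketch<2> q;
  assert(q.enqueue(10));
  assert(q.enqueue(11));
  assert(!q.enqueue(12));  // slot 0 is not empty for tail == 2
}
```

With capacity 2 in this sketch, a producer that has won the CAS for tail 0 but has not yet stored `state_full(0)` leaves slot 0 at `state_empty(0)`. A second producer arriving with tail == 2 compares against `state_empty(2)`, fails the check, and never reaches the CAS for that slot.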
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2078752121 From zgu at openjdk.org Thu May 8 02:27:00 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Thu, 8 May 2025 02:27:00 GMT Subject: RFR: 8337789: JEP 509: JFR CPU-Time Profiling (Experimental) [v47] In-Reply-To: References: <7JuiN0s-YCYHi13J5Pw5qEw19RwCMzXVEYl-NG0TXY0=.028dc1db-295b-4e6e-a50c-82ae708b8c6d@github.com> <_Ex-_ZtwkvwTQWEL-0XRvcHpgahuPAFg8qL8txW6Mfo=.eb1b5906-8007-4142-adb4-f0f90a2c25cb@github.com> <-xKnbNHWbQHXX3G3rmQLzEF8UaRYiTK0kfj0G5HYIlI=.e2db70a8-128e-426f-80b3-8535c4c32767@github.com> <7mXUuMJnifcxkd2ZB1Sc6XWkmlOSJ-84Wp-GaMduvao=.b8d77584-f745-4c0c-bcfc-fae513388664@github.com> Message-ID: On Thu, 8 May 2025 01:37:27 GMT, Andrei Pangin wrote: >>> @zhengyu123 I don't see how a race you've described can happen. T2 will not reach `cmpxchg` to claim position 0, since an earlier check `if (state == state_empty(tail))` will fail in this case. Note that `state_empty(tail)` will be true for `tail == 1` but not for `tail == 3`. Or am I missing something? >> >> T1: can be scheduled out of CPU right after CAS, before ` Atomic::release_store(&e->_state, state_full(tail));`, so position 0 is still empty. > > Note that "is-empty" is not a binary flag in this algorithm. Empty state is assigned to a specific position of tail. > Position 0 is marked empty for tail == 0, but not for tail == 2. Ah, I missed generation encoding, my bad! sorry for the noise. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2078781938 From iklam at openjdk.org Thu May 8 04:17:37 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 8 May 2025 04:17:37 GMT Subject: RFR: 8356125: Interned strings are omitted from AOT cache [v3] In-Reply-To: References: Message-ID: > When dumping the interned string table in the AOT cache, we try to include only the strings that are inside ConstantPool::reference_array(). 
The hope is to limit the size of the AOT cache by omitting interned strings that are not used by objects inside the AOT cache. > > However, we have found two cases when the above scheme doesn't work. Please see the new test cases. > > The fix is to always include all interned strings managed by stringTable.cpp. We might try to omit the truly unused strings in a separate RFE. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: - Merge branch 'master' into 8356125-interned-string-omitted-from-aot-cache - @calvinccheung comments - Improved test case - fixed whitespaces - Fixed obsolete comment - Do not change the order of FinalImageRecipes::apply_recipe yet .. fix this in a separate bug - Step 2: archive all strings in StringTable - Step 1: Fixing NonFinalStaticWithInitVal_Helper ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25026/files - new: https://git.openjdk.org/jdk/pull/25026/files/c238aeaf..4a345cf0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25026&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25026&range=01-02 Stats: 14331 lines in 435 files changed: 8510 ins; 3396 del; 2425 mod Patch: https://git.openjdk.org/jdk/pull/25026.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25026/head:pull/25026 PR: https://git.openjdk.org/jdk/pull/25026 From iklam at openjdk.org Thu May 8 04:17:37 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 8 May 2025 04:17:37 GMT Subject: RFR: 8356125: Interned strings are omitted from AOT cache [v2] In-Reply-To: References: <_eowrULNjIHYoi4xiTHyowVM-iuxJBIaBNqVLJaXRJI=.2f407b58-8076-4c6f-8fd2-c797266f1ddb@github.com> Message-ID: On Wed, 7 May 2025 18:12:49 GMT, Calvin Cheung wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: 
>> >> Improved test case > > test/hotspot/jtreg/runtime/cds/appcds/aotClassLinking/GeneratedInternedString.java line 39: > >> 37: */ >> 38: >> 39: import java.lang.invoke.MethodType; > > This seems unused. Fixed. > test/hotspot/jtreg/runtime/cds/appcds/aotClassLinking/GeneratedInternedString.java line 41: > >> 39: import java.lang.invoke.MethodType; >> 40: import jdk.test.lib.cds.SimpleCDSAppTester; >> 41: import jdk.test.lib.helpers.ClassFileInstaller; > > This seems unused. Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25026#discussion_r2078852475 PR Review Comment: https://git.openjdk.org/jdk/pull/25026#discussion_r2078852456 From amitkumar at openjdk.org Thu May 8 05:04:51 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Thu, 8 May 2025 05:04:51 GMT Subject: RFR: 8350398: [s390x] Relativize initial_sp/monitors in interpreter frames In-Reply-To: References: Message-ID: On Thu, 20 Feb 2025 06:40:20 GMT, Amit Kumar wrote: > s390x port for [JDK-8315966](https://bugs.openjdk.org/browse/JDK-8315966). > > This PR depends on https://github.com/openjdk/jdk/pull/23660 because index calculation macro will be added by that PR, which is being used by this one. > > > // Frame slot index relative to fp > #define _z_ijava_idx(_component) \ > (_z_ijava_state_neg(_component) >> LogBytesPerWord) Technically I got two approvals, but still: @reinrich, this is 2nd to last. Will you be able to have a quick look at this one?
Thanks, ------------- PR Comment: https://git.openjdk.org/jdk/pull/23708#issuecomment-2861781440 From duke at openjdk.org Thu May 8 06:52:00 2025 From: duke at openjdk.org (Matthias Frei) Date: Thu, 8 May 2025 06:52:00 GMT Subject: RFR: 8349988: Change cgroup version detection logic to not depend on /proc/cgroups [v4] In-Reply-To: References: Message-ID: <7nP1d4RbXH3hFtS6pQbuujNhbB3_R4L7SgPQySJKAbQ=.833a1d58-9710-4c6a-bda1-010dc4bc42ec@github.com> On Tue, 1 Apr 2025 13:39:53 GMT, Severin Gehwolf wrote: >> @tstuefe @ashu-mehra Could you please help with a second review? > >> @jerboaa @fitzsim Does the current mainline code handles mixed configuration where in some controllers are v1 and others v2? For example cpu controller is mounted as v1 while memory controller as v2. If yes, does this patch continue to support such configuration? > > The current code does not allow mixed configuration for "relevant" controllers (essentially cpu and memory). That is, they ought to be v1 or v2. In the hybrid case (systemd running on unified), it's considered v1. I don't think this patch changes any of it. @jerboaa @fitzsim, is there any plan to backport this fix (to 21)? ------------- PR Comment: https://git.openjdk.org/jdk/pull/23811#issuecomment-2861925488 From aph at openjdk.org Thu May 8 07:11:01 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 8 May 2025 07:11:01 GMT Subject: RFR: 8350398: [s390x] Relativize initial_sp/monitors in interpreter frames In-Reply-To: References: Message-ID: On Thu, 20 Feb 2025 06:40:20 GMT, Amit Kumar wrote: > s390x port for [JDK-8315966](https://bugs.openjdk.org/browse/JDK-8315966). > > This PR depends on https://github.com/openjdk/jdk/pull/23660 because index calculation macro will be added by that PR, which is being used by this one. > > > // Frame slot index relative to fp > #define _z_ijava_idx(_component) \ > (_z_ijava_state_neg(_component) >> LogBytesPerWord) Ship it! ------------- Marked as reviewed by aph (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23708#pullrequestreview-2824050669 From jsjolen at openjdk.org Thu May 8 07:11:57 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 8 May 2025 07:11:57 GMT Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v15] In-Reply-To: References: Message-ID: On Wed, 7 May 2025 17:34:09 GMT, Gerard Ziemski wrote: >> Please review this addition of an internal benchmark, mostly of interest to those working with NMT. >> >> This benchmark allows us to record a pattern of memory allocation operations (i.e. `malloc`, `realloc` and `free`) as well as the virtual memory allocations (i.e. `VirtualMemoryTracker::add_reserved_region`, etc.) and record those into files. >> >> Later we can use that recording to _play back_ the pattern with different code or settings to compare the performance (i.e. memory usage as well as time). >> >> The goal of this benchmark is for anyone working on NMT to be able to measure and prove whether their improvement helps or regresses the performance.
>> >> ### To use it: >> >> To record pattern of allocations of memory calls: >> >> `NMTRecordMemoryAllocations=0x7FFFFFFF ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -jar build/macosx-aarch64-server-release/images/jdk/demo/jfc/J2Ddemo/J2Ddemo.jar` >> >> OR to record pattern of allocations of virtual memory calls: >> >> `NMTRecordVirtualMemoryAllocations=0x7FFFFFFF ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -jar build/macosx-aarch64-server-release/images/jdk/demo/jfc/J2Ddemo/J2Ddemo.jar` >> >> This will result in the file: >> - hs_nmt_pid22770_allocs_record.log (is the chronological record of the the desired operations) >> OR >> - hs_nmt_pid22770_virtual_allocs_record.log (is the chronological record of the desired operations) >> >> And 2 additional files: >> - hs_nmt_pid22770_info_record.log (is the record of default NMT memory overhead and the NMT state) >> - hs_nmt_pid22770_threads_record.log (is the record of thread names that can be retrieved later when processing) >> >> >> then to actually run the benchmark: >> >> NMTBenchmarkRecordedPID=22770 ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary >> >> ### Usage: >> >> See the issue for more details and the design document. > > Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: > > Johan feedback, remove unused parameter src/hotspot/share/nmt/memLogRecorder.cpp line 93: > 91: > 92: > 93: #define NMT_HEADER_SIZE 16 Do not use `#define` for the constant, use a static const, and don't put it in as a magic number but find it out from a `sizeof` on the appropriate type. 
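For illustration, the suggestion about the header-size constant could look roughly like the following. The struct `HeaderSketch` and its fields are hypothetical stand-ins, since the review does not name the "appropriate type"; the point is only that the constant is derived with `sizeof` instead of being written as a literal in a macro.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the header type that precedes each allocation;
// the real type is not named in the review comment.
struct HeaderSketch {
  std::uint64_t size;
  std::uint32_t tag;
  std::uint32_t canary;
};

// Instead of `#define NMT_HEADER_SIZE 16`: a typed constant whose value
// follows the type automatically if the layout ever changes.
static const std::size_t nmt_header_size = sizeof(HeaderSketch);
```

A typed constant also obeys scoping rules and shows up in a debugger, which a macro does not.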
src/hotspot/share/nmt/memLogRecorder.hpp line 67: > 65: protected: > 66: volatile size_t _threads_names_size = 0; > 67: typedef struct thread_name_info { Remove the `typedef`, not necessary in C++. src/hotspot/share/nmt/memLogRecorder.hpp line 71: > 69: long int thread; > 70: } thread_name_info; > 71: thread_name_info *_threads_names = nullptr; Let the * hug the typename, not the variable. src/hotspot/share/nmt/memLogRecorder.hpp line 94: > 92: > 93: private: > 94: struct Entry { wrong indentation, and also pull out the common stuff from the two `Entry` structs we have into a superclass. src/hotspot/share/runtime/arguments.cpp line 3936: > 3934: // 2. The passed in "buflen" should be large enough to hold the null terminator. > 3935: bool Arguments::copy_expand_pid(const char* src, size_t srclen, > 3936: char* buf, size_t buflen, int pid) { What does this change do? Why is it needed? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2079041839 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2079028654 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2079029095 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2079030731 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2079032694 From shade at openjdk.org Thu May 8 08:09:55 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 8 May 2025 08:09:55 GMT Subject: RFR: 8356125: Interned strings are omitted from AOT cache [v3] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 04:17:37 GMT, Ioi Lam wrote: >> When dumping the interned string table in the AOT cache, we try to include only the strings that are inside ConstantPool::reference_array(). The hope is to limit the size of the AOT cache by omitting interned strings that are not used by objects inside the AOT cache. >> >> However, we have found two cases when the above scheme doesn't work. Please see the new test cases. 
>> >> The fix is to always include all interned strings managed by stringTable.cpp. We might try to omit the truly unused strings in a separate RFE. > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into 8356125-interned-string-omitted-from-aot-cache > - @calvinccheung comments > - Improved test case > - fixed whitespaces > - Fixed obsolete comment > - Do not change the order of FinalImageRecipes::apply_recipe yet .. fix this in a separate bug > - Step 2: archive all strings in StringTable > - Step 1: Fixing NonFinalStaticWithInitVal_Helper Still good. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25026#pullrequestreview-2824205016 From sgehwolf at openjdk.org Thu May 8 08:35:01 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Thu, 8 May 2025 08:35:01 GMT Subject: RFR: 8349988: Change cgroup version detection logic to not depend on /proc/cgroups [v4] In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 13:39:53 GMT, Severin Gehwolf wrote: >> @tstuefe @ashu-mehra Could you please help with a second review? > >> @jerboaa @fitzsim Does the current mainline code handles mixed configuration where in some controllers are v1 and others v2? For example cpu controller is mounted as v1 while memory controller as v2. If yes, does this patch continue to support such configuration? > > The current code does not allow mixed configuration for "relevant" controllers (essentially cpu and memory). That is, they ought to be v1 or v2. In the hybrid case (systemd running on unified), it's considered v1. I don't think this patch changes any of it. > @jerboaa @fitzsim, is there any plan to backport this fix (to 21)? Eventually yes. But this change hasn't seen a lot of real-world exposure yet. 
It would be good to have that before attempting a backport. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23811#issuecomment-2862216668 From stefank at openjdk.org Thu May 8 10:06:47 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 8 May 2025 10:06:47 GMT Subject: RFR: 8356372: JVMTI heap sampling not working properly with outside TLAB allocations Message-ID: While working on improving the TLAB sizing code for ZGC @kstefanj ran into an issue with the following tests failing: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorInterpreterObjectTest.java serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatObjectCorrectnessTest.java The reason for seeing the problems now is that with the new sizing code ZGC used smaller TLABs at first, before resizing them to a proper size (to lower the waste). In the HeapMonitor tests we don't allocate enough to trigger GCs that will resize the TLABs so most of the tests will now run with small TLABs. This should not be a problem, but it turns out the current sampling code is not working properly when you get a lot of outside TLAB allocations. You get those when trying to allocate a fairly large object (~1400B) that won't fit into the TLAB, but there is still quite a bit of room in the TLAB so we don't want to throw it away and take a new one. The problem in the current code is that we keep track of when to sample with multiple variables, and when we get out-of-TLAB allocations these get out of sync. The proposed patch is the result of a restructuring and fixing of the code that @kstefanj and I did to fix this issue. The idea is to better account for how much we have allocated in three different buckets: * Outside of TLAB allocations * Accounted TLAB allocations * The last bit of TLAB allocations that hasn't been accounted for yet We then compare the sum of those against a *non-changing* threshold to see if it is time to take a sample.
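As a rough illustration of the three-bucket idea (not the code in the patch), the accounting can be sketched as below. The class `SamplerSketch`, its member names, and the choice to reset all buckets to zero after a sample are assumptions made for this example; the real logic lives in `ThreadHeapSampler` and the TLAB code touched by the PR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-thread sampler state; not the HotSpot ThreadHeapSampler.
class SamplerSketch {
  const std::size_t _threshold;       // fixed number of bytes between samples
  std::size_t _outside_tlab = 0;      // bytes allocated outside any TLAB
  std::size_t _accounted_tlab = 0;    // bytes from TLABs already retired
  std::size_t _unaccounted_tlab = 0;  // bytes used in the current TLAB so far
  std::size_t _samples = 0;

  std::size_t total() const {
    return _outside_tlab + _accounted_tlab + _unaccounted_tlab;
  }

  // Compare the sum of all buckets against the unchanging threshold.
  void maybe_sample() {
    if (total() >= _threshold) {
      _samples++;
      // Assumption: overshoot is dropped, so the next interval starts at 0.
      _outside_tlab = 0;
      _accounted_tlab = 0;
      _unaccounted_tlab = 0;
    }
  }

 public:
  explicit SamplerSketch(std::size_t threshold) : _threshold(threshold) {}

  void allocate_in_tlab(std::size_t bytes) {
    _unaccounted_tlab += bytes;
    maybe_sample();
  }

  void allocate_outside_tlab(std::size_t bytes) {
    _outside_tlab += bytes;
    maybe_sample();
  }

  // Retiring a TLAB folds its not-yet-accounted bytes into the accounted bucket.
  void retire_tlab() {
    _accounted_tlab += _unaccounted_tlab;
    _unaccounted_tlab = 0;
  }

  std::size_t samples() const { return _samples; }
  std::size_t bytes_since_last_sample() const { return total(); }
};

// Walk-through: one retired TLAB plus an outside allocation, then a sample.
inline std::size_t sampler_sketch_demo() {
  SamplerSketch s(1000);
  s.allocate_in_tlab(400);       // total 400
  s.retire_tlab();               // still 400, now in the accounted bucket
  s.allocate_outside_tlab(300);  // total 700, no sample yet
  s.allocate_in_tlab(400);       // total 1100 >= 1000, one sample taken
  assert(s.bytes_since_last_sample() == 0);
  return s.samples();
}
```

Because every allocation path feeds the same sum, an outside-TLAB allocation cannot leave the counters out of step with each other, which is the failure mode described above.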
There are a few things to think about when reading this code: * The thread can allocate and retire multiple TLABs before we cross the sample threshold. * The sampling can take multiple samples in a single TLAB * Any overshooting of the sample threshold triggers only one sample and the extra bytes are ignored when checking for the next sample. There is some restructuring in the patch to confine the ThreadHeapSampler variables and code. For example: 1) Moved accounting variables out of TLAB and into the ThreadHeapSampler 2) Moved thread allocation accounting and sampling code out of the TLAB 3) Moved retiring of TLABs out of make_parseable (needed to support (2)) Some of that could be extracted into a separate PR if that's deemed worthwhile. Tested with the HeapMonitor tests with various TLAB sizes. If anyone is using these APIs, it would be nice to get feedback on whether these changes work well for you. ------------- Commit messages: - 8356372: JVMTI heap sampling not working properly with outside TLAB allocations Changes: https://git.openjdk.org/jdk/pull/25114/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25114&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356372 Stats: 224 lines in 14 files changed: 136 ins; 42 del; 46 mod Patch: https://git.openjdk.org/jdk/pull/25114.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25114/head:pull/25114 PR: https://git.openjdk.org/jdk/pull/25114 From rkennke at openjdk.org Thu May 8 10:41:14 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 8 May 2025 10:41:14 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v4] In-Reply-To: References: Message-ID: > We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics.
Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Don't check for SIGSEGV to make Windows happy ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25080/files - new: https://git.openjdk.org/jdk/pull/25080/files/3252031c..d47d2b4e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25080&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25080&range=02-03 Stats: 6 lines in 1 file changed: 0 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25080.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25080/head:pull/25080 PR: https://git.openjdk.org/jdk/pull/25080 From amitkumar at openjdk.org Thu May 8 10:58:59 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Thu, 8 May 2025 10:58:59 GMT Subject: RFR: 8350398: [s390x] Relativize initial_sp/monitors in interpreter frames In-Reply-To: References: Message-ID: On Thu, 20 Feb 2025 06:40:20 GMT, Amit Kumar wrote: > s390x port for [JDK-8315966](https://bugs.openjdk.org/browse/JDK-8315966). > > This PR depends on https://github.com/openjdk/jdk/pull/23660 because index calculation macro will be added by that PR, which is being used by this one. > > > // Frame slot index relative to fp > #define _z_ijava_idx(_component) \ > (_z_ijava_state_neg(_component) >> LogBytesPerWord) Thanks for the reviews, Lutz, Andrew. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23708#issuecomment-2862621865 From amitkumar at openjdk.org Thu May 8 10:59:00 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Thu, 8 May 2025 10:59:00 GMT Subject: Integrated: 8350398: [s390x] Relativize initial_sp/monitors in interpreter frames In-Reply-To: References: Message-ID: On Thu, 20 Feb 2025 06:40:20 GMT, Amit Kumar wrote: > s390x port for [JDK-8315966](https://bugs.openjdk.org/browse/JDK-8315966). 
> > This PR depends on https://github.com/openjdk/jdk/pull/23660 because index calculation macro will be added by that PR, which is being used by this one. > > > // Frame slot index relative to fp > #define _z_ijava_idx(_component) \ > (_z_ijava_state_neg(_component) >> LogBytesPerWord) This pull request has now been integrated. Changeset: 5df7089c Author: Amit Kumar URL: https://git.openjdk.org/jdk/commit/5df7089c3eb2e6d7cf6634840a2a21bcaa7e3f4e Stats: 50 lines in 5 files changed: 34 ins; 7 del; 9 mod 8350398: [s390x] Relativize initial_sp/monitors in interpreter frames Reviewed-by: lucy, aph ------------- PR: https://git.openjdk.org/jdk/pull/23708 From thartmann at openjdk.org Thu May 8 12:17:57 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 8 May 2025 12:17:57 GMT Subject: RFR: 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding [v2] In-Reply-To: References: Message-ID: <7XtX737NV9bjyQWKxZK0rjNzQ1ye2IpbsuWTtI8Rh1s=.7e6bb289-50a1-45e2-906a-44348848a281@github.com> On Tue, 6 May 2025 10:21:54 GMT, Jatin Bhateja wrote: >> This is a follow-up PR that fixes the crashes seen after the integration of PR #24664 >> >> ZGC bookkeeps multiple place holders in barrier code snippets through relocations, these are later used to patch appropriate contents (mostly immediate values) in instruction encoding to save costly comparisons against global state [1]. While most of the relocation records the patching offsets from the end of the instruction, SHL/R instructions used for pointer coloring/uncoloring, compute the patching offset from the starting address of the instruction. This was done to prevent accidental sharing of relocation information with subsequent relocatable instructions, e.g., static call. 
[2] >> >> In case the destination register operand of SHL/R instruction is an extended GPR register, we miss accounting additional REX2 prefix byte in the patch offset, thereby corrupting the encoding since runtime patches the primary opcode byte, resulting in an ILLEGAL instruction exception. >> >> This patch fixes reported failures by computing the relocation offset of the SHL/R instruction from the end of the instruction, thereby making the patch offset agnostic to the REX/REX2 prefix. To be safe, we emit a NOP instruction between the SHL/R and the subsequent relocatable instruction. >> >> Please review and share your feedback. >> >> Best Regards, >> Jatin >> >> [1] https://openjdk.org/jeps/439#:~:text=we%20reduce%20this,changes%20phase%3B >> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1873 >> >> >> PS: Validations were performed using the latest Intel Software Development Emulator after modifying the static register allocation order in x86_64.ad file giving preference to EGPRs. > > Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding All tests passed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24919#issuecomment-2862849381 From coleenp at openjdk.org Thu May 8 13:08:52 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 8 May 2025 13:08:52 GMT Subject: RFR: 8355481: Clean up MHN_copyOutBootstrapArguments [v2] In-Reply-To: References: Message-ID: <8SRA1LqI3JjG4b8A3MvLdlkmRwsXNm3D72pvHqt6vfg=.6f0fc391-061e-48ce-8ba3-921b25ff8710@github.com> On Thu, 24 Apr 2025 12:24:10 GMT, Johan Sj?len wrote: >> Hi, >> >> I'd like to integrate this simplification of the code for this loop. 
>> >> We used to have: >> >> ```c++ >> if (start < 0) { >> for (int pseudo_index = -4; pseudo_index < 0; pseudo_index++) { >> if (start == pseudo_index) { >> if (start >= end || 0 > pos || pos >= buf->length()) break; >> // ... >> } >> start++; >> } >> } >> >> >> That's exactly the same as: >> >> >> int min_end = MIN2(0, end); >> while (-4 <= start && start < min_end) { >> if (pos >= buf->length()) break; >> // ... >> start++; >> } >> >> >> but the latter looks like a conventional loop. >> >> I'd consider this a basic cleanup, which is worth doing in the name of maintainability. >> >> I would have liked to change the `-4` to `-1` into actual names, but I've no clue where those come from. It doesn't seem worth it to change them if they just happen to be a kludge relying on internal details, or something like that. >> >> Testing: GHA Tier1 > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Remove whitespace It looks like @PaulSandoz wrote this code and could help review your change. I wish -4 etc had some const names instead and both versions look the same to me. ------------- PR Review: https://git.openjdk.org/jdk/pull/24825#pullrequestreview-2825077033 From coleenp at openjdk.org Thu May 8 13:12:52 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 8 May 2025 13:12:52 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v4] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 10:41:14 GMT, Roman Kennke wrote: >> We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. 
> > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Don't check for SIGSEGV to make Windows happy test/jdk/java/lang/String/CompactString/MaxSizeUTF16String.java line 143: > 141: // Strings of size min+1...min+2, throw OOME > 142: // The resulting byte array would exceed implementation limits > 143: for (int count = min + 2; count < max; count++) { Is this an unrelated change? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25080#discussion_r2079688835 From rkennke at openjdk.org Thu May 8 13:19:40 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 8 May 2025 13:19:40 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v5] In-Reply-To: References: Message-ID: > We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Revert unrelated change ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25080/files - new: https://git.openjdk.org/jdk/pull/25080/files/d47d2b4e..f96e08e6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25080&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25080&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25080.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25080/head:pull/25080 PR: https://git.openjdk.org/jdk/pull/25080 From rkennke at openjdk.org Thu May 8 13:19:40 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 8 May 2025 13:19:40 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v4] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 13:10:22 GMT, Coleen Phillimore wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> 
>> Don't check for SIGSEGV to make Windows happy > > test/jdk/java/lang/String/CompactString/MaxSizeUTF16String.java line 143: > >> 141: // Strings of size min+1...min+2, throw OOME >> 142: // The resulting byte array would exceed implementation limits >> 143: for (int count = min + 2; count < max; count++) { > > Is this an unrelated change? Oops, yes. I've been debugging something in that test and accidentally made the change in the wrong workspace. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25080#discussion_r2079697801 From jsikstro at openjdk.org Thu May 8 13:26:54 2025 From: jsikstro at openjdk.org (Joel =?UTF-8?B?U2lrc3Ryw7Zt?=) Date: Thu, 8 May 2025 13:26:54 GMT Subject: RFR: 8355692: Refactor stream indentation [v2] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 12:59:49 GMT, Joel Sikstr?m wrote: >> Hello, >> >>> To make it easier to review this PR, each commit that is prefixed with "FIX:" has the command/method I've used to verify that the output of prints is not changed/broken. >> >> The goal of this PR is to refactor the usage of streamIndentor and StreamAutoIndentor and combine them into a single StreamIndentor class to make it easier to apply indentation. [JDK-8354362](https://bugs.openjdk.org/browse/JDK-8354362) introduced a partial solution for the refactor I propose, by being able to add indentation with a StreamAutoIndentor. >> >> To use indentation, you currently need to do two separate things: 1) increment indentation and 2) apply indentation when printing. The indentation level can be incremented by using one of: streamIndentor, inc() or inc(int n), and automatic indentation can be enabled by using one of: StreamAutoIndentor, indent(), cr_indent(). >> >> The new StreamIndentor has the functionality of both streamIndentor and StreamAutoIndentor, that is, enabling automatic indentation and also applying an optional amount of indentation. 
This means you only need StreamIndentor to make sure that indentation is incremented and applied. >> >> There are currently four (4) ways that indentation is applied in HotSpot: >> >> 1) The new method of enabling automatic indentation and applying indentation simultaneously (partially implemented already). >> Only in GC printing code via CollectedHeap. >> >> 2) Initially use a StreamAutoIndentor, then use streamIndentor to temporarily increment indentation. Indentation is automatically applied when printing. >> nmt/memReporter uses this principle, by having a StreamAutoIndentor as a member variable and applying indentation via streamIndentor when needed. >> >> 3) Use a streamIndentor and manually call indent() (or cr_indent()). >> Commonly used pattern. No need to manually apply indentation if automatically applied. >> >> 4) Increment/decrement indentation (using inc() and/or dec()) and manually call indent(). >> Only C1's CFGPrinterOutput does this. >> >> I leave the fourth case alone as it is fairly self-contained and requires a more extensive re-write, appropriate for a follow-up patch if that's what we want. >> >> When refactoring, I only found a single case where a streamIndentor couldn't be trivially replaced with the new StreamIndentor. All other streamIndentors, along with manually applied indentation using indent()/cr_indent(), are able to be replaced with the new StreamIndentor without changing the output. >> > ... > > Joel Sikstr?m has updated the pull request incrementally with one additional commit since the last revision: > > Make sure indentation is unchanged in ArenaStats::print_on I'm planning on integrating this early next week if anyone else wants to review or comment on this. 
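The combined "increment and auto-apply" behaviour described in the quoted text can be illustrated with a small RAII guard. This is a sketch under assumptions, not HotSpot code: `SketchStream` and `IndentGuard` are hypothetical stand-ins for `outputStream` and the proposed `StreamIndentor`, and the real classes may differ in interface and semantics.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an output stream with an indentation level.
class SketchStream {
 public:
  // Appends text; when auto-indent is on, every non-empty line is prefixed
  // with the current indentation.
  void print(const std::string& text) {
    for (char c : text) {
      if (_at_line_start && _auto_indent && c != '\n') {
        _buffer.append(static_cast<std::size_t>(_indentation), ' ');
      }
      _buffer.push_back(c);
      _at_line_start = (c == '\n');
    }
  }

  void inc(int n) { _indentation += n; }
  void dec(int n) { _indentation -= n; assert(_indentation >= 0); }

  // Returns the previous setting so a guard can restore it.
  bool set_auto_indent(bool value) {
    bool old = _auto_indent;
    _auto_indent = value;
    return old;
  }

  const std::string& str() const { return _buffer; }

 private:
  std::string _buffer;
  int _indentation = 0;
  bool _auto_indent = false;
  bool _at_line_start = true;
};

// Hypothetical guard: one object both raises the indentation level and makes
// sure it is applied, then undoes both when the scope ends.
class IndentGuard {
 public:
  IndentGuard(SketchStream& stream, int amount)
    : _stream(stream), _amount(amount), _old_auto(stream.set_auto_indent(true)) {
    _stream.inc(_amount);
  }
  ~IndentGuard() {
    _stream.dec(_amount);
    _stream.set_auto_indent(_old_auto);
  }
  IndentGuard(const IndentGuard&) = delete;
  IndentGuard& operator=(const IndentGuard&) = delete;

 private:
  SketchStream& _stream;
  int _amount;
  bool _old_auto;
};
```

With a guard like this, printing code inside a nested scope needs no manual `indent()` call, which is the simplification the refactoring aims for.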
------------- PR Comment: https://git.openjdk.org/jdk/pull/24917#issuecomment-2863056897 From duke at openjdk.org Thu May 8 13:30:42 2025 From: duke at openjdk.org (Manuel Hässig) Date: Thu, 8 May 2025 13:30:42 GMT Subject: RFR: 8336906: C2: assert(bb->is_reachable()) failed: getting result from unreachable basicblock Message-ID: # Issue Summary This PR addresses an `assert(bb->is_reachable())` that is triggered in the code for `-XX:+VerifyStack` after a deoptimization with reason `null_assert_or_unreached0` at a `getstatic` bytecode. Following the `getstatic` is an `areturn` and then an unreachable bytecode. When the code for `VerifyStack` tries to compute an oop map for the basic block of the unreachable bytecode, the assert triggers: getstatic Field A.val:"LB"; // if class B is not loaded, C2 deopts with reason "null_assert_or_unreached0" areturn; // The following is unreachable iconst_0; This is a similar problem to [JDK-8271055](https://bugs.openjdk.org/browse/JDK-8271055) (#7331), but this particular deopt with reason `null_assert_or_unreached0` at `getstatic` of a field containing an object reference [deopts at the next bytecode](https://github.com/openjdk/jdk/blob/ad07426fab3396caefd7c08d924e085c1f6f61ba/src/hotspot/share/opto/parse3.cpp#L176-L199). The aforementioned issue introduced a check to skip stack verification of the next bytecode in the code if the execution after the deopted bytecode does not continue at the next bytecode in the code, i.e. does not fall through to the next bytecode. Unfortunately, this check did not include `areturn` as a bytecode that does not fall through: https://github.com/openjdk/jdk/blob/ad07426fab3396caefd7c08d924e085c1f6f61ba/src/hotspot/share/runtime/deoptimization.cpp#L845-L856 # Change Summary To fix the immediate issue described above, this PR adds `areturn` to the list of bytecodes that do not fall through.
However, all return bytecodes exhibit the same behavior and might be susceptible to a similar issue. Even though I was not able to reproduce the same crash with `{d,f,i,l}return` because I could not get those or the preceding bytecode to deopt, I also added them to the `falls_through()` function. For the remaining bytecodes in `falls_through()` with the exception of `athrow` I wrote a regression test. # Testing - [x] [Github Actions](https://github.com/mhaessig/jdk/actions/runs/14595928439) - [ ] tier1 through tier3 on Oracle supported platforms and OSs plus Oracle internal testing # Acknowledgements Special thanks to @eme64 for his hard work on reducing a reproducer that works on all platforms. ------------- Commit messages: - Mark more bytecodes as non-fallthrough - Add regression tests for non-fallthrough bytecodes Changes: https://git.openjdk.org/jdk/pull/25118/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25118&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8336906 Stats: 232 lines in 5 files changed: 227 ins; 2 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25118.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25118/head:pull/25118 PR: https://git.openjdk.org/jdk/pull/25118 From gziemski at openjdk.org Thu May 8 14:33:57 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Thu, 8 May 2025 14:33:57 GMT Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v15] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 06:57:04 GMT, Johan Sj?len wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> Johan feedback, remove unused parameter > > src/hotspot/share/nmt/memLogRecorder.hpp line 67: > >> 65: protected: >> 66: volatile size_t _threads_names_size = 0; >> 67: typedef struct thread_name_info { > > Remove the `typedef`, not necessary in C++. 
Hmm, without it, I get a `must use 'struct' tag to refer to type 'thread_name_info' in this scope` error here: ` thread_name_info* _threads_names = nullptr; ` I could fix it by putting `struct` in front of `thread_name_info` every time I use it, but I'd rather keep: `typedef struct thread_name_info {` and not worry anymore about it. > src/hotspot/share/nmt/memLogRecorder.hpp line 71: > >> 69: long int thread; >> 70: } thread_name_info; >> 71: thread_name_info *_threads_names = nullptr; > > Let the * hug the typename, not the variable. Done. > src/hotspot/share/runtime/arguments.cpp line 3936: > >> 3934: // 2. The passed in "buflen" should be large enough to hold the null terminator. >> 3935: bool Arguments::copy_expand_pid(const char* src, size_t srclen, >> 3936: char* buf, size_t buflen, int pid) { > > What does this change do? Why is it needed? It allows us to re-use existing hotspot code by accepting a pid to expand into the resulting string, which we need for custom log file names with pid. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2079838024 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2079839199 PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2079841987 From gziemski at openjdk.org Thu May 8 14:38:55 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Thu, 8 May 2025 14:38:55 GMT Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v15] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 06:58:43 GMT, Johan Sjölen wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> Johan feedback, remove unused parameter > > src/hotspot/share/nmt/memLogRecorder.hpp line 94: > >> 92: >> 93: private: >> 94: struct Entry { > > wrong indentation, and also pull out the common stuff from the two `Entry` structs we have into a superclass. Done, except the indentation. How should it look?
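On the `typedef struct thread_name_info` exchange earlier in this message, one plausible cause of the reported error (an assumption, since the intermediate edit is not shown) is that only the `typedef` keyword was removed while the trailing `thread_name_info;` declarator stayed. Inside a class that trailing name then declares a data member which hides the type name, so later uses need the `struct` tag. The sketch below shows the plain C++ form; `RecorderSketch` and the `name` field with its size are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class standing in for the recorder; only the struct
// declaration pattern matters here.
class RecorderSketch {
 public:
  // Plain C++ form: no `typedef` and no trailing declarator, so
  // `thread_name_info` names only the type.
  struct thread_name_info {
    char name[32];    // assumed field, size chosen for illustration
    long int thread;
  };

  // Writing `struct thread_name_info { ... } thread_name_info;` instead would
  // declare a member variable with the same name as the type, and the
  // declaration below would then need to be spelled
  // `struct thread_name_info* _threads_names`.
  thread_name_info* _threads_names = nullptr;
  std::size_t _threads_names_size = 0;
};
```

Dropping both the `typedef` and the trailing name keeps the declaration usable without the `struct` tag.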
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2079850319 From gziemski at openjdk.org Thu May 8 14:47:46 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Thu, 8 May 2025 14:47:46 GMT Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v16] In-Reply-To: References: Message-ID: <6LCBk-gKZbiL_RPCwLf_tsX0Gx4xbuytOQj2ZtjqoUU=.48266fb0-9fd1-45eb-868a-686729a2d497@github.com> > Please review this addition of an internal benchmark, mostly of interest to those working with NMT. > > This benchmark allows us to record a pattern of memory allocation operations (i.e. `malloc`, `realloc` and `free`) as well as the virtual memory allocations (i.e. `VirtualMemoryTracker::add_reserved_region`, etc.) and record those into files. > > Later we can use that recording to _play back_ the pattern with different code or settings to compare the performance (i.e. memory usage as well as time). > > The goal of this benchmark is for anyone working on NMT to be able to measure and prove whether their improvement helps or regresses the performance. 
> > ### To use it: > > To record the pattern of memory allocation calls: > > `NMTRecordMemoryAllocations=0x7FFFFFFF ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -jar build/macosx-aarch64-server-release/images/jdk/demo/jfc/J2Ddemo/J2Ddemo.jar` > > OR to record the pattern of virtual memory allocation calls: > > `NMTRecordVirtualMemoryAllocations=0x7FFFFFFF ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -jar build/macosx-aarch64-server-release/images/jdk/demo/jfc/J2Ddemo/J2Ddemo.jar` > > This will result in the file: > - hs_nmt_pid22770_allocs_record.log (the chronological record of the desired operations) > OR > - hs_nmt_pid22770_virtual_allocs_record.log (the chronological record of the desired operations) > > And 2 additional files: > - hs_nmt_pid22770_info_record.log (the record of default NMT memory overhead and the NMT state) > - hs_nmt_pid22770_threads_record.log (the record of thread names that can be retrieved later when processing) > > > Then, to actually run the benchmark: > > NMTBenchmarkRecordedPID=22770 ./build/macosx-aarch64-server-release/xcode/build/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary > > ### Usage: > > See the issue for more details and the design document.
Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: Johan's feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23786/files - new: https://git.openjdk.org/jdk/pull/23786/files/b1596cdf..d4255b3c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23786&range=14-15 Stats: 14 lines in 2 files changed: 5 ins; 4 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/23786.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23786/head:pull/23786 PR: https://git.openjdk.org/jdk/pull/23786 From gziemski at openjdk.org Thu May 8 14:47:50 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Thu, 8 May 2025 14:47:50 GMT Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v15] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 07:07:10 GMT, Johan Sj?len wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> Johan feedback, remove unused parameter > > src/hotspot/share/nmt/memLogRecorder.cpp line 93: > >> 91: >> 92: >> 93: #define NMT_HEADER_SIZE 16 > > Do not use `#define` for the constant, use a static const, and don't put it in as a magic number but find it out from a `sizeof` on the appropriate type. Done. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23786#discussion_r2079864345 From gziemski at openjdk.org Thu May 8 14:47:48 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Thu, 8 May 2025 14:47:48 GMT Subject: RFR: 8317453: NMT: Performance benchmarks are needed to measure speed and memory [v10] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 13:13:54 GMT, Johan Sj?len wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> use permit_forbidden_function for realloc > > There's commented out code, fix that by getting rid of it or converting logs to UL. > > Sometimes you use `fprintf(stderr`, sometimes you use `tty->print`, what's the difference and why not use UL? @jdksjolen Thank for the feedback, very much appreciated! ------------- PR Comment: https://git.openjdk.org/jdk/pull/23786#issuecomment-2863318817 From epeter at openjdk.org Thu May 8 14:56:56 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 8 May 2025 14:56:56 GMT Subject: RFR: 8336906: C2: assert(bb->is_reachable()) failed: getting result from unreachable basicblock In-Reply-To: References: Message-ID: <0W1NJ3CRAdeugnJTYVGVtomqYJEX5QdVEua9XPSWn5g=.d5b30054-2805-4b8c-a9f9-5b1cdbc12d2a@github.com> On Thu, 8 May 2025 13:22:55 GMT, Manuel H?ssig wrote: > # Issue Summary > > This PR addresses an `assert(bb->is_reachable())` that is triggered in the code for `-XX:+VerifyStack` after a deoptimization with reason `null_assert_or_unreached0` at a `getstatic` bytecode. Following the `getstatic` is an `areturn` and then an unreachable bytecode. 
When the code for `VerifyStack` tries to compute an oop map for the basic block of the unreachable bytecode, the assert triggers: > > getstatic Field A.val:"LB"; // if class B is not loaded, C2 deopts with reason "null_assert_or_unreached0" > areturn; > // The following is unreachable > iconst_0; > > > This is a similar problem to [JDK-8271055](https://bugs.openjdk.org/browse/JDK-8271055) (#7331), but this particular deopt with reason `null_assert_or_unreached0` at `getstatic` of a field containing an object reference [deopts at the next bytecode](https://github.com/openjdk/jdk/blob/ad07426fab3396caefd7c08d924e085c1f6f61ba/src/hotspot/share/opto/parse3.cpp#L176-L199). The aforementioned issue introduced a check to skip stack verification of the next bytecode in the code if the execution after the deopted bytecode does not continue at the next bytecode in the code, i.e. falls through to the next bytecode. Unfortunately, this check did not include `areturn` as a bytecode that does not fall-through: > https://github.com/openjdk/jdk/blob/ad07426fab3396caefd7c08d924e085c1f6f61ba/src/hotspot/share/runtime/deoptimization.cpp#L845-L856 > > # Change Summary > > To fix the immediate issue described above, this PR adds `areturn` to the list of bytecodes that does not fall through. However, all return bytecodes exhibit the same behavior and might be susceptible to a similar issue. Even though I was not able to reproduce the same crash with `{d,f,i,l}return` because I could not get those or the preceding bytecode to deopt, I also added them to the `falls_through()` function. For the remaining bytecodes in `falls_through()` with the exception of `athrow` I wrote a regression test. 
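The fall-through classification discussed above can be sketched as follows. This is a minimal illustration based on the control-flow semantics in the JVM specification, not a copy of the HotSpot `falls_through()` function: the enum `Bytecode`, the helper `falls_through_sketch`, and the choice of which opcodes to list are assumptions, and `jsr`/`ret` are deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Opcode values as defined by the JVM specification; only the subset needed
// for the illustration is listed.
enum class Bytecode : std::uint8_t {
  iconst_0     = 0x03,
  ifeq         = 0x99,
  goto_        = 0xa7,
  tableswitch  = 0xaa,
  lookupswitch = 0xab,
  ireturn      = 0xac,
  lreturn      = 0xad,
  freturn      = 0xae,
  dreturn      = 0xaf,
  areturn      = 0xb0,
  return_      = 0xb1,
  getstatic    = 0xb2,
  athrow       = 0xbf,
  goto_w       = 0xc8
};

// Hypothetical helper: returns true if execution can continue at the bytecode
// that textually follows `bc`. Unconditional jumps, switches, returns and
// athrow never do, so whatever follows them may be unreachable.
constexpr bool falls_through_sketch(Bytecode bc) {
  switch (bc) {
    case Bytecode::goto_:
    case Bytecode::goto_w:
    case Bytecode::tableswitch:
    case Bytecode::lookupswitch:
    case Bytecode::ireturn:
    case Bytecode::lreturn:
    case Bytecode::freturn:
    case Bytecode::dreturn:
    case Bytecode::areturn:
    case Bytecode::return_:
    case Bytecode::athrow:
      return false;
    default:
      return true;
  }
}
```

In the reported sequence, `getstatic` falls through to `areturn`, but `areturn` does not fall through to `iconst_0`, so a verifier built on a check like this would skip the unreachable block instead of asserting on it.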
> > # Testing > > - [x] [Github Actions](https://github.com/mhaessig/jdk/actions/runs/14595928439) > - [ ] tier1 through tier3 on Oracle supported platforms and OSs plus Oracle internal testing > > # Acknowledgements > Special thanks to @eme64 for his hard work on reducing a reproducer that works on all platforms. @mhaessig Thanks for looking into this! A few comments: 1) Did you go through all bytecodes we support here? I quickly scanned the wiki page, and found the (deprecated but still present) `jsr` opcodes that also are essentially a goto, so do not fall through, I think. At least it seems to me we could also have unreachable code after a `jsr`, right? And then there is a `ret` bytecode that does the symmetrical thing. Not sure if we even handle these in such a way that the bug could be reproduced, but worth a check! Can you go over all bytecodes, and make sure we are not missing any? Because this is already the second bug of this kind, it would be good to fix it once and for all now ;) 2) > Even though I was not able to reproduce the same crash with {d,f,i,l}return because I could not get those or the preceding bytecode to deopt, I also added them to the falls_through() function. Hmm ok, I see. Might be worth investing just a little more time to see if we cannot get that done. Or else argue why it CANNOT be done. But then we might as well put an assert inside `falls_through` for those cases, to check that we actually never deopt like that at such an opcode, and revisit that assumption if we ever hit the assert. What do you think? 3) For the test: It's a bit of a shame to have lots of separate files. Especially because the directory name `test/hotspot/jtreg/compiler/interpreter/verifyStack/` may at some point have more tests, and then it gets a little confusing. I wonder if `A` and `B` could be nested classes in the java file?
And the java and jasm file could be named very similarly, so that it is directly clear that they belong together when browsing the test files. An alternative: create a subdirectory that has a unique name, so that we could separate things that way. ------------- Changes requested by epeter (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25118#pullrequestreview-2825371630 From lkorinth at openjdk.org Thu May 8 15:02:42 2025 From: lkorinth at openjdk.org (Leo Korinth) Date: Thu, 8 May 2025 15:02:42 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor Message-ID: This change tries to add timeouts to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555). The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I currently cannot change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review. In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I cannot use the library function everywhere as jtreg does not allow me to add @library notations to non-testcase files). My approach has been to run all tests, and afterwards update those that fail due to the timeout factor. The number of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor.
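The arithmetic behind "quadruple the timeout" can be checked with a small sketch. It assumes that the effective limit is the declared timeout multiplied by the timeout factor, and it uses a declared timeout of 120 seconds purely as an example value; the helper `effective_timeout_seconds` is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: the wall-clock limit a test effectively gets, assuming
// it is the declared timeout scaled by the timeout factor.
constexpr double effective_timeout_seconds(int declared_seconds, double timeout_factor) {
  return declared_seconds * timeout_factor;
}

// Today: declared 120 s with factor 4.
constexpr double before = effective_timeout_seconds(120, 4.0);

// After the change: declared timeout quadrupled to 480 s, factor 1.
constexpr double after = effective_timeout_seconds(480, 1.0);

// During review: the same quadrupled timeout with the temporary 0.7 factor.
constexpr double during_review = effective_timeout_seconds(480, 0.7);

static_assert(before == after, "quadrupling offsets the factor change from 4 to 1");
static_assert(during_review < after, "the 0.7 factor leaves less headroom");
```

Under these assumptions the quadrupled tests keep their current limit once the factor becomes 1, while the temporary 0.7 factor is stricter, which is consistent with a few tests needing a little extra timeout.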
These fixes were created while I plowed through the testcases: JDK-8352719: Add an equals sign to the modules statement JDK-8352709: Remove bad timing annotations from WhileOpTest.java JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE CODETOOLS-7903961: Make default timeout configurable Sometime in the future I will also fix: 8260555: Change the default TIMEOUT_FACTOR from 4 to 1 for which I am awaiting CODETOOLS-7903961, which is fixed in jtreg 7.6, but we are still running 7.5.1+1 *After the review I will revert the first two commits, and update the copyrights* ------------- Commit messages: - 8356171: Increase timeout for testcases as preparation for change of default timeout factor - Fix some tests that need an upgrade of JTREG --- REVERT THIS LATER - 8260555 Changes: https://git.openjdk.org/jdk/pull/25122/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25122&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356171 Stats: 556 lines in 272 files changed: 59 ins; 96 del; 401 mod Patch: https://git.openjdk.org/jdk/pull/25122.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25122/head:pull/25122 PR: https://git.openjdk.org/jdk/pull/25122 From epeter at openjdk.org Thu May 8 15:04:53 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 8 May 2025 15:04:53 GMT Subject: RFR: 8336906: C2: assert(bb->is_reachable()) failed: getting result from unreachable basicblock In-Reply-To: <0W1NJ3CRAdeugnJTYVGVtomqYJEX5QdVEua9XPSWn5g=.d5b30054-2805-4b8c-a9f9-5b1cdbc12d2a@github.com> References: <0W1NJ3CRAdeugnJTYVGVtomqYJEX5QdVEua9XPSWn5g=.d5b30054-2805-4b8c-a9f9-5b1cdbc12d2a@github.com> Message-ID: On Thu, 8 May 2025 14:54:05 GMT, Emanuel Peter wrote: > Even though I was not able to reproduce the same crash with {d,f,i,l}return because I could not get those or the preceding bytecode to deopt, I also added them to the falls_through()
function. Basically, there are 2 cases: - opcodes that deopt and retry: these were already there, as far as I know, and @dean-long added them in his previous patch. So here we could only take opcodes that: deopt, retry, and do not have fall-through. - opcodes that deopt but do not retry, but skip forward to the next op, that we then have to check for fall-through. For the deopting opcode, there are 2 categories: - Those that put something on the stack, like `getstatic` that puts whatever it got on the stack. This constrains what opcode comes after. If it returns an object/null, you can only do `return` (ignore stack value) or `areturn` (return that stack value). But you cannot do `ireturn` because the value on the stack is an object, not int. - Those that put nothing on the stack. Here we would not be constrained, and could push whatever we need on the stack before that opcode. E.g. we could push an int before that opcode, and then do `ireturn`. But I'm not sure if there are any such opcodes that deopt but push nothing on the stack. Worth checking though! Hope this case distinction helps a little, I'm not sure it is particularly clear or accurate, but these are the things I would look into if I were working on this bug :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25118#issuecomment-2863377063 From lmesnik at openjdk.org Thu May 8 15:05:53 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 8 May 2025 15:05:53 GMT Subject: RFR: 8356251: Need minor cleanup for interp_only_mode [v2] In-Reply-To: References: Message-ID: On Wed, 7 May 2025 00:36:34 GMT, Serguei Spitsyn wrote: >> This is a minor cleanup for the JVMTI `interp_only_mode` implementation which includes the following changes: >> - The `interp_only_mode` in `JavaThread` is represented with a counter which is incremented and decremented. This is confusing because this value should only take values `0` or `1`. Asserts are placed to make sure it is never going out of bounds. 
The `interp_only_mode` in a `JavaThread` is checked by the interpreter chunks which expect it to be an `integer`. This cleanup has no intention to make it a boolean. >> - The function `JvmtiThreadState::process_pending_interp_only()` does a sync on the `JvmtiThreadState_lock` which is not really needed and is being removed. It is called in a `VTMS` transition and so, can not clash with the `SetEventNotificationMode` because it sets a `JvmtiVTMSTransitionDisabler`. >> >> Testing: >> - TBD: Mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: remove get_interp_only_mode(), set_interp_only_mode() and clear_interp_only_mode() Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25060#pullrequestreview-2825459109 From lmesnik at openjdk.org Thu May 8 15:08:55 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 8 May 2025 15:08:55 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v5] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 13:19:40 GMT, Roman Kennke wrote: >> We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Revert unrelated change Marked as reviewed by lmesnik (Reviewer). 
-------------

PR Review: https://git.openjdk.org/jdk/pull/25080#pullrequestreview-2825469001

From epeter at openjdk.org Thu May 8 15:09:53 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Thu, 8 May 2025 15:09:53 GMT
Subject: RFR: 8336906: C2: assert(bb->is_reachable()) failed: getting result from unreachable basicblock
In-Reply-To:
References:
Message-ID:

On Thu, 8 May 2025 13:22:55 GMT, Manuel Hässig wrote:

> # Issue Summary
>
> This PR addresses an `assert(bb->is_reachable())` that is triggered in the code for `-XX:+VerifyStack` after a deoptimization with reason `null_assert_or_unreached0` at a `getstatic` bytecode. Following the `getstatic` is an `areturn` and then an unreachable bytecode. When the code for `VerifyStack` tries to compute an oop map for the basic block of the unreachable bytecode, the assert triggers:
>
>     getstatic Field A.val:"LB"; // if class B is not loaded, C2 deopts with reason "null_assert_or_unreached0"
>     areturn;
>     // The following is unreachable
>     iconst_0;
>
> This is a similar problem to [JDK-8271055](https://bugs.openjdk.org/browse/JDK-8271055) (#7331), but this particular deopt with reason `null_assert_or_unreached0` at `getstatic` of a field containing an object reference [deopts at the next bytecode](https://github.com/openjdk/jdk/blob/ad07426fab3396caefd7c08d924e085c1f6f61ba/src/hotspot/share/opto/parse3.cpp#L176-L199). The aforementioned issue introduced a check to skip stack verification of the next bytecode in the code if the execution after the deopted bytecode does not continue at the next bytecode in the code, i.e. does not fall through to the next bytecode.
Unfortunately, this check did not include `areturn` as a bytecode that does not fall-through: > https://github.com/openjdk/jdk/blob/ad07426fab3396caefd7c08d924e085c1f6f61ba/src/hotspot/share/runtime/deoptimization.cpp#L845-L856 > > # Change Summary > > To fix the immediate issue described above, this PR adds `areturn` to the list of bytecodes that does not fall through. However, all return bytecodes exhibit the same behavior and might be susceptible to a similar issue. Even though I was not able to reproduce the same crash with `{d,f,i,l}return` because I could not get those or the preceding bytecode to deopt, I also added them to the `falls_through()` function. For the remaining bytecodes in `falls_through()` with the exception of `athrow` I wrote a regression test. > > # Testing > > - [x] [Github Actions](https://github.com/mhaessig/jdk/actions/runs/14595928439) > - [ ] tier1 through tier3 on Oracle supported platforms and OSs plus Oracle internal testing > > # Acknowledgements > Special thanks to @eme64 for his hard work on reducing a reproducer that works on all platforms. Quickly scanning, I see these that also may or may not have a fall-through: `lookupswitch`, `tableswitch` ------------- PR Comment: https://git.openjdk.org/jdk/pull/25118#issuecomment-2863398487 From stuefe at openjdk.org Thu May 8 15:45:52 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 8 May 2025 15:45:52 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v5] In-Reply-To: References: Message-ID: <_6VGG_Cjgn0evxwQLm7Euhu4g1udr-7A0ukdgP-z-BE=.ce00eab2-ec96-4108-ae19-9333884b03ee@github.com> On Thu, 8 May 2025 13:19:40 GMT, Roman Kennke wrote: >> We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. 
> > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Revert unrelated change +1 ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25080#pullrequestreview-2825590861 From stefank at openjdk.org Thu May 8 16:09:00 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 8 May 2025 16:09:00 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: Message-ID: <2nBGcIjZC03ee74o34IXFgtoEVTAkQV-xXEC28_oFbI=.da57d5a4-4546-4566-aa79-cacce01562d7@github.com> On Thu, 8 May 2025 14:51:24 GMT, Leo Korinth wrote: > This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555). > > The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review. > > In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files). > > My approach has been to run all test, and afterwards updating those that fails due to a timeout factor. The amount of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor. 
> > These fixes have been created when I have plown through testcases: > JDK-8352719: Add an equals sign to the modules statement > JDK-8352709: Remove bad timing annotations from WhileOpTest.java > JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test > CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE > CODETOOLS-7903961: Make default timeout configurable > > Sometime in the future I will also fix: > 8260555: Change the default TIMEOUT_FACTOR from 4 to 1 > > for which I am awaiting: > CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1 > > *After the review I will revert the two first commits, and update the copyrights* Thanks for tackling this! I look forward to the day when you change TIMEOUT_FACTOR to be 1 by default. I think that will reduce confusion. I made a cursory look through some GC files and I think that looked good. doc/testing.md line 385: > 383: (`-timeoutFactor`). Also, some test cases that programmatically wait a > 384: certain amount of time will apply this factor. If we run in > 385: interpreted mode (`-Xcomp`), [RunTest.gmk](../make/RunTests.gmk) Maybe Suggestion: interpreted mode (`-Xint`), [RunTest.gmk](../make/RunTests.gmk) ------------- PR Review: https://git.openjdk.org/jdk/pull/25122#pullrequestreview-2825661242 PR Review Comment: https://git.openjdk.org/jdk/pull/25122#discussion_r2080028720 From rkennke at openjdk.org Thu May 8 16:10:02 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 8 May 2025 16:10:02 GMT Subject: RFR: 8356329: Report compact object headers in hs_err [v5] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 13:19:40 GMT, Roman Kennke wrote: >> We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. 
> > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Revert unrelated change Thanks all! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25080#issuecomment-2863580115 From rkennke at openjdk.org Thu May 8 16:10:03 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 8 May 2025 16:10:03 GMT Subject: Integrated: 8356329: Report compact object headers in hs_err In-Reply-To: References: Message-ID: On Wed, 7 May 2025 06:52:28 GMT, Roman Kennke wrote: > We should report when UseCompactObjectHeaders is enabled in the hs_err file, just like we do for UseCompressedOops or UseCompressedClassPointers, for improved diagnostics. This pull request has now been integrated. Changeset: 6b1e88a9 Author: Roman Kennke URL: https://git.openjdk.org/jdk/commit/6b1e88a946c5aa5ab8c1b320ebdfdf595c469855 Stats: 105 lines in 2 files changed: 104 ins; 0 del; 1 mod 8356329: Report compact object headers in hs_err Reviewed-by: stuefe, lmesnik, zgu ------------- PR: https://git.openjdk.org/jdk/pull/25080 From dfuchs at openjdk.org Thu May 8 16:28:53 2025 From: dfuchs at openjdk.org (Daniel Fuchs) Date: Thu, 8 May 2025 16:28:53 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: Message-ID: On Thu, 8 May 2025 14:51:24 GMT, Leo Korinth wrote: > This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555). > > The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review. 
> > In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files). > > My approach has been to run all test, and afterwards updating those that fails due to a timeout factor. The amount of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor. > > These fixes have been created when I have plown through testcases: > JDK-8352719: Add an equals sign to the modules statement > JDK-8352709: Remove bad timing annotations from WhileOpTest.java > JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test > CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE > CODETOOLS-7903961: Make default timeout configurable > > Sometime in the future I will also fix: > 8260555: Change the default TIMEOUT_FACTOR from 4 to 1 > > for which I am awaiting: > CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1 > > *After the review I will revert the two first commits, and update the copyrights* @lkorinth have you run all the tiers where the old default timeout factor of 4 applied? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2863633579 From lkorinth at openjdk.org Thu May 8 16:45:56 2025 From: lkorinth at openjdk.org (Leo Korinth) Date: Thu, 8 May 2025 16:45:56 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: Message-ID: On Thu, 8 May 2025 14:51:24 GMT, Leo Korinth wrote: > This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555). > > The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review. > > In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files). > > My approach has been to run all test, and afterwards updating those that fails due to a timeout factor. The amount of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor. 
>
> These fixes were created while I plowed through the testcases:
> JDK-8352719: Add an equals sign to the modules statement
> JDK-8352709: Remove bad timing annotations from WhileOpTest.java
> JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test
> CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE
> CODETOOLS-7903961: Make default timeout configurable
>
> Sometime in the future I will also fix:
> 8260555: Change the default TIMEOUT_FACTOR from 4 to 1
>
> for which I am awaiting:
> CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1
>
> *After the review I will revert the first two commits, and update the copyrights*

Before this version, I had run tiers 1-8, with 11 failures.

** DONE 01 serviceability/jvmti/vthread/SuspendResume2/SuspendResume2.java#debug 700
** DONE 02 jdk/internal/platform/docker/TestUseContainerSupport.java OTHER
** DONE 03 tools/javac/util/IteratorsTest.java 480
** DONE 04 java/math/BigInteger/LargeValueExceptions.java 480
** DONE 05 vmTestbase/gc/gctests/WeakReference/weak004/weak004.java OTHER
** DONE 06 compiler/loopstripmining/CheckLoopStripMining.java OTHER
** DONE 07 sun/security/tools/keytool/fakecacerts/TrustedCert.java 480
** DONE 08 jdk/internal/platform/docker/TestUseContainerSupport.java OTHER
** DONE 09 containers/docker/TestJFRNetworkEvents.java OTHER
** DONE 10 java/foreign/TestMismatch.java 480
** DONE 11 linux-riscv64-open-cmp-baseline-linux-x64-build-796 OTHER

Six of those seem unrelated to my changes (marked `OTHER`); the other five I have updated for this run with a new timeout.

I have fixed the timeouts and rebased (I had one conflict), and I am now running tier1-8 again. It will take time, and it looks like I will have more (unrelated) failures this time. After I revert the first two commits and go back to a timeout factor of 4, I will run tier 1-8 again.
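
To make the timeout arithmetic in this thread concrete, here is a minimal sketch that is not taken from the PR. It assumes jtreg's default per-test timeout of 120 seconds and that the effective limit is simply the declared timeout multiplied by the timeout factor; the class and method names are made up for illustration.

```java
final class TimeoutFactorSketch {
    // Assumption: jtreg's default per-test timeout, in seconds.
    static final int DEFAULT_TIMEOUT_SECONDS = 120;

    /** Effective limit in seconds: declared timeout scaled by the timeout factor. */
    static double effectiveSeconds(int declaredSeconds, double timeoutFactor) {
        return declaredSeconds * timeoutFactor;
    }

    public static void main(String[] args) {
        // Today: no explicit timeout, default factor of 4.
        System.out.println("default x 4:   " + effectiveSeconds(DEFAULT_TIMEOUT_SECONDS, 4.0));
        // After quadrupling the declared timeout and moving to a factor of 1.
        System.out.println("480 x 1:       " + effectiveSeconds(4 * DEFAULT_TIMEOUT_SECONDS, 1.0));
        // Temporary factor used while testing this change.
        System.out.println("480 x 0.7:     " + effectiveSeconds(4 * DEFAULT_TIMEOUT_SECONDS, 0.7));
    }
}
```

Under these assumptions, a test that relied on the default (120 s with factor 4) keeps the same 480 s budget once it declares a timeout of 480 and the factor becomes 1, which is consistent with the `480` entries reported in this message. With the temporary factor of 0.7 the same test gets roughly 336 s, which is why a few tests needed a little extra headroom.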
------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2863670496 PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2863675379 From lkorinth at openjdk.org Thu May 8 17:06:02 2025 From: lkorinth at openjdk.org (Leo Korinth) Date: Thu, 8 May 2025 17:06:02 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: <2nBGcIjZC03ee74o34IXFgtoEVTAkQV-xXEC28_oFbI=.da57d5a4-4546-4566-aa79-cacce01562d7@github.com> References: <2nBGcIjZC03ee74o34IXFgtoEVTAkQV-xXEC28_oFbI=.da57d5a4-4546-4566-aa79-cacce01562d7@github.com> Message-ID: On Thu, 8 May 2025 16:04:53 GMT, Stefan Karlsson wrote: >> This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555). >> >> The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review. >> >> In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files). >> >> My approach has been to run all test, and afterwards updating those that fails due to a timeout factor. The amount of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor. 
>> >> These fixes have been created when I have plown through testcases: >> JDK-8352719: Add an equals sign to the modules statement >> JDK-8352709: Remove bad timing annotations from WhileOpTest.java >> JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test >> CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE >> CODETOOLS-7903961: Make default timeout configurable >> >> Sometime in the future I will also fix: >> 8260555: Change the default TIMEOUT_FACTOR from 4 to 1 >> >> for which I am awaiting: >> CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1 >> >> *After the review I will revert the two first commits, and update the copyrights* > > doc/testing.md line 385: > >> 383: (`-timeoutFactor`). Also, some test cases that programmatically wait a >> 384: certain amount of time will apply this factor. If we run in >> 385: interpreted mode (`-Xcomp`), [RunTest.gmk](../make/RunTests.gmk) > > Maybe > Suggestion: > > interpreted mode (`-Xint`), [RunTest.gmk](../make/RunTests.gmk) Thanks for catching this fault of mine. I will update the text and change `interpreted mode`, as it is really `-Xcomp` we are looking at in the RunTest.gmk. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25122#discussion_r2080117016 From dfuchs at openjdk.org Thu May 8 17:43:54 2025 From: dfuchs at openjdk.org (Daniel Fuchs) Date: Thu, 8 May 2025 17:43:54 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: Message-ID: On Thu, 8 May 2025 14:51:24 GMT, Leo Korinth wrote: > This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555). > > The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). 
The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review. > > In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files). > > My approach has been to run all test, and afterwards updating those that fails due to a timeout factor. The amount of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor. > > These fixes have been created when I have plown through testcases: > JDK-8352719: Add an equals sign to the modules statement > JDK-8352709: Remove bad timing annotations from WhileOpTest.java > JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test > CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE > CODETOOLS-7903961: Make default timeout configurable > > Sometime in the future I will also fix: > 8260555: Change the default TIMEOUT_FACTOR from 4 to 1 > > for which I am awaiting: > CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1 > > *After the review I will revert the two first commits, and update the copyrights* Thank you. I have imported your PR locally and running some HTTP client tests in the CI. Tests have not finished running - but I already see one intermittent failure: `java/net/httpclient/RedirectTimeoutTest.java` is timing out intermittently on windows. It would be good to flush out any such intermittent failures before this PR is integrated. 
This might require multiple runs before we can get confidence.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2863805648

From coleenp at openjdk.org Thu May 8 17:51:53 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Thu, 8 May 2025 17:51:53 GMT
Subject: RFR: 8355481: Clean up MHN_copyOutBootstrapArguments [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 24 Apr 2025 12:24:10 GMT, Johan Sjölen wrote:

>> Hi,
>>
>> I'd like to integrate this simplification of the code for this loop.
>>
>> We used to have:
>>
>>     if (start < 0) {
>>       for (int pseudo_index = -4; pseudo_index < 0; pseudo_index++) {
>>         if (start == pseudo_index) {
>>           if (start >= end || 0 > pos || pos >= buf->length()) break;
>>           // ...
>>         }
>>         start++;
>>       }
>>     }
>>
>> That's exactly the same as:
>>
>>     int min_end = MIN2(0, end);
>>     while (-4 <= start && start < min_end) {
>>       if (pos >= buf->length()) break;
>>       // ...
>>       start++;
>>     }
>>
>> but the latter looks like a conventional loop.
>>
>> I'd consider this a basic cleanup, which is worth doing in the name of maintainability.
>>
>> I would have liked to change the `-4` to `-1` into actual names, but I've no clue where those come from. It doesn't seem worth it to change them if they just happen to be a kludge relying on internal details, or something like that.
>>
>> Testing: GHA Tier1
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
>   Remove whitespace

Okay, now I see it in context. Looks good to me.

-------------

Marked as reviewed by coleenp (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24825#pullrequestreview-2825918319 From iklam at openjdk.org Thu May 8 17:57:58 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 8 May 2025 17:57:58 GMT Subject: RFR: 8356125: Interned strings are omitted from AOT cache [v2] In-Reply-To: References: <_eowrULNjIHYoi4xiTHyowVM-iuxJBIaBNqVLJaXRJI=.2f407b58-8076-4c6f-8fd2-c797266f1ddb@github.com> Message-ID: On Wed, 7 May 2025 18:14:43 GMT, Calvin Cheung wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Improved test case > > Looks good. Spotted two minor issues in a test case. Thanks @calvinccheung and @shipilev for the review ------------- PR Comment: https://git.openjdk.org/jdk/pull/25026#issuecomment-2863833677 From iklam at openjdk.org Thu May 8 17:58:00 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 8 May 2025 17:58:00 GMT Subject: Integrated: 8356125: Interned strings are omitted from AOT cache In-Reply-To: References: Message-ID: On Mon, 5 May 2025 00:10:38 GMT, Ioi Lam wrote: > When dumping the interned string table in the AOT cache, we try to include only the strings that are inside ConstantPool::reference_array(). The hope is to limit the size of the AOT cache by omitting interned strings that are not used by objects inside the AOT cache. > > However, we have found two cases when the above scheme doesn't work. Please see the new test cases. > > The fix is to always include all interned strings managed by stringTable.cpp. We might try to omit the truly unused strings in a separate RFE. This pull request has now been integrated. 
Changeset: 4379e2d2 Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/4379e2d26bd445d3f303a5937d1e335885be9216 Stats: 328 lines in 12 files changed: 208 ins; 88 del; 32 mod 8356125: Interned strings are omitted from AOT cache Reviewed-by: shade, ccheung ------------- PR: https://git.openjdk.org/jdk/pull/25026 From liach at openjdk.org Thu May 8 17:59:28 2025 From: liach at openjdk.org (Chen Liang) Date: Thu, 8 May 2025 17:59:28 GMT Subject: RFR: 8356548: Avoid using ASM to parse latest class files in tests Message-ID: For early eval; test by changing the ClassReader max accepted version of test ASM to 24 instead of 25 ------------- Commit messages: - Merge branch 'master' of https://github.com/openjdk/jdk into fix/asm-test-upgrade - 8356548: Avoid using ASM to parse latest class files in tests Changes: https://git.openjdk.org/jdk/pull/25124/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25124&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356548 Stats: 291 lines in 8 files changed: 99 ins; 136 del; 56 mod Patch: https://git.openjdk.org/jdk/pull/25124.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25124/head:pull/25124 PR: https://git.openjdk.org/jdk/pull/25124 From sspitsyn at openjdk.org Thu May 8 18:18:52 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 8 May 2025 18:18:52 GMT Subject: RFR: 8356251: Need minor cleanup for interp_only_mode [v2] In-Reply-To: References: Message-ID: On Wed, 7 May 2025 02:29:44 GMT, Chris Plummer wrote: >> The interp_only_mode in a JavaThread is checked by the interpreter chunks which expect it to be an integer. This cleanup has no intention to make it a boolean. > I think this should be documented. Do you mean we need a comment somewhere to explain this? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25060#issuecomment-2863884022 From sspitsyn at openjdk.org Thu May 8 18:18:53 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 8 May 2025 18:18:53 GMT Subject: RFR: 8356251: Need minor cleanup for interp_only_mode [v2] In-Reply-To: References: Message-ID: On Wed, 7 May 2025 00:36:34 GMT, Serguei Spitsyn wrote: >> This is a minor cleanup for the JVMTI `interp_only_mode` implementation which includes the following changes: >> - The `interp_only_mode` in `JavaThread` is represented with a counter which is incremented and decremented. This is confusing because this value should only take values `0` or `1`. Asserts are placed to make sure it is never going out of bounds. The `interp_only_mode` in a `JavaThread` is checked by the interpreter chunks which expect it to be an `integer`. This cleanup has no intention to make it a boolean. >> - The function `JvmtiThreadState::process_pending_interp_only()` does a sync on the `JvmtiThreadState_lock` which is not really needed and is being removed. It is called in a `VTMS` transition and so, can not clash with the `SetEventNotificationMode` because it sets a `JvmtiVTMSTransitionDisabler`. >> >> Testing: >> - TBD: Mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: remove get_interp_only_mode(), set_interp_only_mode() and clear_interp_only_mode() Thank you for review, Leonid! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25060#issuecomment-2863885439 From cjplummer at openjdk.org Thu May 8 18:26:51 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 8 May 2025 18:26:51 GMT Subject: RFR: 8356251: Need minor cleanup for interp_only_mode [v2] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 18:15:29 GMT, Serguei Spitsyn wrote: > > > The interp_only_mode in a JavaThread is checked by the interpreter chunks which expect it to be an integer. 
This cleanup has no intention to make it a boolean. > > > I think this should be documented. > > Do you mean we need a comment somewhere to explain this? Yes. Maybe with the declaration. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25060#issuecomment-2863901625 From coleenp at openjdk.org Thu May 8 18:35:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 8 May 2025 18:35:54 GMT Subject: RFR: 8356548: Avoid using ASM to parse latest class files in tests In-Reply-To: References: Message-ID: On Thu, 8 May 2025 14:57:05 GMT, Chen Liang wrote: > For early eval; test by changing the ClassReader max accepted version of test ASM to 24 instead of 25 test/hotspot/jtreg/compiler/calls/common/InvokeDynamicPatcher.java line 76: > 74: throw new Error("TESTBUG: Can't get code source" + ex, ex); > 75: } > 76: try (FileInputStream fis = new FileInputStream(filePath.toFile())) { Don't you have to delete lines 149-155 also? test/hotspot/jtreg/runtime/MirrorFrame/Test8003720.java line 30: > 28: * @library /testlibrary/asm > 29: * @modules java.base/jdk.internal.misc > 30: * @compile -XDignore.symbol.file -source 21 -target 21 Victim.java Why is this necessary? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25124#discussion_r2080231780 PR Review Comment: https://git.openjdk.org/jdk/pull/25124#discussion_r2080233032 From liach at openjdk.org Thu May 8 18:35:55 2025 From: liach at openjdk.org (Chen Liang) Date: Thu, 8 May 2025 18:35:55 GMT Subject: RFR: 8356548: Avoid using ASM to parse latest class files in tests In-Reply-To: References: Message-ID: On Thu, 8 May 2025 18:23:58 GMT, Coleen Phillimore wrote: >> For early eval; test by changing the ClassReader max accepted version of test ASM to 24 instead of 25 > > test/hotspot/jtreg/compiler/calls/common/InvokeDynamicPatcher.java line 76: > >> 74: throw new Error("TESTBUG: Can't get code source" + ex, ex); >> 75: } >> 76: try (FileInputStream fis = new FileInputStream(filePath.toFile())) { > > Don't you have to delete lines 149-155 also? This path is used to dump the destination bytes. I tried to use `test.classes` to dump the output bytes but to no avail - seems libraries have some custom mechanisms, that I can neither find the location of their classes, nor can I easily pass javac flags for libraries via jtreg. > test/hotspot/jtreg/runtime/MirrorFrame/Test8003720.java line 30: > >> 28: * @library /testlibrary/asm >> 29: * @modules java.base/jdk.internal.misc >> 30: * @compile -XDignore.symbol.file -source 21 -target 21 Victim.java > > Why is this necessary? The Asmator uses objectweb ASM to transform Victim, which means ASM needs to parse the javac output Victim.class. If this class wishes to be backported in the future, this flag ensures the output class file is of a constant version that ASM on older versions can consistently parse. 
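
As a hedged illustration of the point about `-source 21 -target 21`: a class file records its format version in its first eight bytes (magic, minor version, major version), and the major version for a feature release N is 44 + N, so 65 for Java 21. The sketch below is not code from the PR and does not use ASM; `ClassFileVersionSketch` and its helpers are hypothetical names, and the "reader" is just a version comparison standing in for a bytecode library that only accepts class files up to some release.

```java
import java.nio.ByteBuffer;

final class ClassFileVersionSketch {
    /** Class file major version for a given Java feature release (e.g. 21 -> 65). */
    static int majorFor(int featureRelease) {
        return 44 + featureRelease;
    }

    /** Reads major_version from the start of a class file: u4 magic, u2 minor, u2 major. */
    static int majorVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                 // minor_version, ignored here
        return buf.getShort() & 0xFFFF; // major_version as an unsigned value
    }

    /** Builds only the 8-byte header of a class file, enough for this illustration. */
    static byte[] header(int major) {
        return ByteBuffer.allocate(8)
                .putInt(0xCAFEBABE)
                .putShort((short) 0)
                .putShort((short) major)
                .array();
    }

    /** Stand-in for a parser that rejects class files newer than it knows about. */
    static boolean readableBy(int readerMaxFeatureRelease, byte[] classBytes) {
        return majorVersion(classBytes) <= majorFor(readerMaxFeatureRelease);
    }

    public static void main(String[] args) {
        byte[] pinnedTo21 = header(majorFor(21)); // what -source 21 -target 21 would produce
        byte[] latest = header(majorFor(25));     // assumed output of a newer javac
        System.out.println("pinned, reader max 24: " + readableBy(24, pinnedTo21));
        System.out.println("latest, reader max 24: " + readableBy(24, latest));
    }
}
```

In this model, pinning the test class to release 21 keeps it readable by any parser that accepts release 21 or later, whereas a class compiled at the newest level is rejected by a parser capped one release lower, which is the situation the experiment with a lowered ClassReader limit is meant to expose.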
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25124#discussion_r2080260173 PR Review Comment: https://git.openjdk.org/jdk/pull/25124#discussion_r2080256577 From psandoz at openjdk.org Thu May 8 19:10:56 2025 From: psandoz at openjdk.org (Paul Sandoz) Date: Thu, 8 May 2025 19:10:56 GMT Subject: RFR: 8355481: Clean up MHN_copyOutBootstrapArguments [v2] In-Reply-To: <8SRA1LqI3JjG4b8A3MvLdlkmRwsXNm3D72pvHqt6vfg=.6f0fc391-061e-48ce-8ba3-921b25ff8710@github.com> References: <8SRA1LqI3JjG4b8A3MvLdlkmRwsXNm3D72pvHqt6vfg=.6f0fc391-061e-48ce-8ba3-921b25ff8710@github.com> Message-ID: On Thu, 8 May 2025 13:06:31 GMT, Coleen Phillimore wrote: > It looks like @PaulSandoz wrote this code and could help review your change. I wish -4 etc had some const names instead and both versions look the same to me. I was the one who committed this code, but I don't recall writing it; it was likely written by another contributor. This is arguably now dead code, likely written at the time for generality in the expectation that it might be used, but eventually it was not. I don't see any calls to this method from Java that pass in a negative start argument. This method supports accessing the prefix of known bootstrap arguments (bsm, name, type) + size information for the additional arguments, but these arguments are already available in Java and passed in via up call linkage (see [MethodHandleNatives.linkDynamicConstant](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/invoke/MethodHandleNatives.java#L308)).
------------- PR Comment: https://git.openjdk.org/jdk/pull/24825#issuecomment-2864023699 From prr at openjdk.org Thu May 8 20:02:52 2025 From: prr at openjdk.org (Phil Race) Date: Thu, 8 May 2025 20:02:52 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: Message-ID: On Thu, 8 May 2025 14:51:24 GMT, Leo Korinth wrote: > This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555). > > The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review. > > In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files). > > My approach has been to run all test, and afterwards updating those that fails due to a timeout factor. The amount of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor. 
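The arithmetic behind "quadruple the timeout" can be sketched as follows, assuming the effective timeout is simply the declared `timeout` value multiplied by the timeout factor. `TimeoutMath` and `effectiveSeconds` are made-up names for illustration, not jtreg or test library API.

```java
// Hypothetical helper, for illustration only; not jtreg or test library API.
class TimeoutMath {
    /** Effective timeout in seconds, assuming declared timeout times factor. */
    static long effectiveSeconds(long declaredSeconds, double timeoutFactor) {
        return Math.round(declaredSeconds * timeoutFactor);
    }

    public static void main(String[] args) {
        System.out.println(effectiveSeconds(300, 4.0));  // today: timeout=300, factor 4
        System.out.println(effectiveSeconds(1200, 1.0)); // after JDK-8260555: timeout=1200, factor 1
        System.out.println(effectiveSeconds(1200, 0.7)); // quadrupled timeout under the temporary 0.7 factor
        System.out.println(effectiveSeconds(300, 0.7));  // an unchanged timeout=300 under the 0.7 factor
    }
}
```

Under that assumption, a test that declares `timeout=300` today and `timeout=1200` after the factor drops to 1 gets the same 1200 seconds in both cases, while the 0.7 factor used for this trial run leaves less headroom than the final configuration will.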
> > These fixes have been created when I have plown through testcases: > JDK-8352719: Add an equals sign to the modules statement > JDK-8352709: Remove bad timing annotations from WhileOpTest.java > JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test > CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE > CODETOOLS-7903961: Make default timeout configurable > > Sometime in the future I will also fix: > 8260555: Change the default TIMEOUT_FACTOR from 4 to 1 > > for which I am awaiting: > CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1 > > *After the review I will revert the two first commits, and update the copyrights* test/jdk/java/awt/font/NumericShaper/MTTest.java - * @run main/timeout=300/othervm MTTest + * @run main/timeout=1200/othervm MTTest I'm puzzling over why you saw this test fail with timeout = 300 .. or perhaps you saw it fail with 0.7 ? Which would amount to 210 seconds .. that might just be enough to cause it to fail because if you look at the whole test you'll see it wants the core loops of the test to run for 180 seconds. https://openjdk.github.io/cr/?repo=jdk&pr=25122&range=00#new-144-test/jdk/java/awt/font/NumericShaper/MTTest.java So 300 was fine, and 1200 isn't needed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2864133534 From vlivanov at openjdk.org Thu May 8 21:22:52 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Thu, 8 May 2025 21:22:52 GMT Subject: RFR: 8354890: AOT-initialize j.l.i.MethodHandleImpl and inner classes [v2] In-Reply-To: References: Message-ID: <0oWJ3zW0ykhbon1An-cbaib2XDdv-qBJS23f0CZkxcg=.faf3d22c-c6a1-4dd3-802c-40fde0a07846@github.com> On Mon, 5 May 2025 23:50:31 GMT, Ioi Lam wrote: >> This is a general fix for all the "points to a static field that may hold a different value" failures related to `java/lang/invoke/MethodHandleImpl`. 
E.g., [JDK-8354840](https://bugs.openjdk.org/browse/JDK-8354840), [JDK-8353330](https://bugs.openjdk.org/browse/JDK-8353330). >> >> AOT-cached method handles quite often refer to the static fields in `MethodHandleImpl` or its inner classes. In the production run, if the value of these static field changes, we may have unexpected behavior related to identity of objects in these static fields. `CDSHeapVerifier` makes a very conservative check for such static fields, but sometimes gives false positives (as in the above two JBS issues) >> >> In this PR, we AOT-initialize `MethodHandleImpl` and its inner classes. This is a more authentic snapshot of the state of `java.lang.invoke` during the assembly phase. We also avoid the need to add and maintain entries in the `cdsHeapVerifier.cpp` table. >> >> I also added more code in `MethodHandleTest.java` to simulate potential usage patterns of `MethodHandle` by the Java core libraries. Hopefully this will reduce the likelihood for innocent core lib changes breaking the AOT assembly phase. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Comments from @liach and @ExE-Boss Looks good. ------------- Marked as reviewed by vlivanov (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24956#pullrequestreview-2826395240 From vlivanov at openjdk.org Thu May 8 21:51:52 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Thu, 8 May 2025 21:51:52 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v8] In-Reply-To: References: Message-ID: On Wed, 7 May 2025 11:51:04 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. 
Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). >> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Do microseconds for timings Frankly speaking, I'm on the fence about this particular change. The information is very useful, but whether it has to be unconditionally printed as part of `-XX:+PrintCompilation` output or not is an open question for me. It more than doubles the amount of data printed and then if it is primarily intended to be consumed by tools, then `tty` is not the best way to stream the data out of the JVM (e.g., `ttyLocker` is not enough to completely eliminate VM output interleaving). If you don't want to mess with `-XX:+LogCompilation`, unified logging provides a more convenient and configurable way to stream line-oriented textual data. 
------------- PR Review: https://git.openjdk.org/jdk/pull/24984#pullrequestreview-2826441190 From sspitsyn at openjdk.org Thu May 8 22:07:40 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 8 May 2025 22:07:40 GMT Subject: RFR: 8356251: Need minor cleanup for interp_only_mode [v3] In-Reply-To: References: Message-ID: > This is a minor cleanup for the JVMTI `interp_only_mode` implementation which includes the following changes: > - The `interp_only_mode` in `JavaThread` is represented with a counter which is incremented and decremented. This is confusing because this value should only take values `0` or `1`. Asserts are placed to make sure it is never going out of bounds. The `interp_only_mode` in a `JavaThread` is checked by the interpreter chunks which expect it to be an `integer`. This cleanup has no intention to make it a boolean. > - The function `JvmtiThreadState::process_pending_interp_only()` does a sync on the `JvmtiThreadState_lock` which is not really needed and is being removed. It is called in a `VTMS` transition and so, can not clash with the `SetEventNotificationMode` because it sets a `JvmtiVTMSTransitionDisabler`. 
> > Testing: > - TBD: Mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: added comment with clarification ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25060/files - new: https://git.openjdk.org/jdk/pull/25060/files/c4d167c4..b61ff511 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25060&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25060&range=01-02 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25060.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25060/head:pull/25060 PR: https://git.openjdk.org/jdk/pull/25060 From sspitsyn at openjdk.org Thu May 8 22:07:40 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 8 May 2025 22:07:40 GMT Subject: RFR: 8356251: Need minor cleanup for interp_only_mode [v2] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 18:24:01 GMT, Chris Plummer wrote: > Yes. Maybe with the declaration. Thanks. Added a comment. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25060#issuecomment-2864537210 From sviswanathan at openjdk.org Thu May 8 22:20:52 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 8 May 2025 22:20:52 GMT Subject: RFR: 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding [v2] In-Reply-To: References: Message-ID: On Tue, 6 May 2025 10:21:54 GMT, Jatin Bhateja wrote: >> This is a follow-up PR that fixes the crashes seen after the integration of PR #24664 >> >> ZGC bookkeeps multiple place holders in barrier code snippets through relocations, these are later used to patch appropriate contents (mostly immediate values) in instruction encoding to save costly comparisons against global state [1]. 
While most of the relocation records the patching offsets from the end of the instruction, SHL/R instructions used for pointer coloring/uncoloring, compute the patching offset from the starting address of the instruction. This was done to prevent accidental sharing of relocation information with subsequent relocatable instructions, e.g., static call. [2] >> >> In case the destination register operand of SHL/R instruction is an extended GPR register, we miss accounting additional REX2 prefix byte in the patch offset, thereby corrupting the encoding since runtime patches the primary opcode byte, resulting in an ILLEGAL instruction exception. >> >> This patch fixes reported failures by computing the relocation offset of the SHL/R instruction from the end of the instruction, thereby making the patch offset agnostic to the REX/REX2 prefix. To be safe, we emit a NOP instruction between the SHL/R and the subsequent relocatable instruction. >> >> Please review and share your feedback. >> >> Best Regards, >> Jatin >> >> [1] https://openjdk.org/jeps/439#:~:text=we%20reduce%20this,changes%20phase%3B >> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1873 >> >> >> PS: Validations were performed using the latest Intel Software Development Emulator after modifying the static register allocation order in x86_64.ad file giving preference to EGPRs. > > Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding Looks good to me as well. ------------- Marked as reviewed by sviswanathan (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24919#pullrequestreview-2826479403 From cjplummer at openjdk.org Thu May 8 22:59:52 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 8 May 2025 22:59:52 GMT Subject: RFR: 8356251: Need minor cleanup for interp_only_mode [v3] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 22:07:40 GMT, Serguei Spitsyn wrote: >> This is a minor cleanup for the JVMTI `interp_only_mode` implementation which includes the following changes: >> - The `interp_only_mode` in `JavaThread` is represented with a counter which is incremented and decremented. This is confusing because this value should only take values `0` or `1`. Asserts are placed to make sure it is never going out of bounds. The `interp_only_mode` in a `JavaThread` is checked by the interpreter chunks which expect it to be an `integer`. This cleanup has no intention to make it a boolean. >> - The function `JvmtiThreadState::process_pending_interp_only()` does a sync on the `JvmtiThreadState_lock` which is not really needed and is being removed. It is called in a `VTMS` transition and so, can not clash with the `SetEventNotificationMode` because it sets a `JvmtiVTMSTransitionDisabler`. >> >> Testing: >> - TBD: Mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: added comment with clarification Looks good. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25060#pullrequestreview-2826558437 From lmesnik at openjdk.org Thu May 8 23:12:52 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 8 May 2025 23:12:52 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: <2nBGcIjZC03ee74o34IXFgtoEVTAkQV-xXEC28_oFbI=.da57d5a4-4546-4566-aa79-cacce01562d7@github.com> Message-ID: On Thu, 8 May 2025 17:03:03 GMT, Leo Korinth wrote: >> doc/testing.md line 385: >> >>> 383: (`-timeoutFactor`). Also, some test cases that programmatically wait a >>> 384: certain amount of time will apply this factor. If we run in >>> 385: interpreted mode (`-Xcomp`), [RunTest.gmk](../make/RunTests.gmk) >> >> Maybe >> Suggestion: >> >> interpreted mode (`-Xint`), [RunTest.gmk](../make/RunTests.gmk) > > Thanks for catching this fault of mine. I will update the text and change `interpreted mode`, as it is really `-Xcomp` we are looking at in the RunTest.gmk. 
Yep, let's use Xcomp; Xint is not really supported, and a lot of tests might start failing. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25122#discussion_r2080606853 From liach at openjdk.org Thu May 8 23:20:06 2025 From: liach at openjdk.org (Chen Liang) Date: Thu, 8 May 2025 23:20:06 GMT Subject: RFR: 8356548: Avoid using ASM to parse latest class files in tests [v2] In-Reply-To: References: Message-ID: > For early eval; test by changing the ClassReader max accepted version of test ASM to 24 instead of 25 Chen Liang has updated the pull request incrementally with one additional commit since the last revision: Use classfile api instead of javac setting version ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25124/files - new: https://git.openjdk.org/jdk/pull/25124/files/b8fa795e..cb02b045 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25124&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25124&range=00-01 Stats: 324 lines in 10 files changed: 30 ins; 180 del; 114 mod Patch: https://git.openjdk.org/jdk/pull/25124.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25124/head:pull/25124 PR: https://git.openjdk.org/jdk/pull/25124 From liach at openjdk.org Thu May 8 23:46:35 2025 From: liach at openjdk.org (Chen Liang) Date: Thu, 8 May 2025 23:46:35 GMT Subject: RFR: 8356548: Avoid using ASM to parse latest class files in tests [v3] In-Reply-To: References: Message-ID: > For early eval; test by changing the ClassReader max accepted version of test ASM to 24 instead of 25 Chen Liang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains four additional commits since the last revision: - Merge branch 'master' of https://github.com/openjdk/jdk into fix/asm-test-upgrade - Use classfile api instead of javac setting version - Merge branch 'master' of https://github.com/openjdk/jdk into fix/asm-test-upgrade - 8356548: Avoid using ASM to parse latest class files in tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25124/files - new: https://git.openjdk.org/jdk/pull/25124/files/cb02b045..3114f523 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25124&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25124&range=01-02 Stats: 2043 lines in 169 files changed: 705 ins; 957 del; 381 mod Patch: https://git.openjdk.org/jdk/pull/25124.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25124/head:pull/25124 PR: https://git.openjdk.org/jdk/pull/25124 From lmesnik at openjdk.org Fri May 9 00:25:52 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 9 May 2025 00:25:52 GMT Subject: RFR: 8356251: Need minor cleanup for interp_only_mode [v3] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 22:07:40 GMT, Serguei Spitsyn wrote: >> This is a minor cleanup for the JVMTI `interp_only_mode` implementation which includes the following changes: >> - The `interp_only_mode` in `JavaThread` is represented with a counter which is incremented and decremented. This is confusing because this value should only take values `0` or `1`. Asserts are placed to make sure it is never going out of bounds. The `interp_only_mode` in a `JavaThread` is checked by the interpreter chunks which expect it to be an `integer`. This cleanup has no intention to make it a boolean. >> - The function `JvmtiThreadState::process_pending_interp_only()` does a sync on the `JvmtiThreadState_lock` which is not really needed and is being removed. 
It is called in a `VTMS` transition and so, can not clash with the `SetEventNotificationMode` because it sets a `JvmtiVTMSTransitionDisabler`. >> >> Testing: >> - TBD: Mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: added comment with clarification Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25060#pullrequestreview-2826655094 From lmesnik at openjdk.org Fri May 9 00:55:54 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 9 May 2025 00:55:54 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: Message-ID: <2oUdV1ca6y_cEHL3kTptk3jAlwCwnvsLGhRIAhVEUo8=.010c2226-c5ec-4508-be7f-90d244b2b7dc@github.com> On Thu, 8 May 2025 16:43:10 GMT, Leo Korinth wrote: >> This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555). >> >> The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review. >> >> In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files). >> >> My approach has been to run all test, and afterwards updating those that fails due to a timeout factor. 
The amount of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor. >> >> These fixes have been created when I have plown through testcases: >> JDK-8352719: Add an equals sign to the modules statement >> JDK-8352709: Remove bad timing annotations from WhileOpTest.java >> JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test >> CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE >> CODETOOLS-7903961: Make default timeout configurable >> >> Sometime in the future I will also fix: >> 8260555: Change the default TIMEOUT_FACTOR from 4 to 1 >> >> for which I am awaiting: >> CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1 >> >> *After the review I will revert the two first commits, and update the copyrights* > > After I revert the two first commits and go back to a timeout factor of 4, I will run tier 1-8 again. @lkorinth Can you please propose a fix for https://bugs.openjdk.org/browse/JDK-8260555 to make the complete goal of the fix clearer? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2864795990 From dlong at openjdk.org Fri May 9 01:01:03 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 9 May 2025 01:01:03 GMT Subject: RFR: 8336906: C2: assert(bb->is_reachable()) failed: getting result from unreachable basicblock In-Reply-To: References: Message-ID: On Thu, 8 May 2025 13:22:55 GMT, Manuel Hässig wrote: > # Issue Summary > > This PR addresses an `assert(bb->is_reachable())` that is triggered in the code for `-XX:+VerifyStack` after a deoptimization with reason `null_assert_or_unreached0` at a `getstatic` bytecode. Following the `getstatic` is an `areturn` and then an unreachable bytecode.
When the code for `VerifyStack` tries to compute an oop map for the basic block of the unreachable bytecode, the assert triggers:
>
>     getstatic Field A.val:"LB"; // if class B is not loaded, C2 deopts with reason "null_assert_or_unreached0"
>     areturn;
>     // The following is unreachable
>     iconst_0;
>
> This is a similar problem to [JDK-8271055](https://bugs.openjdk.org/browse/JDK-8271055) (#7331), but this particular deopt with reason `null_assert_or_unreached0` at `getstatic` of a field containing an object reference [deopts at the next bytecode](https://github.com/openjdk/jdk/blob/ad07426fab3396caefd7c08d924e085c1f6f61ba/src/hotspot/share/opto/parse3.cpp#L176-L199). The aforementioned issue introduced a check to skip stack verification of the next bytecode in the code if the execution after the deopted bytecode does not continue at the next bytecode in the code, i.e. falls through to the next bytecode. Unfortunately, this check did not include `areturn` as a bytecode that does not fall-through: > https://github.com/openjdk/jdk/blob/ad07426fab3396caefd7c08d924e085c1f6f61ba/src/hotspot/share/runtime/deoptimization.cpp#L845-L856 > > # Change Summary > > To fix the immediate issue described above, this PR adds `areturn` to the list of bytecodes that does not fall through. However, all return bytecodes exhibit the same behavior and might be susceptible to a similar issue. Even though I was not able to reproduce the same crash with `{d,f,i,l}return` because I could not get those or the preceding bytecode to deopt, I also added them to the `falls_through()` function. For the remaining bytecodes in `falls_through()` with the exception of `athrow` I wrote a regression test.
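The idea of a fall-through check can be sketched in Java as below. This is not HotSpot's `falls_through()` (which is C++ and whose exact list of bytecodes is not reproduced in this mail); the class name `FallsThrough` and the particular set of opcodes are assumptions for illustration. Only the opcode numbers come from the JVM specification.

```java
// Illustrative sketch only; the real check lives in deoptimization.cpp.
class FallsThrough {
    // Opcode values as defined by the JVM specification.
    static final int ICONST_0 = 0x03, GOTO = 0xa7, RET = 0xa9,
            TABLESWITCH = 0xaa, LOOKUPSWITCH = 0xab,
            IRETURN = 0xac, LRETURN = 0xad, FRETURN = 0xae, DRETURN = 0xaf,
            ARETURN = 0xb0, RETURN = 0xb1, GETSTATIC = 0xb2,
            ATHROW = 0xbf, GOTO_W = 0xc8;

    /** True if execution can continue at the bytecode that follows in the code array. */
    static boolean fallsThrough(int opcode) {
        switch (opcode) {
            case GOTO: case GOTO_W: case RET:         // unconditional jumps
            case TABLESWITCH: case LOOKUPSWITCH:      // always branch to a target
            case IRETURN: case LRETURN: case FRETURN: // returns leave the method
            case DRETURN: case ARETURN: case RETURN:
            case ATHROW:                              // throws leave the block
                return false;
            default:
                return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(fallsThrough(GETSTATIC)); // prints true
        System.out.println(fallsThrough(ARETURN));   // prints false
    }
}
```

In the reproducer, the bytecode after `getstatic` is `areturn`; once `areturn` is classified as not falling through, stack verification no longer needs to look at the unreachable `iconst_0` that follows it.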
> > # Testing > > - [x] [Github Actions](https://github.com/mhaessig/jdk/actions/runs/14595928439) > - [ ] tier1 through tier3 on Oracle supported platforms and OSs plus Oracle internal testing > > # Acknowledgements > Special thanks to @eme64 for his hard work on reducing a reproducer that works on all platforms. Making falls_through() handle all cases, including lookupswitch and tableswitch seems like the right fix. When I added it originally, I was not aware that C2 could set the bci to the next instruction instead of the current instruction. I think this means almost any instruction could be encountered at the "next next" bci. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25118#issuecomment-2864799977 From dholmes at openjdk.org Fri May 9 01:02:02 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 9 May 2025 01:02:02 GMT Subject: RFR: 8354969: Add strdup function for ResourceArea In-Reply-To: <7RWd7cVKqTbDkFyVdiLyHLFIUAwiSOMipKzGny-QRH8=.5c55c562-3742-4dd8-9131-73c5140cdf86@github.com> References: <7RWd7cVKqTbDkFyVdiLyHLFIUAwiSOMipKzGny-QRH8=.5c55c562-3742-4dd8-9131-73c5140cdf86@github.com> Message-ID: On Fri, 2 May 2025 09:56:32 GMT, Anton Artemov wrote: > Added a strdup() method, as requested by the bug reporter. The method is added to Arena, but also available in ResourceArea, as requested. A test for the method is provided. > > Testing: tiers 1-3 on multiple platforms. Do we have any immediate candidate uses for this? Whilst in the past Hotspot code had a number of utility API's, these days there is a tendency to delete any unused code. So this really needs to have an imminent usage (@iklam ?) . Code change itself looks fine. Thanks ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24998#pullrequestreview-2826685566 From iklam at openjdk.org Fri May 9 02:25:50 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 9 May 2025 02:25:50 GMT Subject: RFR: 8354969: Add strdup function for ResourceArea In-Reply-To: References: <7RWd7cVKqTbDkFyVdiLyHLFIUAwiSOMipKzGny-QRH8=.5c55c562-3742-4dd8-9131-73c5140cdf86@github.com> Message-ID: On Fri, 9 May 2025 00:58:46 GMT, David Holmes wrote: > Do we have any immediate candidate uses for this? Whilst in the past Hotspot code had a number of utility API's, these days there is a tendency to delete any unused code. So this really needs to have an imminent usage (@iklam ?) . > > Code change itself looks fine. Thanks I did this and found a few places, but there could be more:
    find . -name *pp | xargs grep -l strcpy | \
      xargs egrep -l '(RESOURCE.*char)|(resource_allocate_bytes)' | \
      xargs grep strcpy
https://github.com/openjdk/jdk/blob/9a0e6f338f34fb5da16d5f9eb710cdddd4302945/src/hotspot/share/prims/jvmtiEnvBase.cpp#L471-L472
https://github.com/openjdk/jdk/blob/9a0e6f338f34fb5da16d5f9eb710cdddd4302945/src/hotspot/share/classfile/classLoader.cpp#L1513-L1514
https://github.com/openjdk/jdk/blob/9a0e6f338f34fb5da16d5f9eb710cdddd4302945/src/hotspot/share/classfile/modules.cpp#L642-L643
------------- PR Comment: https://git.openjdk.org/jdk/pull/24998#issuecomment-2864904152 From iklam at openjdk.org Fri May 9 03:10:41 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 9 May 2025 03:10:41 GMT Subject: RFR: 8354890: AOT-initialize j.l.i.MethodHandleImpl and inner classes [v3] In-Reply-To: References: Message-ID: > This is a general fix for all the "points to a static field that may hold a different value" failures related to `java/lang/invoke/MethodHandleImpl`. E.g., [JDK-8354840](https://bugs.openjdk.org/browse/JDK-8354840), [JDK-8353330](https://bugs.openjdk.org/browse/JDK-8353330).
> > AOT-cached method handles quite often refer to the static fields in `MethodHandleImpl` or its inner classes. In the production run, if the value of these static field changes, we may have unexpected behavior related to identity of objects in these static fields. `CDSHeapVerifier` makes a very conservative check for such static fields, but sometimes gives false positives (as in the above two JBS issues) > > In this PR, we AOT-initialize `MethodHandleImpl` and its inner classes. This is a more authentic snapshot of the state of `java.lang.invoke` during the assembly phase. We also avoid the need to add and maintain entries in the `cdsHeapVerifier.cpp` table. > > I also added more code in `MethodHandleTest.java` to simulate potential usage patterns of `MethodHandle` by the Java core libraries. Hopefully this will reduce the likelihood for innocent core lib changes breaking the AOT assembly phase. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 20 commits: - Merge branch 'master' into 8354890-aot-init-methodhandleimpl-and-inner-classes - Comments from @liach and @ExE-Boss - Added more test case to increase coverage on possible core-lib usage patterns for MethodHandles - Merge branch 'master' into 8354890-aot-init-methodhandleimpl-and-inner-classes - 8354890: AOT-initialize j.l.i.MethodHandleImpl and inner classes - @fisk comment -- use proper HeapAccess to load referent; Also refactor AOTReferenceObjSupport::is_enabled() - Merge branch 'master' into 8354897-soft-weak-references-in-aot-cache - @fisk offline comments -- tighten up and simplify eligibility check; @DanHeidinga comment -- renamed to MethodType::assemblySetup() - @DanHeidinga comments - @fisk comment - ... 
and 10 more: https://git.openjdk.org/jdk/compare/9a0e6f33...0571ddc3 ------------- Changes: https://git.openjdk.org/jdk/pull/24956/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24956&range=02 Stats: 181 lines in 8 files changed: 169 ins; 5 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24956.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24956/head:pull/24956 PR: https://git.openjdk.org/jdk/pull/24956 From iklam at openjdk.org Fri May 9 03:59:08 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 9 May 2025 03:59:08 GMT Subject: RFR: 8356595 convert cds log to aot log Message-ID: This is an alternative (and opposite) approach to https://github.com/openjdk/jdk/pull/24895. We basically convert most `[cds]` logs to `[aot]` logs. However, for the few logs that might be needed by existing user scripts, we use macros like `aot_log_info`, `aot_log_debug` so that they can be selected/printed using the `[cds]` tag. We have a few hundred logs that start with `[cds]`. To aid reviewing, this PR will convert only part of them. I will create a second PR that converts the rest of the logs. Please see **aotLogging.hpp** for how the macros work. ------------- Commit messages: - @stefank suggestions - Merge remote-tracking branch '8355638-xlog-aot-as-alias-for-xlog-cds' into 8355638-xlog-aot-as-alias-for-xlog-cds-alt-impl - Removed checks for error message that got removed from the PR - Reverted unrelated changes in filemap.cpp - @vnkozlov and @dholmes-ora comments - Merge branch 'master' into 8355638-xlog-aot-as-alias-for-xlog-cds - cds+aot+load -> aot+load - Merge branch 'master' into 8355638-xlog-aot-as-alias-for-xlog-cds - @jdksjolen comment - Fixed comment - ...
and 10 more: https://git.openjdk.org/jdk/compare/52a5583d...b7670bf0 Changes: https://git.openjdk.org/jdk/pull/25136/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25136&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356595 Stats: 695 lines in 42 files changed: 320 ins; 13 del; 362 mod Patch: https://git.openjdk.org/jdk/pull/25136.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25136/head:pull/25136 PR: https://git.openjdk.org/jdk/pull/25136 From amitkumar at openjdk.org Fri May 9 04:17:41 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 9 May 2025 04:17:41 GMT Subject: RFR: 8350482: [s390x] Relativize esp in interpreter frames Message-ID: <7_QA4gQn-iCpc9fJs3Y8SHm5glB77l2V89_7wHFWPKg=.75323465-9473-419b-89e7-bcdf7ccf6f41@github.com> Relativize esp in interpreter frames. This PR depends on https://github.com/openjdk/jdk/pull/23660 and couple of merge conflicts I expect from https://github.com/openjdk/jdk/pull/23690 and https://github.com/openjdk/jdk/pull/23708. 
------------- Commit messages: - Z_R0 usage - cleanup - Merge branch 'master' into esp_rel - sign extension - esp relativised Changes: https://git.openjdk.org/jdk/pull/23724/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23724&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8350482 Stats: 32 lines in 5 files changed: 17 ins; 8 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/23724.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23724/head:pull/23724 PR: https://git.openjdk.org/jdk/pull/23724 From dholmes at openjdk.org Fri May 9 04:57:50 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 9 May 2025 04:57:50 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: Message-ID: On Thu, 8 May 2025 14:51:24 GMT, Leo Korinth wrote: > This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555). > > The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review. > > In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files). > > My approach has been to run all test, and afterwards updating those that fails due to a timeout factor. The amount of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). 
In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor. > > These fixes have been created when I have plown through testcases: > JDK-8352719: Add an equals sign to the modules statement > JDK-8352709: Remove bad timing annotations from WhileOpTest.java > JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test > CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE > CODETOOLS-7903961: Make default timeout configurable > > Sometime in the future I will also fix: > 8260555: Change the default TIMEOUT_FACTOR from 4 to 1 > > for which I am awaiting: > CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1 > > *After the review I will revert the two first commits, and update the copyrights* My biggest concern with this change is that potentially any test that implicitly uses the default timeout, and which passes with no problem with a factor of 4, may now need to have an explicit timeout set due to the factor of 1. I see this change in a number of tests (default timeout of 120s expressed as a new timeout of 480s to maintain equivalence**), but how many times did you run the tiers? I fear that the gatekeepers will be playing timeout whack-a-mole once these changes are put in. ** though another option would be to update the jtreg default timeout instead. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865099247 From dholmes at openjdk.org Fri May 9 05:14:00 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 9 May 2025 05:14:00 GMT Subject: RFR: 8356173: Remove ThreadCritical In-Reply-To: References: Message-ID: On Tue, 6 May 2025 20:32:05 GMT, Coleen Phillimore wrote: > Updated the description in the bug. This removes the last use of ThreadCritical and replaces it with a global PlatformMutex lock. > Tested with tier1-4, and tier1 on all Oracle-supported OSs. This seems reasonable but I have a few minor queries. 
It was surprising how many files included `threadCritical.hpp` for no apparent reason. Thanks src/hotspot/share/memory/arena.cpp line 47: > 45: void Arena::initialize_chunk_pool() { > 46: _global_chunk_pool_mutex = new PlatformMutex(); > 47: } Possibly a candidate for @jdksjolen 's `Deferred`? src/hotspot/share/memory/arena.cpp line 219: > 217: pool->return_to_pool(c); > 218: } else { > 219: // Free chunks under NMT lock so that NMT adjustment is stable. NMT lock??? src/hotspot/share/nmt/nmtUsage.cpp line 55: > 53: > 54: void NMTUsage::update_malloc_usage() { > 55: // Lock needed keep values in sync, total area size Suggestion: // Lock needed to keep values in sync, total area size src/hotspot/share/runtime/threads.cpp line 463: > 461: > 462: // Initialize memory pools > 463: Arena::initialize_chunk_pool(); What code first uses a `ChunkPool` or the `ChunkPoolLocker`? (It isn't obvious at what point the NMT code may execute during early initialization.) ------------- PR Review: https://git.openjdk.org/jdk/pull/25072#pullrequestreview-2827024851 PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2080935325 PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2080935989 PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2080937284 PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2080944116 From jbhateja at openjdk.org Fri May 9 05:31:57 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 9 May 2025 05:31:57 GMT Subject: Integrated: 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 12:28:55 GMT, Jatin Bhateja wrote: > This is a follow-up PR that fixes the crashes seen after the integration of PR #24664 > > ZGC bookkeeps multiple place holders in barrier code snippets through relocations, these are later used to patch appropriate contents (mostly immediate values) in instruction encoding 
to save costly comparisons against global state [1]. While most of the relocation records the patching offsets from the end of the instruction, SHL/R instructions used for pointer coloring/uncoloring, compute the patching offset from the starting address of the instruction. This was done to prevent accidental sharing of relocation information with subsequent relocatable instructions, e.g., static call. [2] > > In case the destination register operand of SHL/R instruction is an extended GPR register, we miss accounting additional REX2 prefix byte in the patch offset, thereby corrupting the encoding since runtime patches the primary opcode byte, resulting in an ILLEGAL instruction exception. > > This patch fixes reported failures by computing the relocation offset of the SHL/R instruction from the end of the instruction, thereby making the patch offset agnostic to the REX/REX2 prefix. To be safe, we emit a NOP instruction between the SHL/R and the subsequent relocatable instruction. > > Please review and share your feedback. > > Best Regards, > Jatin > > [1] https://openjdk.org/jeps/439#:~:text=we%20reduce%20this,changes%20phase%3B > [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1873 > > > PS: Validations were performed using the latest Intel Software Development Emulator after modifying the static register allocation order in x86_64.ad file giving preference to EGPRs. This pull request has now been integrated. 
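The offset problem described in the quoted text can be illustrated with a minimal C++ sketch. This is not the HotSpot code: it assumes a shift instruction whose immediate is its last byte, models the legacy REX prefix as one byte and the REX2 prefix as two, and uses illustrative byte values. The helper names `patch_from_start` and `patch_from_end` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Patch one byte at a fixed offset from the START of the instruction.
// The offset is only correct for the prefix length it was computed for.
inline void patch_from_start(std::vector<std::uint8_t>& insn,
                             std::size_t offset, std::uint8_t imm) {
  insn[offset] = imm;
}

// Patch one byte at a fixed distance from the END of the instruction.
// The immediate is the trailing byte, so the prefix length does not matter.
inline void patch_from_end(std::vector<std::uint8_t>& insn,
                           std::size_t back, std::uint8_t imm) {
  insn[insn.size() - back] = imm;
}

// Illustrative layouts (assumed, not verified encodings):
//   1-byte prefix: [prefix][opcode][modrm][imm8]           -> imm8 at start+3
//   2-byte prefix: [prefix][payload][opcode][modrm][imm8]  -> imm8 at start+4
inline std::vector<std::uint8_t> shift_with_one_byte_prefix() {
  return {0x48, 0xC1, 0xE0, 0x00};
}
inline std::vector<std::uint8_t> shift_with_two_byte_prefix() {
  return {0xD5, 0x18, 0xC1, 0xE0, 0x00};
}
```

In this model, `patch_from_start(insn, 3, v)` hits the immediate only for the one-byte-prefix form and overwrites a different byte in the two-byte-prefix form, while `patch_from_end(insn, 1, v)` hits the immediate in both.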
Changeset: 53ad4b2a Author: Jatin Bhateja URL: https://git.openjdk.org/jdk/commit/53ad4b2ad2664e5056c113543dfaa26647d6ce26 Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod 8355364: [REDO] Missing REX2 prefix accounting in ZGC barriers leads to incorrect encoding Co-authored-by: Axel Boldt-Christmas Reviewed-by: aboldtch, sviswanathan ------------- PR: https://git.openjdk.org/jdk/pull/24919 From dholmes at openjdk.org Fri May 9 05:45:50 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 9 May 2025 05:45:50 GMT Subject: RFR: 8356390: Rename ResolvedIndyEntry::set_flags to set_has_appendix In-Reply-To: References: Message-ID: On Wed, 7 May 2025 12:29:00 GMT, Johan Sjölen wrote: > The `set_flags` function really only sets whether it has an appendix or not, and there's a separate `set_resolution_failed` method just below that also alters the flag. Just rename this to `set_has_appendix` src/hotspot/share/oops/resolvedIndyEntry.hpp line 120: > 118: u1 old_flags = _flags & ~(1 << has_appendix_shift); > 119: // Preserve the unaffected bits > 120: _flags = old_flags | new_flags; I may be having a mental blank this late in the week, but why do we need to do anything other than: _flags |= new_flags; ? `new_flags` should at most have one bit set (the appendix bit) and OR'ing with zero preserves all other bits. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25092#discussion_r2080972225 From iklam at openjdk.org Fri May 9 06:03:02 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 9 May 2025 06:03:02 GMT Subject: RFR: 8354890: AOT-initialize j.l.i.MethodHandleImpl and inner classes [v3] In-Reply-To: References: Message-ID: On Mon, 5 May 2025 23:50:41 GMT, Chen Liang wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 20 commits: >> >> - Merge branch 'master' into 8354890-aot-init-methodhandleimpl-and-inner-classes >> - Comments from @liach and @ExE-Boss >> - Added more test case to increase coverage on possible core-lib usage patterns for MethodHandles >> - Merge branch 'master' into 8354890-aot-init-methodhandleimpl-and-inner-classes >> - 8354890: AOT-initialize j.l.i.MethodHandleImpl and inner classes >> - @fisk comment -- use proper HeapAccess to load referent; Also refactor AOTReferenceObjSupport::is_enabled() >> - Merge branch 'master' into 8354897-soft-weak-references-in-aot-cache >> - @fisk offline comments -- tighten up and simplify eligibility check; @DanHeidinga comment -- renamed to MethodType::assemblySetup() >> - @DanHeidinga comments >> - @fisk comment >> - ... and 10 more: https://git.openjdk.org/jdk/compare/9a0e6f33...0571ddc3 > > The Java code change and the BSM coverage looks good to me. Requiring another reviewer for hotspot changes. Thanks @liach @iwanowww for the review ------------- PR Comment: https://git.openjdk.org/jdk/pull/24956#issuecomment-2865231404 From iklam at openjdk.org Fri May 9 06:03:02 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 9 May 2025 06:03:02 GMT Subject: Integrated: 8354890: AOT-initialize j.l.i.MethodHandleImpl and inner classes In-Reply-To: References: Message-ID: On Tue, 29 Apr 2025 22:59:29 GMT, Ioi Lam wrote: > This is a general fix for all the "points to a static field that may hold a different value" failures related to `java/lang/invoke/MethodHandleImpl`. E.g., [JDK-8354840](https://bugs.openjdk.org/browse/JDK-8354840), [JDK-8353330](https://bugs.openjdk.org/browse/JDK-8353330). > > AOT-cached method handles quite often refer to the static fields in `MethodHandleImpl` or its inner classes. In the production run, if the value of these static field changes, we may have unexpected behavior related to identity of objects in these static fields. 
`CDSHeapVerifier` makes a very conservative check for such static fields, but sometimes gives false positives (as in the above two JBS issues). > > In this PR, we AOT-initialize `MethodHandleImpl` and its inner classes. This is a more authentic snapshot of the state of `java.lang.invoke` during the assembly phase. We also avoid the need to add and maintain entries in the `cdsHeapVerifier.cpp` table. > > I also added more code in `MethodHandleTest.java` to simulate potential usage patterns of `MethodHandle` by the Java core libraries. Hopefully this will reduce the likelihood of innocent core-lib changes breaking the AOT assembly phase. This pull request has now been integrated. Changeset: 591e71eb Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/591e71ebe501e6e88249f46efda4134277f29b08 Stats: 181 lines in 8 files changed: 169 ins; 5 del; 7 mod 8354890: AOT-initialize j.l.i.MethodHandleImpl and inner classes Reviewed-by: liach, vlivanov ------------- PR: https://git.openjdk.org/jdk/pull/24956 From iklam at openjdk.org Fri May 9 06:27:38 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 9 May 2025 06:27:38 GMT Subject: RFR: 8356595: Convert -Xlog:cds to -Xlog:aot [v2] In-Reply-To: References: Message-ID: <4aTAnrjgLBVGyOkzWub87lIMm8YxuA6HWAs9JTU2JJk=.967f32e8-0577-4817-b0c5-964f4ac79566@github.com> > This is an alternative (and opposite) approach to https://github.com/openjdk/jdk/pull/24895. We basically convert most `[cds]` logs to `[aot]` logs. However, for the few logs that might be needed by existing user scripts, we use macros like `aot_log_info`, `aot_log_debug` so that they can be selected/printed using the `[cds]` tag. > > We have a few hundred logs that start with `[cds]`. To aid reviewing, this PR will convert only part of them. I will create a second PR that converts the rest of the logs. > > Please see **aotLogging.hpp** for how the macros work.
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Fixed macos build ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25136/files - new: https://git.openjdk.org/jdk/pull/25136/files/b7670bf0..dbab9a7e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25136&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25136&range=00-01 Stats: 5 lines in 2 files changed: 1 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25136.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25136/head:pull/25136 PR: https://git.openjdk.org/jdk/pull/25136 From stuefe at openjdk.org Fri May 9 06:51:51 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 9 May 2025 06:51:51 GMT Subject: RFR: 8356173: Remove ThreadCritical In-Reply-To: References: Message-ID: On Tue, 6 May 2025 20:32:05 GMT, Coleen Phillimore wrote: > Updated the description in the bug. This removes the last use of ThreadCritical and replaces it with a global PlatformMutex lock. > Tested with tier1-4, and tier1 on all Oracle-supported OSs. I like that we use PlatformMutex, not Mutex; I think that is a very reasonable approach. We forego deadlock detection but gain independence wrt initialization order. Note that if you use PlatformMutex, there is no reason to dynamically allocate the mutex; just allocate as a global variable. There is a small danger associated with this proposal in that ThreadCritical allowed recursive entry, whereas PlatformMutex does not. On Posix it uses pthread_mutex_xxx, which on some platforms has the ability to work recursively; but on Windows we use critical sections, where it's not possible (we could use Windows Mutex instead). Why not add debug-only information to the Locker class to hold the owning thread id, and assert on entry that it is not my own?
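A minimal sketch of such an owner check, assuming `std::mutex` as a stand-in for `PlatformMutex`. The class names `OwnerCheckedMutex` and `ScopedLocker` are hypothetical and this is not HotSpot code; in a real build the owner bookkeeping itself could also be compiled only in debug builds.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Non-recursive mutex that remembers which thread currently holds it.
class OwnerCheckedMutex {
  std::mutex _mutex;
  // A default-constructed thread::id means "no owner".
  std::atomic<std::thread::id> _owner{std::thread::id()};

 public:
  bool owned_by_self() const {
    return _owner.load(std::memory_order_relaxed) == std::this_thread::get_id();
  }

  void lock() {
    // Catch recursive entry before it turns into a self-deadlock.
    // The assert is compiled out when NDEBUG is defined.
    assert(!owned_by_self() && "recursive entry into non-recursive lock");
    _mutex.lock();
    _owner.store(std::this_thread::get_id(), std::memory_order_relaxed);
  }

  void unlock() {
    assert(owned_by_self() && "unlock by a thread that is not the owner");
    _owner.store(std::thread::id(), std::memory_order_relaxed);
    _mutex.unlock();
  }
};

// Scoped locker in the style of a "Locker" class: lock on entry, unlock on exit.
class ScopedLocker {
  OwnerCheckedMutex& _m;

 public:
  explicit ScopedLocker(OwnerCheckedMutex& m) : _m(m) { _m.lock(); }
  ~ScopedLocker() { _m.unlock(); }
  ScopedLocker(const ScopedLocker&) = delete;
  ScopedLocker& operator=(const ScopedLocker&) = delete;
};
```

With this shape, a nested `ScopedLocker` on the same mutex from the same thread trips the assert in a debug build instead of hanging.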
But actually the usage of the Locker is rather safe; The only real concern with potential recursive entry is with guarding os::free, and with potential deadlocks should we crash inside a signal handler (error handling or AsyncGetCallTrace or that new "CPU Time Profiling for JFR" JEP). For these signal handler usages, I had this idea: https://bugs.openjdk.org/browse/JDK-8349578 - a double-buffering approach where we, upon entering the signal handler, open up a new ResourceArea for the current thread, and return to the original RA upon leaving signal handling. About guarding os::free, that is needed for a dreary issue (see https://bugs.openjdk.org/browse/JDK-8325890). In short, it is needed to get stable arena memory readings. However, I would love to just get rid of arena memory accounting altogether; the solution is very hackish and probably not needed anymore. The only heavy user of chunks causing repeated problems that caused me to use NMT at customers was the JIT. And I added the compilation memory statistic last year, so we now have a much, much better tool to investigate JIT arena usage. We also have an issue https://bugs.openjdk.org/browse/JDK-8333151, which investigates whether we can get rid of chunkpool altogether; see the ratio given in that issue description. This gives some possible performance benefits apart from simplification, but comes at a significant risk of backward compatibility problems since we are then much more subject to memory retention policy of the glibc. --- Bottom line, I like this PR and if possible would add some debug-only way to catch recursive entry. src/hotspot/share/nmt/nmtUsage.cpp line 58: > 56: // is deducted from mtChunk in the end to give correct values.
> 57: ChunkPoolLocker lock; > 58: const MallocMemorySnapshot* ms = MallocMemorySummary::as_snapshot(); I'd scope this to the as_snapshot call, the subsequent code does not need the locking (lesser chance of accidental recursive locking) ------------- PR Review: https://git.openjdk.org/jdk/pull/25072#pullrequestreview-2827109155 PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2080999219 From stuefe at openjdk.org Fri May 9 06:51:52 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 9 May 2025 06:51:52 GMT Subject: RFR: 8356173: Remove ThreadCritical In-Reply-To: References: Message-ID: On Fri, 9 May 2025 04:58:51 GMT, David Holmes wrote: >> Updated the description in the bug. This removes the last use of ThreadCritical and replaces it with a global PlatformMutex lock. >> Tested with tier1-4, and tier1 on all Oracle-supported OSs. > > src/hotspot/share/memory/arena.cpp line 47: > >> 45: void Arena::initialize_chunk_pool() { >> 46: _global_chunk_pool_mutex = new PlatformMutex(); >> 47: } > > Possibly a candidate for @jdksjolen 's `Deferred`? Why do we even need dynamic initialization here? I thought that is the tradeoff with PlatformMutex: you forego deadlock checks etc, but you gain safety wrt initialisation. That would also save some instructions. > src/hotspot/share/runtime/threads.cpp line 463: > >> 461: >> 462: // Initialize memory pools >> 463: Arena::initialize_chunk_pool(); > > What code first uses a `ChunkPool` or the `ChunkPoolLocker`? (It isn't obvious at what point the NMT code may execute during early initialization.) Any Arena that is created allocates an initial chunk. Arenas are created during (any) Thread creation, for compiler initialization and in universe::init, among others. So this is needed early. But as I wrote above, I don't think we need dynamic initialization at all? 
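The "global variable instead of dynamic allocation" idea can be sketched as follows. This uses `std::mutex` as a stand-in; whether `PlatformMutex` can be initialized the same way on every platform is not stated in this thread. The names `g_pool_mutex`, `g_pool_size` and `add_chunk` are hypothetical.

```cpp
#include <cstddef>
#include <mutex>

namespace sketch {

// std::mutex has a constexpr default constructor, so this global is
// constant-initialized: it is usable before any dynamic initializer runs
// and needs no explicit "initialize" call during startup.
std::mutex g_pool_mutex;

// Toy shared state guarded by the global mutex.
std::size_t g_pool_size = 0;

// Adds one chunk to the toy pool and returns the new pool size.
std::size_t add_chunk() {
  std::lock_guard<std::mutex> guard(g_pool_mutex);
  return ++g_pool_size;
}

}  // namespace sketch
```

The trade-off is the one named above: no separate initialization step to order against early users, at the cost of whatever checks a richer mutex type would provide.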
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2081004650 PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2080987856 From dfuchs at openjdk.org Fri May 9 07:18:52 2025 From: dfuchs at openjdk.org (Daniel Fuchs) Date: Fri, 9 May 2025 07:18:52 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: Message-ID: On Thu, 8 May 2025 17:41:02 GMT, Daniel Fuchs wrote: > Thank you. I have imported your PR locally and running some HTTP client tests in the CI. > Tests have not finished running - but I already see one intermittent failure: > `java/net/httpclient/RedirectTimeoutTest.java` is timing out intermittently on windows. > It would be good to flush out any such intermittent failures before this PR is integrated. > This might require multiple runs before we can get confidence. Results came back - another intermittent timeout failure (much more frequent) observed in: `java/net/httpclient/CancelledResponse.java` on macOS x64 ------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865413003 From cstein at openjdk.org Fri May 9 07:36:51 2025 From: cstein at openjdk.org (Christian Stein) Date: Fri, 9 May 2025 07:36:51 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: Message-ID: On Fri, 9 May 2025 04:54:52 GMT, David Holmes wrote: > [...] > ** though another option would be to update the jtreg default timeout instead. And affect all other tests, too? I'd rather let the default stay on the former hard-coded 120s value. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865469908 From rvansa at openjdk.org Fri May 9 07:38:54 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Fri, 9 May 2025 07:38:54 GMT Subject: RFR: 8352075: Perf regression accessing fields [v4] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> Message-ID: On Wed, 7 May 2025 19:18:46 GMT, Coleen Phillimore wrote: > Compressing the fields into unsigned5 and decoding them into streams was quite a complicated change but manageable because the interface to decode them is all one has write the FieldStream iterator. This is hard to review. Not sure if I got you right, but I agree that the interface should not be changed (significantly). Here I am adding `skip_fields_until` method for one place; to me it was a bit surprising that there was not focus on (sub-linear) lookup from the beginning. > I'm wondering how much of a problem this is in real code, other than the case with 21k fields and if there's a way to programmatically work around this case, like decompress the fields into a hashtable or something (?) It would be interesting to see some histograms of some corpus Java code (maybe put this info in the associated bug). The customer code that hit the regression looked to me as something generated, probably already working around some sizing limits, and the problem was in an initialization routine setting up thousands of descriptors. I am not saying that it could not be reworked for the better, but for someone this is an order of magnitude regression and I understand that they demand fix on JDK side. What kind of histogram would you imagine? I could count the number of fields in those 10 classes... > Fred tells me that we already store the original field index so maybe above is moot. Could you be more specific, please? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2865480638 From xgong at openjdk.org Fri May 9 07:44:27 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Fri, 9 May 2025 07:44:27 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API Message-ID: JDK-8318650 introduced hotspot intrinsification of subword gather load APIs for X86 platforms [1]. However, the current implementation is not optimal for AArch64 SVE platform, which natively supports vector instructions for subword gather load operations using an int vector for indices (see [2][3]). Two key areas require improvement: 1. At the Java level, vector indices generated for range validation could be reused for the subsequent gather load operation on architectures with native vector instructions like AArch64 SVE. However, the current implementation prevents compiler reuse of these index vectors due to divergent control flow, potentially impacting performance. 2. At the compiler IR level, the additional `offset` input for `LoadVectorGather`/`LoadVectorGatherMasked` with subword types increases IR complexity and complicates backend implementation. Furthermore, generating `add` instructions before each memory access negatively impacts performance. This patch refactors the implementation at both the Java level and compiler mid-end to improve efficiency and maintainability across different architectures. Main changes: 1. Java-side API refactoring: - Explicitly passes generated index vectors to hotspot, eliminating duplicate index vectors for gather load instructions on architectures like AArch64. 2. C2 compiler IR refactoring: - Refactors `LoadVectorGather`/`LoadVectorGatherMasked` IR for subword types by removing the memory offset input and incorporating it into the memory base `addr` at the IR level. This simplifies backend implementation, reduces add operations, and unifies the IR across all types. 3. 
Backend changes: - Streamlines X86 implementation of subword gather operations following the removal of the offset input from the IR level. Performance: The performance of the relative JMH improves up to 27% on a X86 AVX512 system. Please see the data below: Benchmark Mode Cnt Unit SIZE Before After Gain GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 64 53682.012 52650.325 0.98 GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 256 14484.252 14255.156 0.98 GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 1024 3664.900 3595.615 0.98 GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 4096 908.312 935.269 1.02 GatherOperationsBenchmark.microByteGather128_MASK thrpt 30 ops/ms 64 43040.148 44605.580 1.03 GatherOperationsBenchmark.microByteGather128_MASK thrpt 30 ops/ms 256 12445.650 12928.102 1.03 GatherOperationsBenchmark.microByteGather128_MASK thrpt 30 ops/ms 1024 3143.728 3294.173 1.04 GatherOperationsBenchmark.microByteGather128_MASK thrpt 30 ops/ms 4096 801.516 842.951 1.05 GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF thrpt 30 ops/ms 64 40379.343 45255.490 1.12 GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF thrpt 30 ops/ms 256 11103.537 12971.581 1.16 GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF thrpt 30 ops/ms 1024 2767.870 3299.453 1.19 GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF thrpt 30 ops/ms 4096 704.610 840.908 1.19 GatherOperationsBenchmark.microByteGather128_NZ_OFF thrpt 30 ops/ms 64 49066.340 53365.591 1.08 GatherOperationsBenchmark.microByteGather128_NZ_OFF thrpt 30 ops/ms 256 14063.326 14286.067 1.01 GatherOperationsBenchmark.microByteGather128_NZ_OFF thrpt 30 ops/ms 1024 3617.992 3621.272 1.00 GatherOperationsBenchmark.microByteGather128_NZ_OFF thrpt 30 ops/ms 4096 861.026 938.055 1.08 GatherOperationsBenchmark.microByteGather256 thrpt 30 ops/ms 64 55844.814 48311.847 0.86 GatherOperationsBenchmark.microByteGather256 thrpt 30 ops/ms 256 15139.459 13009.848 
0.85 GatherOperationsBenchmark.microByteGather256 thrpt 30 ops/ms 1024 3861.834 3284.944 0.85 GatherOperationsBenchmark.microByteGather256 thrpt 30 ops/ms 4096 938.665 817.673 0.87 GatherOperationsBenchmark.microByteGather256_MASK thrpt 30 ops/ms 64 43942.924 43144.065 0.98 GatherOperationsBenchmark.microByteGather256_MASK thrpt 30 ops/ms 256 12461.170 11580.981 0.92 GatherOperationsBenchmark.microByteGather256_MASK thrpt 30 ops/ms 1024 3168.598 2945.698 0.92 GatherOperationsBenchmark.microByteGather256_MASK thrpt 30 ops/ms 4096 803.515 738.049 0.91 GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF thrpt 30 ops/ms 64 42197.440 43209.913 1.02 GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF thrpt 30 ops/ms 256 11456.761 11713.265 1.02 GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF thrpt 30 ops/ms 1024 2732.576 2949.724 1.07 GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF thrpt 30 ops/ms 4096 726.062 744.774 1.02 GatherOperationsBenchmark.microByteGather256_NZ_OFF thrpt 30 ops/ms 64 52915.781 49520.027 0.93 GatherOperationsBenchmark.microByteGather256_NZ_OFF thrpt 30 ops/ms 256 14481.921 13496.835 0.93 GatherOperationsBenchmark.microByteGather256_NZ_OFF thrpt 30 ops/ms 1024 3632.065 3362.372 0.92 GatherOperationsBenchmark.microByteGather256_NZ_OFF thrpt 30 ops/ms 4096 892.825 845.809 0.94 GatherOperationsBenchmark.microByteGather512 thrpt 30 ops/ms 64 54528.404 54478.751 0.99 GatherOperationsBenchmark.microByteGather512 thrpt 30 ops/ms 256 15018.181 14673.727 0.97 GatherOperationsBenchmark.microByteGather512 thrpt 30 ops/ms 1024 3824.690 3589.530 0.93 GatherOperationsBenchmark.microByteGather512 thrpt 30 ops/ms 4096 923.601 906.245 0.98 GatherOperationsBenchmark.microByteGather512_MASK thrpt 30 ops/ms 64 41248.192 42201.455 1.02 GatherOperationsBenchmark.microByteGather512_MASK thrpt 30 ops/ms 256 11481.408 11559.655 1.00 GatherOperationsBenchmark.microByteGather512_MASK thrpt 30 ops/ms 1024 2901.592 2912.954 1.00 
GatherOperationsBenchmark.microByteGather512_MASK thrpt 30 ops/ms 4096 732.899 730.381 0.99 GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF thrpt 30 ops/ms 64 42287.123 43779.227 1.03 GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF thrpt 30 ops/ms 256 11486.167 11448.966 0.99 GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF thrpt 30 ops/ms 1024 2888.047 2928.612 1.01 GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF thrpt 30 ops/ms 4096 731.056 738.300 1.00 GatherOperationsBenchmark.microByteGather512_NZ_OFF thrpt 30 ops/ms 64 51777.670 54368.797 1.05 GatherOperationsBenchmark.microByteGather512_NZ_OFF thrpt 30 ops/ms 256 14558.532 14662.164 1.00 GatherOperationsBenchmark.microByteGather512_NZ_OFF thrpt 30 ops/ms 1024 3726.910 3714.448 0.99 GatherOperationsBenchmark.microByteGather512_NZ_OFF thrpt 30 ops/ms 4096 907.863 903.544 0.99 GatherOperationsBenchmark.microByteGather64 thrpt 30 ops/ms 64 52980.507 54970.689 1.03 GatherOperationsBenchmark.microByteGather64 thrpt 30 ops/ms 256 15044.443 15828.237 1.05 GatherOperationsBenchmark.microByteGather64 thrpt 30 ops/ms 1024 3869.028 4098.172 1.05 GatherOperationsBenchmark.microByteGather64 thrpt 30 ops/ms 4096 912.372 1002.065 1.09 GatherOperationsBenchmark.microByteGather64_MASK thrpt 30 ops/ms 64 44267.641 45864.381 1.03 GatherOperationsBenchmark.microByteGather64_MASK thrpt 30 ops/ms 256 12303.206 12920.113 1.05 GatherOperationsBenchmark.microByteGather64_MASK thrpt 30 ops/ms 1024 3100.867 3115.636 1.00 GatherOperationsBenchmark.microByteGather64_MASK thrpt 30 ops/ms 4096 792.004 832.623 1.05 GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF thrpt 30 ops/ms 64 40417.638 45844.634 1.13 GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF thrpt 30 ops/ms 256 11628.508 12913.170 1.11 GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF thrpt 30 ops/ms 1024 2911.508 3260.388 1.11 GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF thrpt 30 ops/ms 4096 709.017 835.084 1.17 
GatherOperationsBenchmark.microByteGather64_NZ_OFF thrpt 30 ops/ms 64 48868.987 53585.210 1.09 GatherOperationsBenchmark.microByteGather64_NZ_OFF thrpt 30 ops/ms 256 13617.963 15754.029 1.15 GatherOperationsBenchmark.microByteGather64_NZ_OFF thrpt 30 ops/ms 1024 3504.745 3857.926 1.10 GatherOperationsBenchmark.microByteGather64_NZ_OFF thrpt 30 ops/ms 4096 818.439 958.751 1.17 GatherOperationsBenchmark.microShortGather128 thrpt 30 ops/ms 64 41351.719 44337.947 1.07 GatherOperationsBenchmark.microShortGather128 thrpt 30 ops/ms 256 11175.501 12302.557 1.10 GatherOperationsBenchmark.microShortGather128 thrpt 30 ops/ms 1024 2854.546 3158.973 1.10 GatherOperationsBenchmark.microShortGather128 thrpt 30 ops/ms 4096 744.816 790.304 1.06 GatherOperationsBenchmark.microShortGather128_MASK thrpt 30 ops/ms 64 35012.934 35728.068 1.02 GatherOperationsBenchmark.microShortGather128_MASK thrpt 30 ops/ms 256 9408.162 9854.849 1.04 GatherOperationsBenchmark.microShortGather128_MASK thrpt 30 ops/ms 1024 2352.723 2489.161 1.05 GatherOperationsBenchmark.microShortGather128_MASK thrpt 30 ops/ms 4096 595.827 634.225 1.06 GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF thrpt 30 ops/ms 64 31405.646 35728.077 1.13 GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF thrpt 30 ops/ms 256 8459.702 9865.482 1.16 GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF thrpt 30 ops/ms 1024 2095.461 2489.927 1.18 GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF thrpt 30 ops/ms 4096 535.715 631.614 1.17 GatherOperationsBenchmark.microShortGather128_NZ_OFF thrpt 30 ops/ms 64 39996.604 43811.259 1.09 GatherOperationsBenchmark.microShortGather128_NZ_OFF thrpt 30 ops/ms 256 11058.636 12261.463 1.10 GatherOperationsBenchmark.microShortGather128_NZ_OFF thrpt 30 ops/ms 1024 2847.482 3157.450 1.10 GatherOperationsBenchmark.microShortGather128_NZ_OFF thrpt 30 ops/ms 4096 712.089 790.143 1.10 GatherOperationsBenchmark.microShortGather256 thrpt 30 ops/ms 64 51893.730 51975.295 1.00 
GatherOperationsBenchmark.microShortGather256 thrpt 30 ops/ms 256 14226.104 14720.390 1.03
GatherOperationsBenchmark.microShortGather256 thrpt 30 ops/ms 1024 3491.958 3714.266 1.06
GatherOperationsBenchmark.microShortGather256 thrpt 30 ops/ms 4096 852.278 905.330 1.06
GatherOperationsBenchmark.microShortGather256_MASK thrpt 30 ops/ms 64 38736.351 41797.516 1.07
GatherOperationsBenchmark.microShortGather256_MASK thrpt 30 ops/ms 256 10250.508 11790.235 1.15
GatherOperationsBenchmark.microShortGather256_MASK thrpt 30 ops/ms 1024 2558.449 2956.936 1.15
GatherOperationsBenchmark.microShortGather256_MASK thrpt 30 ops/ms 4096 648.882 745.885 1.14
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF thrpt 30 ops/ms 64 38315.594 39547.847 1.03
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF thrpt 30 ops/ms 256 10471.955 11779.499 1.12
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF thrpt 30 ops/ms 1024 2618.623 2679.970 1.02
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF thrpt 30 ops/ms 4096 655.803 760.392 1.15
GatherOperationsBenchmark.microShortGather256_NZ_OFF thrpt 30 ops/ms 64 47674.080 51325.185 1.07
GatherOperationsBenchmark.microShortGather256_NZ_OFF thrpt 30 ops/ms 256 13446.700 14438.516 1.07
GatherOperationsBenchmark.microShortGather256_NZ_OFF thrpt 30 ops/ms 1024 3371.433 3664.720 1.08
GatherOperationsBenchmark.microShortGather256_NZ_OFF thrpt 30 ops/ms 4096 814.540 895.182 1.09
GatherOperationsBenchmark.microShortGather512 thrpt 30 ops/ms 64 48183.553 48374.790 1.01
GatherOperationsBenchmark.microShortGather512 thrpt 30 ops/ms 256 13669.806 12940.433 0.94
GatherOperationsBenchmark.microShortGather512 thrpt 30 ops/ms 1024 3371.708 3318.627 0.98
GatherOperationsBenchmark.microShortGather512 thrpt 30 ops/ms 4096 847.620 805.313 0.95
GatherOperationsBenchmark.microShortGather512_MASK thrpt 30 ops/ms 64 39566.443 42845.296 1.08
GatherOperationsBenchmark.microShortGather512_MASK thrpt 30 ops/ms 256 11926.440 10308.223 0.86
GatherOperationsBenchmark.microShortGather512_MASK thrpt 30 ops/ms 1024 3008.542 2546.197 0.84
GatherOperationsBenchmark.microShortGather512_MASK thrpt 30 ops/ms 4096 764.497 647.276 0.84
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF thrpt 30 ops/ms 64 38106.800 42835.120 1.12
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF thrpt 30 ops/ms 256 10405.171 11125.164 1.06
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF thrpt 30 ops/ms 1024 2526.827 2799.209 1.10
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF thrpt 30 ops/ms 4096 655.044 715.519 1.09
GatherOperationsBenchmark.microShortGather512_NZ_OFF thrpt 30 ops/ms 64 48108.682 46654.427 0.96
GatherOperationsBenchmark.microShortGather512_NZ_OFF thrpt 30 ops/ms 256 13197.197 12957.497 0.98
GatherOperationsBenchmark.microShortGather512_NZ_OFF thrpt 30 ops/ms 1024 3397.959 3244.415 0.95
GatherOperationsBenchmark.microShortGather512_NZ_OFF thrpt 30 ops/ms 4096 824.034 820.536 0.99
GatherOperationsBenchmark.microShortGather64 thrpt 30 ops/ms 64 44815.622 46913.289 1.04
GatherOperationsBenchmark.microShortGather64 thrpt 30 ops/ms 256 12317.166 13536.731 1.09
GatherOperationsBenchmark.microShortGather64 thrpt 30 ops/ms 1024 3157.683 3539.991 1.12
GatherOperationsBenchmark.microShortGather64 thrpt 30 ops/ms 4096 775.626 878.304 1.13
GatherOperationsBenchmark.microShortGather64_MASK thrpt 30 ops/ms 64 37064.157 35649.776 0.96
GatherOperationsBenchmark.microShortGather64_MASK thrpt 30 ops/ms 256 10120.291 9403.1319 0.92
GatherOperationsBenchmark.microShortGather64_MASK thrpt 30 ops/ms 1024 2546.723 2642.781 1.03
GatherOperationsBenchmark.microShortGather64_MASK thrpt 30 ops/ms 4096 644.270 648.432 1.00
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF thrpt 30 ops/ms 64 34386.819 37883.550 1.10
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF thrpt 30 ops/ms 256 9316.097 10500.473 1.12
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF thrpt 30 ops/ms 1024 2344.570 2643.114 1.12
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF thrpt 30 ops/ms 4096 594.445 595.301 1.00
GatherOperationsBenchmark.microShortGather64_NZ_OFF thrpt 30 ops/ms 64 40240.772 48435.477 1.20
GatherOperationsBenchmark.microShortGather64_NZ_OFF thrpt 30 ops/ms 256 11082.392 13736.985 1.23
GatherOperationsBenchmark.microShortGather64_NZ_OFF thrpt 30 ops/ms 1024 2777.065 3549.704 1.27
GatherOperationsBenchmark.microShortGather64_NZ_OFF thrpt 30 ops/ms 4096 697.671 877.411 1.25

Note that this patch is split from https://github.com/openjdk/jdk/pull/24679. A follow-up PR will implement the SVE subword gather load operations after this PR is merged.

[1] https://bugs.openjdk.org/browse/JDK-8318650
[2] https://developer.arm.com/documentation/ddi0602/2024-12/SVE-Instructions/LD1B--scalar-plus-vector---Gather-load-unsigned-bytes-to-vector--vector-index--?lang=en
[3] https://developer.arm.com/documentation/ddi0602/2024-12/SVE-Instructions/LD1H--scalar-plus-vector---Gather-load-unsigned-halfwords-to-vector--vector-index--?lang=en

-------------

Commit messages:
 - 8355563: VectorAPI: Refactor current implementation of subword gather load API

Changes: https://git.openjdk.org/jdk/pull/25138/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25138&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8355563
  Stats: 441 lines in 15 files changed: 105 ins; 176 del; 160 mod
  Patch: https://git.openjdk.org/jdk/pull/25138.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25138/head:pull/25138

PR: https://git.openjdk.org/jdk/pull/25138

From xgong at openjdk.org Fri May 9 07:44:27 2025
From: xgong at openjdk.org (Xiaohong Gong)
Date: Fri, 9 May 2025 07:44:27 GMT
Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API
In-Reply-To:
References:
Message-ID:

On Fri, 9 May 2025 07:35:41 GMT, Xiaohong Gong wrote:

> JDK-8318650 introduced hotspot intrinsification of subword gather load APIs for X86 platforms [1].
However, the current implementation is not optimal for the AArch64 SVE platform, which natively supports vector instructions for subword gather load operations using an int vector for indices (see [2][3]).
>
> Two key areas require improvement:
> 1. At the Java level, vector indices generated for range validation could be reused for the subsequent gather load operation on architectures with native vector instructions like AArch64 SVE. However, the current implementation prevents compiler reuse of these index vectors due to divergent control flow, potentially impacting performance.
> 2. At the compiler IR level, the additional `offset` input for `LoadVectorGather`/`LoadVectorGatherMasked` with subword types increases IR complexity and complicates backend implementation. Furthermore, generating `add` instructions before each memory access negatively impacts performance.
>
> This patch refactors the implementation at both the Java level and compiler mid-end to improve efficiency and maintainability across different architectures.
>
> Main changes:
> 1. Java-side API refactoring:
>    - Explicitly passes generated index vectors to hotspot, eliminating duplicate index vectors for gather load instructions on architectures like AArch64.
> 2. C2 compiler IR refactoring:
>    - Refactors `LoadVectorGather`/`LoadVectorGatherMasked` IR for subword types by removing the memory offset input and incorporating it into the memory base `addr` at the IR level. This simplifies backend implementation, reduces add operations, and unifies the IR across all types.
> 3. Backend changes:
>    - Streamlines X86 implementation of subword gather operations following the removal of the offset input from the IR level.
>
> Performance:
> The performance of the relevant JMH benchmarks improves by up to 27% on an X86 AVX512 system.
Please see the data below:
>
> Benchmark Mode Cnt Unit SIZE Before After Gain
> GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 64 53682.012 52650.325 0.98
> GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 256 14484.252 14255.156 0.98
> GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 1024 3664.900 3595.615 0.98
> GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 4096 908.312 935.269 1.02
> GatherOperationsBenchmark.micr...

Hi @eme64 , could you please help take a look at this PR, which is a part of https://github.com/openjdk/jdk/pull/24679 ? Thanks a lot in advance!

Hi @jatin-bhateja , could you please kindly review this PR, especially the X86 codegen part? Thanks a lot in advance!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-2865493287
PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-2865495716

From jsjolen at openjdk.org Fri May 9 07:46:52 2025
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 9 May 2025 07:46:52 GMT
Subject: RFR: 8355481: Clean up MHN_copyOutBootstrapArguments [v2]
In-Reply-To:
References: <8SRA1LqI3JjG4b8A3MvLdlkmRwsXNm3D72pvHqt6vfg=.6f0fc391-061e-48ce-8ba3-921b25ff8710@github.com>
Message-ID:

On Thu, 8 May 2025 19:08:26 GMT, Paul Sandoz wrote:

> > It looks like @PaulSandoz wrote this code and could help review your change. I wish -4 etc had some const names instead and both versions look the same to me.
>
> I was the one who committed this code, but I don't recall writing it, and it was likely written by another contributor.
>
> This is arguably now dead code, likely written at the time for generality in the expectation it might be used but eventually was not. I don't see any calls to this method from Java that passes in a negative start argument.
This method supports accessing the prefix of known bootstrap arguments (bsm, name, type) + size information for the additional arguments, but these arguments are already available in Java and passed in via up call linkage (see [MethodHandleNatives.linkDynamicConstant](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/invoke/MethodHandleNatives.java#L308)).

Do you think the idea was that the ordinary arguments started at 0, and the static arguments referenced in the CP were accessed by going into negative indices?

Let's see if this is dead code: I'll get rid of it and see what happens in the CI. When I read the Java code, it wasn't entirely obvious that we never pass in negative args.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24825#issuecomment-2865504838

From alanb at openjdk.org Fri May 9 08:12:52 2025
From: alanb at openjdk.org (Alan Bateman)
Date: Fri, 9 May 2025 08:12:52 GMT
Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor
In-Reply-To:
References:
Message-ID:

On Thu, 8 May 2025 16:43:10 GMT, Leo Korinth wrote:

>> This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555).
>>
>> The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review.
>>
>> In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files).
>>
>> My approach has been to run all tests, and afterwards update those that fail due to a timeout factor. The number of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor.
>>
>> These fixes have been created when I have plowed through testcases:
>> JDK-8352719: Add an equals sign to the modules statement
>> JDK-8352709: Remove bad timing annotations from WhileOpTest.java
>> JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test
>> CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE
>> CODETOOLS-7903961: Make default timeout configurable
>>
>> Sometime in the future I will also fix:
>> 8260555: Change the default TIMEOUT_FACTOR from 4 to 1
>>
>> for which I am awaiting:
>> CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1
>>
>> *After the review I will revert the two first commits, and update the copyrights*
>
> After I revert the two first commits and go back to a timeout factor of 4, I will run tier 1-8 again.

@lkorinth Moving to a TIMEOUT_FACTOR of 1 seems a good goal. Would it be possible to expand a bit on what repeat testing was done to identify the tests to add /timeout ? If I read it correctly, any tests using /timeout=N have been bumped to 4*N so no change. Most tests don't use /timeout so I assume many runs were done to identify the tests that would time out if there was no scaling. Test machines vary, as do the test execution times when running concurrently with other tests, so I think it would help if you could say a bit more, even to confirm that it was many test runs.
-------------

PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865581927

From jsjolen at openjdk.org Fri May 9 08:21:56 2025
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 9 May 2025 08:21:56 GMT
Subject: RFR: 8356390: Rename ResolvedIndyEntry::set_flags to set_has_appendix
In-Reply-To:
References:
Message-ID:

On Fri, 9 May 2025 05:43:19 GMT, David Holmes wrote:

>> The `set_flags` function really only sets whether it has an appendix or not, and there's a separate `set_resolution_failed` method just below that also alters the flag. Just rename this to `set_has_appendix`
>
> src/hotspot/share/oops/resolvedIndyEntry.hpp line 120:
>
>> 118: u1 old_flags = _flags & ~(1 << has_appendix_shift);
>> 119: // Preserve the unaffected bits
>> 120: _flags = old_flags | new_flags;
>
> I may be having a mental blank this late in the week, but why do we need to do anything other than:
>
> _flags |= new_flags;
>
> ? `new_flags` should at most have one bit set (the appendix bit) and OR'ing with zero preserves all other bits.

Yeah, what you're saying sounds right to me. Let's compare the new, old and your code, together.

u1 new_flags = (has_appendix << has_appendix_shift);
<=>
u1 new_flags = 0b00000010;

u1 old_flags = _flags & ~(1 << has_appendix_shift)
<=>
u1 old_flags = _flags & ~(0b00000010)
<=>
u1 old_flags = _flags & 0b11111101
<=>
u1 old_flags = 0bXXXXXX0X

X = unknown value, whatever was in _flags at that position before

u1 _flags = old_flags | new_flags;
<=>
u1 _flags = 0bXXXXXX0X | 0b00000010;
<=>
u1 _flags = 0bXXXXXX1X;

And let's compare that to the old code:

u1 new_flags = (has_appendix << has_appendix_shift);
<=>
u1 new_flags = 0b00000010;

u1 _flags = (_flags & 1) | new_flags
<=>
u1 _flags = 0b0000000X | 0b00000010
<=>
u1 _flags = 0b0000001X;

So the previous code cleared out all bits except the resolution flag, so if there were more bits in use this method would fail.

Finally, let's look at your version.

u1 new_flags = (has_appendix << has_appendix_shift);
<=>
u1 new_flags = 0b00000010;

u1 _flags = _flags | new_flags;
<=>
u1 _flags = 0bXXXXXXXX | 0b00000010;
<=>
u1 _flags = 0bXXXXXX1X;

That's pretty clearly identical to the new version. Let's skip the implicit `bool` to `1` conversion while we're at it.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25092#discussion_r2081174671

From jsjolen at openjdk.org Fri May 9 08:29:30 2025
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 9 May 2025 08:29:30 GMT
Subject: RFR: 8356390: Rename ResolvedIndyEntry::set_flags to set_has_appendix [v2]
In-Reply-To:
References:
Message-ID:

> The `set_flags` function really only sets whether it has an appendix or not, and there's a separate `set_resolution_failed` method just below that also alters the flag. Just rename this to `set_has_appendix`

Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:

  Just do the obvious thing

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25092/files
  - new: https://git.openjdk.org/jdk/pull/25092/files/06816db5..beccef03

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25092&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25092&range=00-01

  Stats: 4 lines in 1 file changed: 0 ins; 2 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/25092.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25092/head:pull/25092

PR: https://git.openjdk.org/jdk/pull/25092

From lkorinth at openjdk.org Fri May 9 08:32:55 2025
From: lkorinth at openjdk.org (Leo Korinth)
Date: Fri, 9 May 2025 08:32:55 GMT
Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor
In-Reply-To:
References:
Message-ID:

On Thu, 8 May 2025 14:51:24 GMT, Leo Korinth wrote:

> This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555).
>
> The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review.
>
> In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files).
>
> My approach has been to run all tests, and afterwards update those that fail due to a timeout factor. The number of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor.
>
> These fixes have been created when I have plowed through testcases:
> JDK-8352719: Add an equals sign to the modules statement
> JDK-8352709: Remove bad timing annotations from WhileOpTest.java
> JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test
> CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE
> CODETOOLS-7903961: Make default timeout configurable
>
> Sometime in the future I will also fix:
> 8260555: Change the default TIMEOUT_FACTOR from 4 to 1
>
> for which I am awaiting:
> CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1
>
> *After the review I will revert the two first commits, and update the copyrights*

I feel almost all of the comments raised here are for me changing the timeout factor to `1`.
I will try to answer those questions here as well, but note that the timeout factor is not to be changed to `1` in this pull request and will remain 4, so excluding bugs I might have introduced, tiers would --- if anything --- be running more stable after the change as I have only increased timeouts. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865644966 From sspitsyn at openjdk.org Fri May 9 08:41:57 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 9 May 2025 08:41:57 GMT Subject: RFR: 8356251: Need minor cleanup for interp_only_mode [v3] In-Reply-To: References: Message-ID: On Thu, 8 May 2025 22:07:40 GMT, Serguei Spitsyn wrote: >> This is a minor cleanup for the JVMTI `interp_only_mode` implementation which includes the following changes: >> - The `interp_only_mode` in `JavaThread` is represented with a counter which is incremented and decremented. This is confusing because this value should only take values `0` or `1`. Asserts are placed to make sure it is never going out of bounds. The `interp_only_mode` in a `JavaThread` is checked by the interpreter chunks which expect it to be an `integer`. This cleanup has no intention to make it a boolean. >> - The function `JvmtiThreadState::process_pending_interp_only()` does a sync on the `JvmtiThreadState_lock` which is not really needed and is being removed. It is called in a `VTMS` transition and so, can not clash with the `SetEventNotificationMode` because it sets a `JvmtiVTMSTransitionDisabler`. >> >> Testing: >> - TBD: Mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: added comment with clarification Thank you for review, Chris and Leonid! 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/25060#issuecomment-2865679917

From lkorinth at openjdk.org Fri May 9 08:42:58 2025
From: lkorinth at openjdk.org (Leo Korinth)
Date: Fri, 9 May 2025 08:42:58 GMT
Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor
In-Reply-To:
References:
Message-ID:

On Fri, 9 May 2025 07:14:11 GMT, Daniel Fuchs wrote:

> Thank you. I have imported your PR locally and running some HTTP client tests in the CI. Tests have not finished running - but I already see one intermittent failure: `java/net/httpclient/RedirectTimeoutTest.java` is timing out intermittently on windows. It would be good to flush out any such intermittent failures before this PR is integrated. This might require multiple runs before we can get confidence.

My change of timeout factor to `0.7` is only temporary, as I said: it will be reverted to `4` before integration. Naturally, a few test cases will time out when I do this /temporary/ change. Hopefully `java/net/httpclient/RedirectTimeoutTest.java` will behave well with a timeout factor of `1` instead of `0.7`, but note that I will revert the timeout factor to `4` before integration.

The whole idea of running with a timeout factor of `0.7` is to remove intermittent failures.
(I had it close to 0.5 or maybe less to begin with until I found and reported CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865685066

From stefank at openjdk.org Fri May 9 08:48:51 2025
From: stefank at openjdk.org (Stefan Karlsson)
Date: Fri, 9 May 2025 08:48:51 GMT
Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor
In-Reply-To:
References:
Message-ID: <1Sa3h-gkyVwOLDdj_wJdFohAGYbhYhbAYIaqHCmW7oY=.3b58c23d-cf55-421b-aeec-e149809826f2@github.com>

On Fri, 9 May 2025 08:40:44 GMT, Leo Korinth wrote:

> My change of timeout factor to 0.7 is only temporary, as I said: it will be reverted to 4 before integration.

Is it really worth reverting back to 4 instead of trying to get jtreg released with CODETOOLS-7903961 and then integrate this change with a timeout factor of 1?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865697701

From lkorinth at openjdk.org Fri May 9 09:02:55 2025
From: lkorinth at openjdk.org (Leo Korinth)
Date: Fri, 9 May 2025 09:02:55 GMT
Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor
In-Reply-To:
References:
Message-ID: <0AKC-4omm-W24u11ifhGm8Do8_5sqwPMJz6q3A71FNE=.f4209d38-51d6-4eda-b11e-d670e5ee5575@github.com>

On Thu, 8 May 2025 20:00:21 GMT, Phil Race wrote:

> test/jdk/java/awt/font/NumericShaper/MTTest.java
>
> - * @run main/timeout=300/othervm MTTest
> + * @run main/timeout=1200/othervm MTTest
>
> I'm puzzling over why you saw this test fail with timeout = 300 .. or perhaps you saw it fail with 0.7 ? Which would amount to 210 seconds .. that might just be enough to cause it to fail because if you look at the whole test you'll see it wants the core loops of the test to run for 180 seconds.
>
> https://openjdk.github.io/cr/?repo=jdk&pr=25122&range=00#new-144-test/jdk/java/awt/font/NumericShaper/MTTest.java
>
> So 300 was fine, and 1200 isn't needed.

I started with a timeout factor less than `0.7` but I got hindered by CODETOOLS-7903937. That is probably the reason. Maybe I should change the timeout to 400? I think it is reasonable to handle a timeout factor somewhat less than 1 to weed out tight test cases. But maybe 300 is good enough?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865742871

From lkorinth at openjdk.org Fri May 9 09:12:19 2025
From: lkorinth at openjdk.org (Leo Korinth)
Date: Fri, 9 May 2025 09:12:19 GMT
Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor
In-Reply-To: <0AKC-4omm-W24u11ifhGm8Do8_5sqwPMJz6q3A71FNE=.f4209d38-51d6-4eda-b11e-d670e5ee5575@github.com>
References: <0AKC-4omm-W24u11ifhGm8Do8_5sqwPMJz6q3A71FNE=.f4209d38-51d6-4eda-b11e-d670e5ee5575@github.com>
Message-ID: <3-AQ92Sgr9tDJIWDO5OX43uBDLndiDN_3jyRj5t2z6Q=.af7cef0d-c447-4401-b4e0-e11a9bdba35b@github.com>

On Fri, 9 May 2025 08:58:15 GMT, Leo Korinth wrote:

>> test/jdk/java/awt/font/NumericShaper/MTTest.java
>>
>> - * @run main/timeout=300/othervm MTTest
>> + * @run main/timeout=1200/othervm MTTest
>>
>> I'm puzzling over why you saw this test fail with timeout = 300 .. or perhaps you saw it fail with 0.7 ? Which would amount to 210 seconds .. that might just be enough to cause it to fail because if you look at the whole test you'll see it wants the core loops of the test to run for 180 seconds.
>>
>> https://openjdk.github.io/cr/?repo=jdk&pr=25122&range=00#new-144-test/jdk/java/awt/font/NumericShaper/MTTest.java
>>
>> So 300 was fine, and 1200 isn't needed.
>
>> test/jdk/java/awt/font/NumericShaper/MTTest.java
>>
>> - * @run main/timeout=300/othervm MTTest
>> + * @run main/timeout=1200/othervm MTTest
>>
>> I'm puzzling over why you saw this test fail with timeout = 300 ..
or perhaps you saw it fail with 0.7 ? Which would amount to 210 seconds .. that might just be enough to cause it to fail because if you look at the whole test you'll see it wants the core loops of the test to run for 180 seconds.
>>
>> https://openjdk.github.io/cr/?repo=jdk&pr=25122&range=00#new-144-test/jdk/java/awt/font/NumericShaper/MTTest.java
>>
>> So 300 was fine, and 1200 isn't needed.
>
> I started with a timeout factor less than `0.7` but I got hindered by CODETOOLS-7903937. That is probably the reason. Maybe I should change the timeout to 400? I think it is reasonable to handle a timeout factor somewhat less than 1 to weed out tight test cases. But maybe 300 is good enough?

> @lkorinth Moving to a TIMEOUT_FACTOR of 1 seems a good goal. Would it be possible to expand a bit on what repeat testing was done to identify the tests to add /timeout ? If I read it correctly, any tests using /timeout=N have been bumped to 4*N so no change if the scaling is adjusted in the future. Most tests don't use /timeout so I assume many runs were done to identify the tests that would time out if there was no scaling. Test machines vary, as do the test execution times when running concurrently with other tests, so I think it would help if you could say a bit more, even to confirm that it was many test runs.

The code was run as it currently looks (with a timeout factor of `0.7`). What timeout factor do you think I should use to test, and how many times? At the moment I am awaiting jtreg 7.6; I therefore guess the timeout factor change to `1` will happen after the fork.
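
The timeout arithmetic used in this thread can be sketched as follows. This is only an illustration, assuming the effective limit is the declared `/timeout` value multiplied by the timeout factor, which is how the messages above treat it; `effective_timeout_s` is a hypothetical helper, not part of jtreg, and the factor is passed in tenths to keep the arithmetic in integers.

```cpp
// Hypothetical helper (not a jtreg API): effective timeout in seconds for a
// test that declares /timeout=declared_s, under a timeout factor given in
// tenths (7 means 0.7, 10 means 1.0, 40 means 4.0).
constexpr int effective_timeout_s(int declared_s, int factor_tenths) {
  return declared_s * factor_tenths / 10;
}

// MTTest with /timeout=300 under the temporary 0.7 factor: 210 seconds,
// which leaves only 30 seconds above the 180 seconds of core loops.
static_assert(effective_timeout_s(300, 7) == 210, "300 * 0.7");

// Quadrupling /timeout=N while moving the factor from 4 to 1 leaves the
// effective limit unchanged: 300 * 4 == 1200 * 1.
static_assert(effective_timeout_s(300, 40) == effective_timeout_s(1200, 10),
              "4 * N at factor 1 equals N at factor 4");
```

With the quadrupled value and the temporary 0.7 factor the same test would get 840 seconds, so under this assumption the 210 second case only arises for the original `/timeout=300`.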
------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865784064 From cstein at openjdk.org Fri May 9 09:15:51 2025 From: cstein at openjdk.org (Christian Stein) Date: Fri, 9 May 2025 09:15:51 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: <3-AQ92Sgr9tDJIWDO5OX43uBDLndiDN_3jyRj5t2z6Q=.af7cef0d-c447-4401-b4e0-e11a9bdba35b@github.com> References: <0AKC-4omm-W24u11ifhGm8Do8_5sqwPMJz6q3A71FNE=.f4209d38-51d6-4eda-b11e-d670e5ee5575@github.com> <3-AQ92Sgr9tDJIWDO5OX43uBDLndiDN_3jyRj5t2z6Q=.af7cef0d-c447-4401-b4e0-e11a9bdba35b@github.com> Message-ID: On Fri, 9 May 2025 09:09:34 GMT, Leo Korinth wrote: > At the moment I am awaiting jtreg 7.6, I therefore guess the timeout factor change to 1 will happen after the fork. Note, that I moved the timeout configuration feature to `jtreg` 7.5.2 - which will be released soon. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865793769 From lkorinth at openjdk.org Fri May 9 09:15:52 2025 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 9 May 2025 09:15:52 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: <1Sa3h-gkyVwOLDdj_wJdFohAGYbhYhbAYIaqHCmW7oY=.3b58c23d-cf55-421b-aeec-e149809826f2@github.com> References: <1Sa3h-gkyVwOLDdj_wJdFohAGYbhYhbAYIaqHCmW7oY=.3b58c23d-cf55-421b-aeec-e149809826f2@github.com> Message-ID: On Fri, 9 May 2025 08:45:48 GMT, Stefan Karlsson wrote: > > My change of timeout factor to 0.7 is only temporal, as I said: it will be reverted to 4 before integration. > > Is really worth reverting back to 4 instead of trying to get jtreg released with CODETOOLS-7903961 and then integrate this change with a timeout factor of 1? I think it is worth doing in two steps. 
First I do not want to postpone these limited changes, and second if I would have problems with JDK-8260555, it would be nice if those changes would be smaller and easier to revert.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865794195

From stefank at openjdk.org Fri May 9 09:19:51 2025
From: stefank at openjdk.org (Stefan Karlsson)
Date: Fri, 9 May 2025 09:19:51 GMT
Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor
In-Reply-To:
References: <1Sa3h-gkyVwOLDdj_wJdFohAGYbhYhbAYIaqHCmW7oY=.3b58c23d-cf55-421b-aeec-e149809826f2@github.com>
Message-ID:

On Fri, 9 May 2025 09:13:41 GMT, Leo Korinth wrote:

> > > My change of timeout factor to 0.7 is only temporary, as I said: it will be reverted to 4 before integration.
> >
> > Is it really worth reverting back to 4 instead of trying to get jtreg released with CODETOOLS-7903961 and then integrate this change with a timeout factor of 1?
>
> I think it is worth doing in two steps. First I do not want to postpone these limited changes, and second if I would have problems with JDK-8260555, it would be nice if those changes would be smaller and easier to revert.

I understand the risk of being blocked on JDK-8260555, but frankly, if someone wants to block the change from 4 to 1 I'm not sure this PR is worth doing. We have enough groups represented in this PR so let's ask here: is anyone opposing the idea of JDK-8260555?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865804474

From dfuchs at openjdk.org Fri May 9 09:33:52 2025
From: dfuchs at openjdk.org (Daniel Fuchs)
Date: Fri, 9 May 2025 09:33:52 GMT
Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor
In-Reply-To:
References:
Message-ID:

On Fri, 9 May 2025 08:40:44 GMT, Leo Korinth wrote:

> The whole idea of running with a timeout factor of `0.7` is to remove intermittent failures.
(I had it close to 0.5 or maybe less to begin with until I found and reported CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE)

Yes - I understand. My point is that there are probably more tests that will need an extended timeout than those that you have already modified. And we want to find out which before the actual change from 4 to 1.0. IMO if a test fails intermittently with 0.7, it's a good indication that it might continue failing intermittently with 1.0, and therefore it should be updated in this PR too.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2865849069

From shade at openjdk.org Fri May 9 11:14:07 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 9 May 2025 11:14:07 GMT
Subject: RFR: 8356631: OopHandle replacement methods should not be called on empty handles
Message-ID:

I noticed that in OopHandle/WeakHandle we have {replace,xchg,cmpxchg} methods that overwrite the handle. This is only safe to do when the handle is not empty -- i.e. when there is storage allocated for it in the relevant OopStorage. Otherwise we attempt the store to nullptr, and get a SEGV. Only OopHandle::replace does the assertion for this. We need to add these asserts everywhere else.
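
The failure mode described above can be sketched in isolation. This is a minimal illustration, not the HotSpot code: `Handle`, `Oop`, and `is_empty()` are stand-in names chosen for the sketch, plain `assert` stands in for HotSpot's own assertion macros, and the exchange is deliberately non-atomic to keep the example short.

```cpp
#include <cassert>

// Stand-in for an object reference; an opaque pointer is enough here.
using Oop = const void*;

// Illustrative handle: _slot points at storage owned elsewhere, or is
// nullptr when the handle is empty (no storage was ever allocated).
class Handle {
  Oop* _slot;

public:
  Handle() : _slot(nullptr) {}
  explicit Handle(Oop* slot) : _slot(slot) {}

  bool is_empty() const { return _slot == nullptr; }

  // Reading an empty handle is harmless: it simply resolves to nullptr.
  Oop resolve() const { return is_empty() ? nullptr : *_slot; }

  // Overwriting needs a slot to write to. Without the assert, an empty
  // handle would store through a null pointer.
  void replace(Oop value) {
    assert(!is_empty() && "replace called on empty handle");
    *_slot = value;
  }

  // Same precondition as replace(); returns the previous value.
  // Non-atomic in this sketch.
  Oop xchg(Oop value) {
    assert(!is_empty() && "xchg called on empty handle");
    Oop old = *_slot;
    *_slot = value;
    return old;
  }
};
```

In a build with assertions enabled, calling `xchg` on a default-constructed `Handle` stops at the assert with a readable message instead of faulting on the null store, which is the behaviour the message asks for across all the replacement methods.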
------------- Commit messages: - Fix - Basic fix Changes: https://git.openjdk.org/jdk/pull/25139/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25139&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356631 Stats: 4 lines in 2 files changed: 3 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25139.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25139/head:pull/25139 PR: https://git.openjdk.org/jdk/pull/25139 From jsjolen at openjdk.org Fri May 9 11:32:47 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 9 May 2025 11:32:47 GMT Subject: RFR: 8355481: Clean up MHN_copyOutBootstrapArguments [v3] In-Reply-To: References: Message-ID: > Hi, > > I'd like to integrate this simplification of the code for this loop. > > We used to have: > > ```c++ > if (start < 0) { > for (int pseudo_index = -4; pseudo_index < 0; pseudo_index++) { > if (start == pseudo_index) { > if (start >= end || 0 > pos || pos >= buf->length()) break; > // ... > } > start++; > } > } > > > That's exactly the same as: > > > int min_end = MIN2(0, end); > while (-4 <= start && start < min_end) { > if (pos >= buf->length()) break; > // ... > start++; > } > > > but the latter looks like a conventional loop. > > I'd consider this a basic cleanup, which is worth doing in the name of maintainability. > > I would have liked to change the `-4` to `-1` into actual names, but I've no clue where those come from. It doesn't seem worth it to change them if they just happen to be a kludge relying on internal details, or something like that. 
> > Testing: GHA Tier1 Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Delete unused code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24825/files - new: https://git.openjdk.org/jdk/pull/24825/files/e3a6f53a..193650cf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24825&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24825&range=01-02 Stats: 51 lines in 1 file changed: 0 ins; 50 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24825.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24825/head:pull/24825 PR: https://git.openjdk.org/jdk/pull/24825 From jsjolen at openjdk.org Fri May 9 11:32:48 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 9 May 2025 11:32:48 GMT Subject: RFR: 8355481: Clean up MHN_copyOutBootstrapArguments [v2] In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 12:24:10 GMT, Johan Sjölen wrote: >> Hi, >> >> I'd like to integrate this simplification of the code for this loop. >> >> We used to have: >> >> ```c++ >> if (start < 0) { >> for (int pseudo_index = -4; pseudo_index < 0; pseudo_index++) { >> if (start == pseudo_index) { >> if (start >= end || 0 > pos || pos >= buf->length()) break; >> // ... >> } >> start++; >> } >> } >> >> >> That's exactly the same as: >> >> >> int min_end = MIN2(0, end); >> while (-4 <= start && start < min_end) { >> if (pos >= buf->length()) break; >> // ... >> start++; >> } >> >> >> but the latter looks like a conventional loop. >> >> I'd consider this a basic cleanup, which is worth doing in the name of maintainability. >> >> I would have liked to change the `-4` to `-1` into actual names, but I've no clue where those come from. It doesn't seem worth it to change them if they just happen to be a kludge relying on internal details, or something like that.
>> >> Testing: GHA Tier1 > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Remove whitespace This certainly seems dead. Tier1-3 passes, all green. I'm deleting this code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24825#issuecomment-2866197261 From coleenp at openjdk.org Fri May 9 11:42:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 9 May 2025 11:42:54 GMT Subject: RFR: 8356390: Rename ResolvedIndyEntry::set_flags to set_has_appendix [v2] In-Reply-To: References: Message-ID: On Fri, 9 May 2025 08:29:30 GMT, Johan Sjölen wrote: >> The `set_flags` function really only sets whether it has an appendix or not, and there's a separate `set_resolution_failed` method just below that also alters the flag. Just rename this to `set_has_appendix` > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Just do the obvious thing While you're here, resolution_failed should have a named bit as well as has_appendix. I wonder if you can use C++ bit syntax too since SA doesn't read these flags. ------------- PR Review: https://git.openjdk.org/jdk/pull/25092#pullrequestreview-2827990742 From coleenp at openjdk.org Fri May 9 11:47:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 9 May 2025 11:47:54 GMT Subject: RFR: 8355481: Clean up MHN_copyOutBootstrapArguments [v3] In-Reply-To: References: Message-ID: On Fri, 9 May 2025 11:32:47 GMT, Johan Sjölen wrote: >> Hi, >> >> I'd like to integrate this simplification of the code for this loop. >> >> We used to have: >> >> ```c++ >> if (start < 0) { >> for (int pseudo_index = -4; pseudo_index < 0; pseudo_index++) { >> if (start == pseudo_index) { >> if (start >= end || 0 > pos || pos >= buf->length()) break; >> // ...
>> } >> start++; >> } >> } >> >> >> That's exactly the same as: >> >> >> int min_end = MIN2(0, end); >> while (-4 <= start && start < min_end) { >> if (pos >= buf->length()) break; >> // ... >> start++; >> } >> >> >> but the latter looks like a conventional loop. >> >> I'd consider this a basic cleanup, which is worth doing in the name of maintainability. >> >> I would have liked to change the `-4` to `-1` into actual names, but I've no clue where those come from. It doesn't seem worth it to change them if they just happen to be a kludge relying on internal details, or something like that. >> >> Testing: GHA Tier1 > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Delete unused code src/hotspot/share/prims/methodHandles.cpp line 1260: > 1258: THROW_MSG(vmSymbols::java_lang_InternalError(), "bad index info (1)"); > 1259: } > 1260: objArrayHandle buf(THREAD, (objArrayOop) JNIHandles::resolve(buf_jh)); Do you want to assert about the value of 'start' and 'end'? This became a nice cleanup. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24825#discussion_r2081500928 From jsjolen at openjdk.org Fri May 9 11:55:50 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 9 May 2025 11:55:50 GMT Subject: RFR: 8356390: Rename ResolvedIndyEntry::set_flags to set_has_appendix [v2] In-Reply-To: References: Message-ID: On Fri, 9 May 2025 11:39:52 GMT, Coleen Phillimore wrote: > While you're here, resolution_failed should have a named bit as well as has_appendix. I wonder if you can use C++ bit syntax too since SA doesn't read these flags. As the interpreter reads the `_flags` field through `flag_offset`, let's not do this.
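For illustration only, here is one way both bits could get names while `_flags` stays a plain byte that can still be read at a fixed offset. This is a sketch under assumptions, not the patch under review: `ToyIndyEntry`, `resolution_failed_shift`, `set_bit`, and the concrete shift values are hypothetical.

```c++
#include <cstdint>

// Hypothetical model of an entry with a one-byte flags field.
struct ToyIndyEntry {
  // Named bit positions; the concrete values are assumptions.
  enum : uint8_t {
    resolution_failed_shift = 0,
    has_appendix_shift      = 1
  };

  uint8_t _flags = 0;  // stays a plain byte, readable at a known offset

  // Clear the target bit, then OR in the new value. Passing false really
  // clears the bit, and all other bits are preserved.
  void set_bit(unsigned shift, bool value) {
    uint8_t cleared = static_cast<uint8_t>(_flags & ~(1u << shift));
    _flags = static_cast<uint8_t>(cleared | ((value ? 1u : 0u) << shift));
  }

  void set_has_appendix(bool v)      { set_bit(has_appendix_shift, v); }
  void set_resolution_failed(bool v) { set_bit(resolution_failed_shift, v); }

  bool has_appendix() const      { return ((_flags >> has_appendix_shift) & 1u) != 0; }
  bool resolution_failed() const { return ((_flags >> resolution_failed_shift) & 1u) != 0; }
};
```

Routing both setters through one helper keeps the two flags uniform without changing the layout of the byte.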
------------- PR Comment: https://git.openjdk.org/jdk/pull/25092#issuecomment-2866259090 From jsjolen at openjdk.org Fri May 9 12:13:52 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 9 May 2025 12:13:52 GMT Subject: RFR: 8356390: Rename ResolvedIndyEntry::set_flags to set_has_appendix [v2] In-Reply-To: References: Message-ID: On Fri, 9 May 2025 05:43:19 GMT, David Holmes wrote: >> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: >> >> Just do the obvious thing > > src/hotspot/share/oops/resolvedIndyEntry.hpp line 120: > >> 118: u1 old_flags = _flags & ~(1 << has_appendix_shift); >> 119: // Preserve the unaffected bits >> 120: _flags = old_flags | new_flags; > > I may be having a mental blank this late in the week, but why do we need to do anything other than: > > _flags |= new_flags; > > ? `new_flags` should at most have one bit set (the appendix bit) and OR'ing with zero preserves all other bits. Oooh nah, I just realised. @dholmes-ora , we must have John's version because if we're setting it to `0` and the `_flags` already is `1` then it won't do anything. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25092#discussion_r2081539665 From jsjolen at openjdk.org Fri May 9 12:19:33 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 9 May 2025 12:19:33 GMT Subject: RFR: 8356390: Rename ResolvedIndyEntry::set_flags to set_has_appendix [v3] In-Reply-To: References: Message-ID: > The `set_flags` function really only sets whether it has an appendix or not, and there's a separate `set_resolution_failed` method just below that also alters the flag.
Just rename this to `set_has_appendix` Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: - We missed the case of has_appendix = false - And make the code uniform for both flags ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25092/files - new: https://git.openjdk.org/jdk/pull/25092/files/beccef03..4da8d47b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25092&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25092&range=01-02 Stats: 7 lines in 1 file changed: 3 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25092/head:pull/25092 PR: https://git.openjdk.org/jdk/pull/25092 From dholmes at openjdk.org Fri May 9 12:21:54 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 9 May 2025 12:21:54 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: Message-ID: <3CKLh1TDhqMNxlWyINFVMAI6MGe_s2rJrgnfzXYpx2M=.ab9a5cb5-9671-4b90-ba81-83f65b82cd6d@github.com> On Thu, 8 May 2025 14:51:24 GMT, Leo Korinth wrote: > This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555). > > The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review. > > In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files).
> > My approach has been to run all tests, and afterwards updating those that fail due to the timeout factor. The amount of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor. > > These fixes have been created when I have plowed through testcases: > JDK-8352719: Add an equals sign to the modules statement > JDK-8352709: Remove bad timing annotations from WhileOpTest.java > JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test > CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE > CODETOOLS-7903961: Make default timeout configurable > > Sometime in the future I will also fix: > 8260555: Change the default TIMEOUT_FACTOR from 4 to 1 > > for which I am awaiting: > CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1 > > *After the review I will revert the two first commits, and update the copyrights* I saw this PR as preparation for the change of the timeout factor so they could be reviewed distinctly and then integrated close together. So it is natural that people are querying how the proposed changes will work with that change - in particular if it will require explicit timeouts to be added to a lot of tests that don't presently have them.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2866338479 From coleenp at openjdk.org Fri May 9 12:22:52 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 9 May 2025 12:22:52 GMT Subject: RFR: 8352075: Perf regression accessing fields [v4] In-Reply-To: <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <79Pko1ZqYtuWaLO_NaMrTegVy7b1G6Ao0PZ48qZluoE=.adeaceae-d2f1-4b35-8f9f-a450919a37bb@github.com> Message-ID: On Mon, 5 May 2025 06:51:31 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Move constant to static final var I'm still working on understanding this change and figuring out why I'm getting a crash that reports # assert(compare_symbols(fields->adr_at(i - 1)->name(constants), fields->adr_at(i)->name(constants)) < 0) failed: Fields should be sorted I can't reproduce this crash locally yet. Also trying to find how it reports the declaration order of fields for JVMTI.
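The lookup scheme described in the quoted PR text (fields sorted by name, a record for the fields at positions 16, 32, ..., a walk over those records followed by a scan of at most 16 fields) can be sketched roughly as below. This is a simplified model and not the PR's implementation: it indexes a sorted `std::vector<std::string>` directly, where the real code records variable-length-encoded stream offsets and compares symbols. `find_field` and `kStride` are hypothetical names, and field names are assumed unique.

```c++
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical name for the "every 16 fields" step.
constexpr std::size_t kStride = 16;

// sorted_names must be sorted ascending.
// Returns the index of target, or -1 if it is not present.
int find_field(const std::vector<std::string>& sorted_names,
               const std::string& target) {
  std::size_t start = 0;

  // Walk the checkpoints (positions 16, 32, ...) and remember the last one
  // whose name does not sort after the target.
  for (std::size_t cp = kStride; cp < sorted_names.size(); cp += kStride) {
    if (sorted_names[cp] > target) {
      break;
    }
    start = cp;
  }

  // Scan field by field from that checkpoint: at most kStride comparisons.
  std::size_t end = std::min(start + kStride, sorted_names.size());
  for (std::size_t i = start; i < end; i++) {
    if (sorted_names[i] == target) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

A lookup in this model only returns correct results if the names really are sorted, which is the property the assert quoted above ("Fields should be sorted") checks in the real code.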
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2866338427 From dholmes at openjdk.org Fri May 9 12:25:52 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 9 May 2025 12:25:52 GMT Subject: RFR: 8354969: Add strdup function for ResourceArea In-Reply-To: References: <7RWd7cVKqTbDkFyVdiLyHLFIUAwiSOMipKzGny-QRH8=.5c55c562-3742-4dd8-9131-73c5140cdf86@github.com> Message-ID: On Fri, 9 May 2025 02:23:28 GMT, Ioi Lam wrote: >> Do we have any immediate candidate uses for this? Whilst in the past Hotspot code had a number of utility API's, these days there is a tendency to delete any unused code. So this really needs to have an imminent usage (@iklam ?) . >> >> Code change itself looks fine. Thanks > >> Do we have any immediate candidate uses for this? Whilst in the past Hotspot code had a number of utility API's, these days there is a tendency to delete any unused code. So this really needs to have an imminent usage (@iklam ?) . >> >> Code change itself looks fine. Thanks > > I did this and found a few places, but there could be more > > > find . -name *pp | xargs grep -l strcpy | \ > xargs egrep -l '(RESOURCE.*char)|(resource_allocate_bytes)' | \ > xargs grep strcpy > > > https://github.com/openjdk/jdk/blob/9a0e6f338f34fb5da16d5f9eb710cdddd4302945/src/hotspot/share/prims/jvmtiEnvBase.cpp#L471-L472 > https://github.com/openjdk/jdk/blob/9a0e6f338f34fb5da16d5f9eb710cdddd4302945/src/hotspot/share/classfile/classLoader.cpp#L1513-L1514 > https://github.com/openjdk/jdk/blob/9a0e6f338f34fb5da16d5f9eb710cdddd4302945/src/hotspot/share/classfile/modules.cpp#L642-L643 Thanks @iklam ! So should we convert those places as part of this PR? 
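As a rough picture of what such a helper replaces, the allocate-then-`strcpy` pattern found by the grep above can be folded into one call. The sketch below uses a toy arena rather than HotSpot's ResourceArea; `ToyArena`, `allocate`, and `arena_strdup` are hypothetical names.

```c++
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical arena: every allocation lives until the arena is destroyed.
class ToyArena {
  std::vector<std::unique_ptr<char[]>> _blocks;

public:
  char* allocate(std::size_t bytes) {
    _blocks.push_back(std::make_unique<char[]>(bytes));
    return _blocks.back().get();
  }

  // Copy a NUL-terminated string into arena-owned memory, replacing the
  // two-step "allocate strlen + 1 bytes, then strcpy" pattern.
  char* arena_strdup(const char* s) {
    std::size_t len = std::strlen(s) + 1;  // include the terminating NUL
    char* copy = allocate(len);
    std::memcpy(copy, s, len);
    return copy;
  }
};
```

A call site then shrinks to `char* copy = arena.arena_strdup(name);`, with the length computation in one place.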
------------- PR Comment: https://git.openjdk.org/jdk/pull/24998#issuecomment-2866350008 From sspitsyn at openjdk.org Fri May 9 12:26:58 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 9 May 2025 12:26:58 GMT Subject: Integrated: 8356251: Need minor cleanup for interp_only_mode In-Reply-To: References: Message-ID: On Tue, 6 May 2025 08:29:36 GMT, Serguei Spitsyn wrote: > This is a minor cleanup for the JVMTI `interp_only_mode` implementation which includes the following changes: > - The `interp_only_mode` in `JavaThread` is represented with a counter which is incremented and decremented. This is confusing because this value should only take values `0` or `1`. Asserts are placed to make sure it is never going out of bounds. The `interp_only_mode` in a `JavaThread` is checked by the interpreter chunks which expect it to be an `integer`. This cleanup has no intention to make it a boolean. > - The function `JvmtiThreadState::process_pending_interp_only()` does a sync on the `JvmtiThreadState_lock` which is not really needed and is being removed. It is called in a `VTMS` transition and so, can not clash with the `SetEventNotificationMode` because it sets a `JvmtiVTMSTransitionDisabler`. > > Testing: > - TBD: Mach5 tiers 1-6 This pull request has now been integrated. 
Changeset: 411a63ea Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/411a63ea1b0c6e8bfea219427bf1c317c5dadabf Stats: 20 lines in 4 files changed: 3 ins; 8 del; 9 mod 8356251: Need minor cleanup for interp_only_mode Reviewed-by: lmesnik, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/25060 From jsjolen at openjdk.org Fri May 9 12:37:34 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 9 May 2025 12:37:34 GMT Subject: RFR: 8356390: Rename ResolvedIndyEntry::set_flags to set_has_appendix [v4] In-Reply-To: References: Message-ID: > The `set_flags` function really only sets whether it has an appendix or not, and there's a separate `set_resolution_failed` method just below that also alters the flag. Just rename this to `set_has_appendix` Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: I should really stop pushing w/o first building ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25092/files - new: https://git.openjdk.org/jdk/pull/25092/files/4da8d47b..56e43e5a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25092&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25092&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25092/head:pull/25092 PR: https://git.openjdk.org/jdk/pull/25092 From coleenp at openjdk.org Fri May 9 12:37:34 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 9 May 2025 12:37:34 GMT Subject: RFR: 8356390: Rename ResolvedIndyEntry::set_flags to set_has_appendix [v3] In-Reply-To: References: Message-ID: On Fri, 9 May 2025 12:19:33 GMT, Johan Sjölen wrote: >> The `set_flags` function really only sets whether it has an appendix or not, and there's a separate `set_resolution_failed` method just below that also alters the flag.
Just rename this to `set_has_appendix` > > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - We missed the case of has_appendix = false > - And make the code uniform for both flags Too bad it can't use c++ bitfield syntax. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25092#pullrequestreview-2828144015 From lkorinth at openjdk.org Fri May 9 12:51:52 2025 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 9 May 2025 12:51:52 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: Message-ID: On Thu, 8 May 2025 14:51:24 GMT, Leo Korinth wrote: > This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555). > > The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review. > > In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files). > > My approach has been to run all tests, and afterwards updating those that fail due to the timeout factor. The amount of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented). In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor.
> > These fixes have been created when I have plowed through testcases: > JDK-8352719: Add an equals sign to the modules statement > JDK-8352709: Remove bad timing annotations from WhileOpTest.java > JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test > CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE > CODETOOLS-7903961: Make default timeout configurable > > Sometime in the future I will also fix: > 8260555: Change the default TIMEOUT_FACTOR from 4 to 1 > > for which I am awaiting: > CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1 > > *After the review I will revert the two first commits, and update the copyrights* Every time I rerun the tests, some new testcase fails on timeouts. Even worse is that I get quite a few /totally unrelated/ test failures every time because I am running 8 tiers. I could probably change the timeout factor to 0.6 to weed out some more tests faster, but I can not use a timeout factor of 0.5 or less because of CODETOOLS-7903937. Every 0.1 of timeout factor corresponds to 12 seconds. So in absolute measurements I have a margin of 36 seconds now if I would go to a timeout factor of 1. I am also a bit afraid of having too big of a margin --- it will force me to convert a possibly huge number of new test cases. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2866421573 From jsjolen at openjdk.org Fri May 9 12:59:51 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 9 May 2025 12:59:51 GMT Subject: RFR: 8356173: Remove ThreadCritical In-Reply-To: References: Message-ID: On Fri, 9 May 2025 06:16:31 GMT, Thomas Stuefe wrote: >> src/hotspot/share/memory/arena.cpp line 47: >> >>> 45: void Arena::initialize_chunk_pool() { >>> 46: _global_chunk_pool_mutex = new PlatformMutex(); >>> 47: } >> >> Possibly a candidate for @jdksjolen 's `Deferred`? > > Why do we even need dynamic initialization here?
I thought that is the tradeoff with PlatformMutex: you forego deadlock checks etc, but you gain safety wrt initialisation. > That would also save some instructions. You can use `Deferred`, but as `PlatformMutex` is at "the bottom" of the initialization order it's not necessary, as Thomas notes. I'd be happy to see it use `Deferred`, as it can help with debugging if PlatformMutex ever changes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2081615044 From lkorinth at openjdk.org Fri May 9 13:06:01 2025 From: lkorinth at openjdk.org (Leo Korinth) Date: Fri, 9 May 2025 13:06:01 GMT Subject: RFR: 8356171: Increase timeout for testcases as preparation for change of default timeout factor In-Reply-To: References: Message-ID: On Thu, 8 May 2025 14:51:24 GMT, Leo Korinth wrote: > This change tries to add timeout to individual testcases so that I am able to run them with a timeout factor of 1 in the future (JDK-8260555). > > The first commit changes the timeout factor to 0.7, so that I can run tests and test the change (it will finally be changed to 1.0 in JDK-8260555). The next commit excludes some junit/testng tests where I can currently not change the timeout factor (CODETOOLS-7903961). Both these commits will be reverted before integrating the change. I will also apply copyright updates after the review. > > In addition to changing the timeout factor, I am also using a library call to parse the timeout factor from the java properties (I can not use the library function everywhere as jtreg does not allow me to add @library notations to non testcase files). > > My approach has been to run all tests, and afterwards updating those that fail due to the timeout factor. The amount of updated testcases is huge, and my strategy has been to quadruple the timeout if I could not directly see that less was needed (thus the timeout will be the same after JDK-8260555 is implemented).
In a few places I have added a bit more timeout so that it will work with the 0.7 timeout factor. > > These fixes have been created when I have plowed through testcases: > JDK-8352719: Add an equals sign to the modules statement > JDK-8352709: Remove bad timing annotations from WhileOpTest.java > JDK-8352074: Test MemoryLeak.java seems not to test what it is supposed to test > CODETOOLS-7903937: JTREG uses timeout factor on socket timeout but not on KEEPALIVE > CODETOOLS-7903961: Make default timeout configurable > > Sometime in the future I will also fix: > 8260555: Change the default TIMEOUT_FACTOR from 4 to 1 > > for which I am awaiting: > CODETOOLS-7903961 that is fixed in jtreg 7.6, but we are still running 7.5.1+1 > > *After the review I will revert the two first commits, and update the copyrights* Changing the timeout factor might be preferable to do after the fork. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25122#issuecomment-2866463008 From jsjolen at openjdk.org Fri May 9 13:13:51 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 9 May 2025 13:13:51 GMT Subject: RFR: 8356173: Remove ThreadCritical In-Reply-To: References: Message-ID: On Fri, 9 May 2025 06:49:21 GMT, Thomas Stuefe wrote: >However, I would love to just get rid of arena memory accounting altogether; the solution is very hackish and probably not needed anymore. The only heavy user of chunks causing repeated problems that caused me to use NMT at customers was the JIT. And I added the compilation memory statistic last year, so we now have a much better tool to investigate JIT arena usage. This has been our experience re: arenas as well. We would still get 'regular' malloc accounting, as the chunks are malloc allocated anyway. I'm OK with this as a middle step.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25072#issuecomment-2866484798 From zgu at openjdk.org Fri May 9 13:22:52 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Fri, 9 May 2025 13:22:52 GMT Subject: RFR: 8356173: Remove ThreadCritical In-Reply-To: References: Message-ID: On Tue, 6 May 2025 20:32:05 GMT, Coleen Phillimore wrote: > Updated the description in the bug. This removes the last use of ThreadCritical and replaces it with a global PlatformMutex lock. > Tested with tier1-4, and tier1 on all Oracle-supported OSs. src/hotspot/share/nmt/mallocTracker.cpp line 68: > 66: // copy is going on, because their size is adjusted using this > 67: // buffer in make_adjustment(). > 68: ChunkPoolLocker lock; At the time NMT was written, `ThreadCritical` was the only native lock. I wonder if it is time for NMT to get its own lock? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2081656758 From ihse at openjdk.org Fri May 9 14:20:31 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 9 May 2025 14:20:31 GMT Subject: RFR: 8356644: Update encoding declaration to UTF-8 Message-ID: <8loaLnxoQ6Om5EqhX9_nORypM5UjgVz3DYJnMinZ77w=.bd323a79-0fd6-4b16-8edb-193fec7fbb13@github.com> A handful of html and xml files in the JDK source tree claim to have encodings like `ISO-8859-1`, when they are in fact pure US-ASCII files. While perhaps technically correct, this is misleading, and goes contrary to the efforts of turning the source code into UTF-8 proper. I chose between marking them as "ASCII" and "UTF-8", but chose the latter, since otherwise if they ever were to be updated with a non-ASCII character, the value would have been unspecified, and after JDK-8301971, all files in the JDK repository will be interpreted as UTF-8.
------------- Commit messages: - 8356644: Update encoding declaration to UTF-8 Changes: https://git.openjdk.org/jdk/pull/25148/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25148&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356644 Stats: 794 lines in 53 files changed: 2 ins; 9 del; 783 mod Patch: https://git.openjdk.org/jdk/pull/25148.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25148/head:pull/25148 PR: https://git.openjdk.org/jdk/pull/25148 From coleenp at openjdk.org Fri May 9 14:52:53 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 9 May 2025 14:52:53 GMT Subject: RFR: 8356173: Remove ThreadCritical In-Reply-To: References: Message-ID: <4w1JHh4F7BBmt8HLa9aflsY6QmOtzl3LWL_yNjJ60f4=.0f41d06b-2f81-44ea-b200-28cf2e6c623d@github.com> On Fri, 9 May 2025 05:59:14 GMT, Thomas Stuefe wrote: >> src/hotspot/share/runtime/threads.cpp line 463: >> >>> 461: >>> 462: // Initialize memory pools >>> 463: Arena::initialize_chunk_pool(); >> >> What code first uses a `ChunkPool` or the `ChunkPoolLocker`? (It isn't obvious at what point the NMT code may execute during early initialization.) > > Any Arena that is created allocates an initial chunk. Arenas are created during (any) Thread creation, for compiler initialization and in universe::init, among others. So this is needed early. > > But as I wrote above, I don't think we need dynamic initialization at all? Yes, this is needed early. We create the ResourceArea arena when creating a JavaThread before the JavaThread is completely created so Thread::current() doesn't exist in this case. There are also other early initializations. There's two things that make us unable to have a static PlatformMutex. 1. on macosx, the PlatformMutex is implemented using an indirection the macosx dll load crashes trying to initialize it. 
# assert(status == 0) failed: error EINVAL(22), freelist lock

V [libjvm.dylib+0xea68c4] PlatformMutex::PlatformMutex()+0xfc
V [libjvm.dylib+0x269c34] _GLOBAL__sub_I_arena.cpp+0x38
C [dyld+0x22a24] invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const::$_0::operator()() const+0xa8
C [dyld+0x680f4] invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const+0xac
C [dyld+0x5b668] invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const+0x1f0
C [dyld+0x22fc] dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const+0x12c
C [dyld+0x5a6a0] dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const+0xc0
C [dyld+0x5d188] dyld3::MachOFile::forEachInitializerPointerSection(Diagnostics&, void (unsigned int, unsigned int, bool&) block_pointer) const+0xa0
C [dyld+0x67de8] dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int)

Even if I disable the indirect PlatformMutex workaround for the macosx bug JDK-8218975, the underlying pthread_mutex is not initialized before this code uses it.

# assert(status == 0) failed: error EINVAL(22), mutex_init

So I don't think PlatformMutex can be a static variable, which is why I initialize it early in vm startup.
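The explicit-initialization approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `SketchMutex`, `initialize_chunk_pool_lock()`, `SketchChunkPoolLocker` and `return_chunk_to_pool()` are hypothetical names, and `std::mutex` only stands in for `PlatformMutex` to keep the sketch portable (unlike the `PlatformMutex` in the trace, `std::mutex` has a constexpr constructor and would not itself hit this problem). The point is that the lock lives in raw static storage, so no constructor runs from a load-time initializer such as `_GLOBAL__sub_I_arena.cpp`; startup code constructs it at a point of its own choosing.

```cpp
#include <cassert>
#include <mutex>
#include <new>

// Hypothetical stand-in for HotSpot's PlatformMutex (assumption for this sketch).
using SketchMutex = std::mutex;

// Raw static storage plus a null pointer: both are set up without running any
// constructor when the library is loaded.
alignas(SketchMutex) static unsigned char chunk_pool_lock_storage[sizeof(SketchMutex)];
static SketchMutex* chunk_pool_lock = nullptr;

// Called explicitly from single-threaded startup code, once the platform layer
// is ready. The mutex is constructed in place and intentionally never destroyed.
void initialize_chunk_pool_lock() {
  assert(chunk_pool_lock == nullptr && "initialize only once");
  chunk_pool_lock = new (chunk_pool_lock_storage) SketchMutex();
}

// RAII guard in the spirit of ChunkPoolLocker: locks on construction,
// unlocks on destruction.
class SketchChunkPoolLocker {
 public:
  SketchChunkPoolLocker() {
    assert(chunk_pool_lock != nullptr && "lock used before initialization");
    chunk_pool_lock->lock();
  }
  ~SketchChunkPoolLocker() { chunk_pool_lock->unlock(); }
  SketchChunkPoolLocker(const SketchChunkPoolLocker&) = delete;
  SketchChunkPoolLocker& operator=(const SketchChunkPoolLocker&) = delete;
};

// Example critical section: count chunks handed back to a (hypothetical) pool.
static int pooled_chunks = 0;
int return_chunk_to_pool() {
  SketchChunkPoolLocker guard;  // held only for the duration of this function
  return ++pooled_chunks;
}
```

In this sketch a caller would invoke `initialize_chunk_pool_lock()` once during startup, before the first arena is created; any use of the guard before that point trips the assertion instead of failing inside the platform mutex.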
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2081848209 From coleenp at openjdk.org Fri May 9 14:52:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 9 May 2025 14:52:54 GMT Subject: RFR: 8356173: Remove ThreadCritical In-Reply-To: <4w1JHh4F7BBmt8HLa9aflsY6QmOtzl3LWL_yNjJ60f4=.0f41d06b-2f81-44ea-b200-28cf2e6c623d@github.com> References: <4w1JHh4F7BBmt8HLa9aflsY6QmOtzl3LWL_yNjJ60f4=.0f41d06b-2f81-44ea-b200-28cf2e6c623d@github.com> Message-ID: On Fri, 9 May 2025 14:48:36 GMT, Coleen Phillimore wrote: >> Any Arena that is created allocates an initial chunk. Arenas are created during (any) Thread creation, for compiler initialization and in universe::init, among others. So this is needed early. >> >> But as I wrote above, I don't think we need dynamic initialization at all? > > Yes, this is needed early. We create the ResourceArea arena when creating a JavaThread before the JavaThread is completely created so Thread::current() doesn't exist in this case. There are also other early initializations. > > There's two things that make us unable to have a static PlatformMutex. 1. on macosx, the PlatformMutex is implemented using an indirection the macosx dll load crashes trying to initialize it. 
>
>
> # assert(status == 0) failed: error EINVAL(22), freelist lock
>
> V [libjvm.dylib+0xea68c4] PlatformMutex::PlatformMutex()+0xfc
> V [libjvm.dylib+0x269c34] _GLOBAL__sub_I_arena.cpp+0x38
> C [dyld+0x22a24] invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const::$_0::operator()() const+0xa8
> C [dyld+0x680f4] invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const+0xac
> C [dyld+0x5b668] invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const+0x1f0
> C [dyld+0x22fc] dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const+0x12c
> C [dyld+0x5a6a0] dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const+0xc0
> C [dyld+0x5d188] dyld3::MachOFile::forEachInitializerPointerSection(Diagnostics&, void (unsigned int, unsigned int, bool&) block_pointer) const+0xa0
> C [dyld+0x67de8] dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int)
>
>
> Even if I disable the indirect PlatformMutex workaround for the macosx bug JDK-8218975, the underlying pthread_mutex is not initialized before this code uses it.
>
>
> # assert(status == 0) failed: error EINVAL(22), mutex_init
>
>
> So I don't think PlatformMutex can be a static variable, which is why I initialize it early in vm startup.

I added the initialization of the mutex after os::init(), assuming that pthread_mutex will be initialized at that point.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2081850721 From coleenp at openjdk.org Fri May 9 14:58:55 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 9 May 2025 14:58:55 GMT Subject: RFR: 8356173: Remove ThreadCritical In-Reply-To: References: Message-ID: <3jdKyW-B92C6KWvibjaJeyipE4SIeOpQqhhhfyORTS0=.28d45963-7a8a-4df0-8b17-cc800d476b3c@github.com> On Fri, 9 May 2025 06:11:11 GMT, Thomas Stuefe wrote: >> Updated the description in the bug. This removes the last use of ThreadCritical and replaces it with a global PlatformMutex lock. >> Tested with tier1-4, and tier1 on all Oracle-supported OSs. > > src/hotspot/share/nmt/nmtUsage.cpp line 58: > >> 56: // is deducted from mtChunk in the end to give correct values. >> 57: ChunkPoolLocker lock; >> 58: const MallocMemorySnapshot* ms = MallocMemorySummary::as_snapshot(); > > I'd scope this to the as_snapshot call, the subsequent code does not need the locking (lesser chance of accidental recursive locking) done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2081861429 From coleenp at openjdk.org Fri May 9 14:58:52 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 9 May 2025 14:58:52 GMT Subject: RFR: 8356173: Remove ThreadCritical In-Reply-To: References: Message-ID: On Fri, 9 May 2025 12:57:35 GMT, Johan Sjölen wrote: >> Why do we even need dynamic initialization here? I thought that is the tradeoff with PlatformMutex: you forego deadlock checks etc, but you gain safety wrt initialisation. >> That would also save some instructions. > > You can use `Deferred`, but as `PlatformMutex` is at "the bottom" of the initialization order it's not necessary, as Thomas notes. I'd be happy to see it use `Deferred`, as it can help with debugging if PlatformMutex ever changes. Not really crazy about the Deferred<> name and that change is still under consideration.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2081863674 From coleenp at openjdk.org Fri May 9 14:58:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 9 May 2025 14:58:54 GMT Subject: RFR: 8356173: Remove ThreadCritical In-Reply-To: References: Message-ID: On Fri, 9 May 2025 13:20:06 GMT, Zhengyu Gu wrote: >> Updated the description in the bug. This removes the last use of ThreadCritical and replaces it with a global PlatformMutex lock. >> Tested with tier1-4, and tier1 on all Oracle-supported OSs. > > src/hotspot/share/nmt/mallocTracker.cpp line 68: > >> 66: // copy is going on, because their size is adjusted using this >> 67: // buffer in make_adjustment(). >> 68: ChunkPoolLocker lock; > > At the time NMT was written, `ThreadCritical` was the only native lock. I wonder it is the time NMT gets its own lock? Yes, ThreadCritical preceded the PlatformMutex locks, which were added later. We had ThreadCritical (iirc) for a limitation in the Windows malloc code, which is why it was around all the os::malloc calls once. NMT just built upon that since it was there. We cannot use regular Mutex because Mutex requires Thread::current() which doesn't exist at this point. The NMT code might be able to use a lock around the code that deletes arenas, as the comments suggest that it is trying to protect. This could be a regular Mutex because the arena code is deleted individually and purged well after startup time. I left this as a shared lock, but it could be further improved to use a nmt-delete-lock Mutex. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2081860936 From coleenp at openjdk.org Fri May 9 15:02:55 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 9 May 2025 15:02:55 GMT Subject: RFR: 8356173: Remove ThreadCritical In-Reply-To: References: Message-ID: On Fri, 9 May 2025 04:59:51 GMT, David Holmes wrote: >> Updated the description in the bug. 
This removes the last use of ThreadCritical and replaces it with a global PlatformMutex lock. >> Tested with tier1-4, and tier1 on all Oracle-supported OSs. > > src/hotspot/share/memory/arena.cpp line 219: > >> 217: pool->return_to_pool(c); >> 218: } else { >> 219: // Free chunks under NMT lock so that NMT adjustment is stable. > > NMT lock??? This could use an NMT lock instead (had a version with this once). I fixed the comment to reflect this change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2081866093 From mhaessig at openjdk.org Fri May 9 15:22:57 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Fri, 9 May 2025 15:22:57 GMT Subject: RFR: 8336906: C2: assert(bb->is_reachable()) failed: getting result from unreachable basicblock [v2] In-Reply-To: <0W1NJ3CRAdeugnJTYVGVtomqYJEX5QdVEua9XPSWn5g=.d5b30054-2805-4b8c-a9f9-5b1cdbc12d2a@github.com> References: <0W1NJ3CRAdeugnJTYVGVtomqYJEX5QdVEua9XPSWn5g=.d5b30054-2805-4b8c-a9f9-5b1cdbc12d2a@github.com> Message-ID: <82PkAUAxT8KzDxLPJRzpexMxjpCIHkEhId71UfTojIw=.daba032d-cb7c-4220-ab22-53162487a0a2@github.com> On Thu, 8 May 2025 14:54:05 GMT, Emanuel Peter wrote: > For the test: It's a bit of a shame to have lots of separate files. I got myself confused with class visibilities. I moved the `A` and `B` into the java test file and moved both of them one directory lower into `compiler/interpreter`. That makes it a lot cleaner. > Or else argue why it CANNOT be done. I looked into it some more and there are two places where we deopt and move the `bci` over to the next bytecode: when an object of an unloaded class is returned by `getstatic` (see code in the PR description) and calls (see below) and the object reference is `null`.
https://github.com/openjdk/jdk/blob/411a63ea1b0c6e8bfea219427bf1c317c5dadabf/src/hotspot/share/opto/doCall.cpp#L770-L785 Since `{d,f,i,l}return`, `{table,lookup}switch`, and `ret` require an integer on the stack but only bytecodes that push references on the stack can deopt to the next `bci`, we cannot trigger this error for those bytecodes. Now, that raises the question of whether these bytecodes should then be in `falls_through()`. I argue that they should be, since that would be the correct behavior if we deopted at such a bytecode. > Did you go through all bytecodes we support here? I did, but I missed `lookupswitch` and `tableswitch`. `jsr` just pushes the address of the next bytecode onto the stack. `ret` can jump to such an address, but I already added that to `falls_through()`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25118#issuecomment-2866965280 From mhaessig at openjdk.org Fri May 9 15:22:56 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Fri, 9 May 2025 15:22:56 GMT Subject: RFR: 8336906: C2: assert(bb->is_reachable()) failed: getting result from unreachable basicblock [v2] In-Reply-To: References: Message-ID: > # Issue Summary > > This PR addresses an `assert(bb->is_reachable())` that is triggered in the code for `-XX:+VerifyStack` after a deoptimization with reason `null_assert_or_unreached0` at a `getstatic` bytecode. Following the `getstatic` is an `areturn` and then an unreachable bytecode.
When the code for `VerifyStack` tries to compute an oop map for the basic block of the unreachable bytecode, the assert triggers: > > getstatic Field A.val:"LB"; // if class B is not loaded, C2 deopts with reason "null_assert_or_unreached0" > areturn; > // The following is unreachable > iconst_0; > > > This is a similar problem to [JDK-8271055](https://bugs.openjdk.org/browse/JDK-8271055) (#7331), but this particular deopt with reason `null_assert_or_unreached0` at `getstatic` of a field containing an object reference [deopts at the next bytecode](https://github.com/openjdk/jdk/blob/ad07426fab3396caefd7c08d924e085c1f6f61ba/src/hotspot/share/opto/parse3.cpp#L176-L199). The aforementioned issue introduced a check to skip stack verification of the next bytecode in the code if the execution after the deopted bytecode does not continue at the next bytecode in the code, i.e. falls through to the next bytecode. Unfortunately, this check did not include `areturn` as a bytecode that does not fall-through: > https://github.com/openjdk/jdk/blob/ad07426fab3396caefd7c08d924e085c1f6f61ba/src/hotspot/share/runtime/deoptimization.cpp#L845-L856 > > # Change Summary > > To fix the immediate issue described above, this PR adds `areturn` to the list of bytecodes that does not fall through. However, all return bytecodes exhibit the same behavior and might be susceptible to a similar issue. Even though I was not able to reproduce the same crash with `{d,f,i,l}return` because I could not get those or the preceding bytecode to deopt, I also added them to the `falls_through()` function. For the remaining bytecodes in `falls_through()` with the exception of `athrow` I wrote a regression test. 
> > # Testing > > - [x] [Github Actions](https://github.com/mhaessig/jdk/actions/runs/14595928439) > - [x] tier1 through tier3 on Oracle supported platforms and OSs plus Oracle internal testing > > # Acknowledgements > Special thanks to @eme64 for his hard work on reducing a reproducer that works on all platforms. Manuel Hässig has updated the pull request incrementally with three additional commits since the last revision: - Add lookupswitch and tableswitch - Elaborate why we need that class file version - Reorganized tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25118/files - new: https://git.openjdk.org/jdk/pull/25118/files/53cec97e..564d2fca Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25118&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25118&range=00-01 Stats: 443 lines in 7 files changed: 198 ins; 245 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25118.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25118/head:pull/25118 PR: https://git.openjdk.org/jdk/pull/25118 From iklam at openjdk.org Fri May 9 15:35:52 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 9 May 2025 15:35:52 GMT Subject: RFR: 8354969: Add strdup function for ResourceArea In-Reply-To: References: <7RWd7cVKqTbDkFyVdiLyHLFIUAwiSOMipKzGny-QRH8=.5c55c562-3742-4dd8-9131-73c5140cdf86@github.com> Message-ID: <_CtvpxxLwQUAQ0r8yS4htgUeZ24IPHwuI4QUazcV9Yc=.661c04c9-5480-48f8-aec0-d3a66ba38ed5@github.com> On Fri, 9 May 2025 02:23:28 GMT, Ioi Lam wrote: >> Do we have any immediate candidate uses for this? Whilst in the past Hotspot code had a number of utility API's, these days there is a tendency to delete any unused code. So this really needs to have an imminent usage (@iklam ?) . >> >> Code change itself looks fine. Thanks > >> Do we have any immediate candidate uses for this? Whilst in the past Hotspot code had a number of utility API's, these days there is a tendency to delete any unused code.
So this really needs to have an imminent usage (@iklam ?) .
>>
>> Code change itself looks fine. Thanks
>
> I did this and found a few places, but there could be more
>
>
> find . -name *pp | xargs grep -l strcpy | \
>   xargs egrep -l '(RESOURCE.*char)|(resource_allocate_bytes)' | \
>   xargs grep strcpy
>
>
> https://github.com/openjdk/jdk/blob/9a0e6f338f34fb5da16d5f9eb710cdddd4302945/src/hotspot/share/prims/jvmtiEnvBase.cpp#L471-L472
> https://github.com/openjdk/jdk/blob/9a0e6f338f34fb5da16d5f9eb710cdddd4302945/src/hotspot/share/classfile/classLoader.cpp#L1513-L1514
> https://github.com/openjdk/jdk/blob/9a0e6f338f34fb5da16d5f9eb710cdddd4302945/src/hotspot/share/classfile/modules.cpp#L642-L643

> Thanks @iklam ! So should we convert those places as part of this PR?

I think we should do that. @toxaart could you see if there are other cases that can also be converted?

------------- PR Comment: https://git.openjdk.org/jdk/pull/24998#issuecomment-2867012436 From mark.reinhold at oracle.com Fri May 9 15:49:40 2025 From: mark.reinhold at oracle.com (Mark Reinhold) Date: Fri, 9 May 2025 15:49:40 +0000 Subject: New candidate JEP: 516: Ahead-of-Time Object Caching with Any GC Message-ID: <20250509154939.9B501814BE6@eggemoggin.niobe.net> https://openjdk.org/jeps/516 Summary: Enhance the ahead-of-time cache, which enables the HotSpot Java Virtual Machine to improve startup and warmup time, so that it can be used with any garbage collector, including the low-latency Z Garbage Collector (ZGC). Achieve this by making it possible to load cached Java objects sequentially into memory from a neutral, GC-agnostic format, rather than map them directly into memory in a GC-specific format.
- Mark From kbarrett at openjdk.org Fri May 9 15:59:38 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 9 May 2025 15:59:38 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v6] In-Reply-To: References: Message-ID: <5D6vakt8Q41_YF90LaGoxI0tECxo3hm_fiMCuXrpf-w=.363ecf9a-9421-482d-a101-a7ec1efd8b8e@github.com> > Please review this change which adds a native method providing the > implementation of Reference::get. Reference::get is an intrinsic candidate, so > this native method implementation is only used when the intrinsic is not. > > Currently there is intrinsic support by the interpreter, C1, C2, and graal, > which are always used. With this change we can later remove all the > per-platform interpreter intrinsic implementations, and might also remove the > C1 intrinsic implementation. > > Testing: > (1) mach5 tier1-6 normal (so using all the existing intrinsics). > (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains 10 additional commits since the last revision: - Merge branch 'master' into native-reference-get - use new waitForRefProc, some tidying - Merge branch 'master' into native-reference-get - remove timeout by using waitForReferenceProcessing - make ill-timed gc in non-concurrent case less likely - fix test package use - add package decl to test - parameterized return type of native get0 - test native method - native Reference.get helper ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24315/files - new: https://git.openjdk.org/jdk/pull/24315/files/48b7960c..6b4e4c76 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24315&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24315&range=04-05 Stats: 6281 lines in 341 files changed: 3313 ins; 1807 del; 1161 mod Patch: https://git.openjdk.org/jdk/pull/24315.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24315/head:pull/24315 PR: https://git.openjdk.org/jdk/pull/24315 From shade at openjdk.org Fri May 9 16:02:54 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 9 May 2025 16:02:54 GMT Subject: RFR: 8356027: Print enhanced compilation timings [v8] In-Reply-To: References: Message-ID: On Wed, 7 May 2025 11:51:04 GMT, Aleksey Shipilev wrote: >> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include: >> 1. Time spent before queuing: shows the compilation queue bottlenecks >> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load >> 3. Time spent actually compiling: shows the per-method compilation costs >> >> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028). 
>> >> The difference from the output format we ship in Leyden: >> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such. >> 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations. >> >> See the sample `-XX:+PrintCompilation` output in the comments. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Do microseconds for timings Well, this PR is essentially paying for our diagnostics debt. We know this logging is useful for both humans, and with [JDK-8356383](https://bugs.openjdk.org/browse/JDK-8356383) also for tools. So we make the output useful for both. In fact, this is coming from Leyden, where the rich compilation logs were the nice diagnostic addition, which we (well, at least myself) used extensively to understand the compiler dynamics. I agree UL is a more convenient vehicle for producing tool-readable files. This is why this PR handles UL as well. That said, `-XX:+PrintCompilation` is still a go-to tool for watching the compiler activity. Hiding good output behind yet another flag feels dubious to me. Adjusting `-XX:+PrintCompilation` output to capture most useful parts also simplifies our logging: we push the same strings to `tty` and `UL`, we do not need `Verbose` (debug only, btw, not accessible in the field) and `PrintCompilation2` flags, etc. So, I remain a believer this is a right and useful thing to do. 
I also note that -- as practical example -- AFAICS in Leyden this rich diagnostic logging was implemented for `PrintCompilation` while `LogCompilation` was kept intact. Which kinda tells which facility people actually use more :) I can put new stuff in `LogCompilation` output as well in this PR, BTW, if you want. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2867091532 From dfuchs at openjdk.org Fri May 9 16:01:51 2025 From: dfuchs at openjdk.org (Daniel Fuchs) Date: Fri, 9 May 2025 16:01:51 GMT Subject: RFR: 8356644: Update encoding declaration to UTF-8 In-Reply-To: <8loaLnxoQ6Om5EqhX9_nORypM5UjgVz3DYJnMinZ77w=.bd323a79-0fd6-4b16-8edb-193fec7fbb13@github.com> References: <8loaLnxoQ6Om5EqhX9_nORypM5UjgVz3DYJnMinZ77w=.bd323a79-0fd6-4b16-8edb-193fec7fbb13@github.com> Message-ID: On Fri, 9 May 2025 14:14:57 GMT, Magnus Ihse Bursie wrote: > A handful of html and xml files in the JDK source tree claims to have encodings like `ISO-8859-1`, when they are in fact pure US-ASCII files. > > While perhaps technically correct, this is misleading, and goes contrary to the efforts of turning the source code into UTF-8 proper. > > I chose between marking them as "ASCII" and "UTF-8", but chose the latter, since otherwise if they ever were to be updated with a non-ASCII character, the value would have been unspecified, and after JDK-8301971, all files in the JDK repository will be interpreted as UTF-8. Changes to net-properties.html LGTM ------------- PR Review: https://git.openjdk.org/jdk/pull/25148#pullrequestreview-2828873053 From mchevalier at openjdk.org Fri May 9 16:29:03 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Fri, 9 May 2025 16:29:03 GMT Subject: RFR: 8351958: Some compile commands should be made diagnostic Message-ID: Error when using a `CompileCommand` that is an alias for a diagnostic option when `-XX:+UnlockDiagnosticVMOptions` is not provided. The argument processing works this way: 1. 
Flags are parsed, setting the value accordingly. For `CompileCommand`, each option is added to a `\n`-separated string. At this step, if a flag is diagnostic but `-XX:+UnlockDiagnosticVMOptions` is not provided, then an error message is emitted, argument parsing fails and the VM terminates. Yet, the value of `CompileCommand` is still an unparsed list of strings. 2. Eventually, `CompileCommand` is parsed. For some of the options, the value of the regular flag is used as the default value, and as far as I know, it's the only mapping between `CompileCommand` and the equivalent flag. Moreover, at this point, the order of the various command line arguments is lost: it is not possible to know which `CompileCommand` comes before or after the `-XX:+UnlockDiagnosticVMOptions`. Moreover, `CompileCommand`s are parsed in the same way as compiler directives coming from a file. If we complain about a diagnostic `CompileCommand`, we should also complain when the option comes from a directive file, for consistency. But then, while the relative order of `CompileCommand` and `-XX:+UnlockDiagnosticVMOptions` is lost, it simply makes no sense to compare the ordering of command line arguments and directives from a file. So, given the difficulty and the limited sense of such a check, I opted to ignore the ordering requirement. And by using the `CompileCommand` error reporting mechanism, we get an error that is consistent with other `CompileCommand`-parsing related errors, e.g. CompileCommand: An error occurred during parsing Error: VM option 'PrintAssembly' is diagnostic and must be enabled via -XX:+UnlockDiagnosticVMOptions. Line: 'PrintAssembly,*::*' Usage: '-XX:CompileCommand=