From dholmes at openjdk.org Tue Jan 2 01:10:48 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 2 Jan 2024 01:10:48 GMT Subject: RFR: 8322765: Eliminate -Wparentheses warnings in runtime code In-Reply-To: References: Message-ID: <-tBHgM0UgGcY3N-VN6a52Crwe3-FID-Yq884Ck8V39k=.3f57604e-ad74-4611-be43-d85c832ac8f4@github.com> On Fri, 29 Dec 2023 06:27:43 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: mach5 tier1 > > Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses > and other changes needed to make that work. Looks good and trivial IMO. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17201#pullrequestreview-1799893692 From gcao at openjdk.org Tue Jan 2 01:50:57 2024 From: gcao at openjdk.org (Gui Cao) Date: Tue, 2 Jan 2024 01:50:57 GMT Subject: RFR: 8322583: RISC-V: Enable fast class initialization checks Message-ID: Hi, Please review this small change enabling fast class initialization checks on linux-riscv. As discussed on [1], we noticed that VM_Version::supports_fast_class_init_checks is false on linux-riscv64 platform. But the needed code of this optimization on this platform is there which is supposed to solve a performance issue https://bugs.openjdk.org/browse/JDK-8219233 at that time. I found that this performance issue is still reproducible on linux-riscv64. And the original performance issue reported by JDK-8219233 is resolved when this optimization is enabled (x2.5 improvement for reported case on linux-riscv64). Once this is in, I'd like to request approval for backporting to JDK 21u and JDK 17u since corresponding fixes for the other ports are already there, unless the community feels otherwise. 
[1] https://github.com/openjdk/jdk/pull/17006#issuecomment-1865582796 OpenJDK23 linux-riscv64 when setting VM_Version::supports_fast_class_init_checks returns false(default): $ clojure -A:user "Elapsed time: 91597.717028 msecs" $ clojure -A:user "Elapsed time: 91851.01435 msecs" $ clojure -A:user "Elapsed time: 92149.106378 msecs" $ clojure foo.clj "Elapsed time: 35663.36249 msecs" $ clojure foo.clj "Elapsed time: 35677.387338 msecs" $ clojure foo.clj "Elapsed time: 35253.330701 msecs" OpenJDK23 linux-riscv64 when setting VM_Version::supports_fast_class_init_checks returns true: $ clojure -A:user "Elapsed time: 37425.063258 msecs" $ clojure -A:user "Elapsed time: 36338.252422 msecs" $ clojure -A:user "Elapsed time: 37441.73001 msecs" $ clojure foo.clj "Elapsed time: 36018.857489 msecs" $ clojure foo.clj "Elapsed time: 34750.533297 msecs" $ clojure foo.clj "Elapsed time: 35499.890121 msecs" ### Testing: - [x] Run tier1-3 tests on qemu 8.1.0 with UseRVV (fastdebug) - [x] Run tier1-3 tests on qemu 8.1.0 with UseRVV (release) - [x] Run tier1-3, hotspot:tier4 tests with SiFive unmatched (release) ------------- Commit messages: - 8322583: RISC-V: Enable fast class initialization checks Changes: https://git.openjdk.org/jdk/pull/17192/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17192&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322583 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17192.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17192/head:pull/17192 PR: https://git.openjdk.org/jdk/pull/17192 From rehn at openjdk.org Tue Jan 2 01:50:57 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 2 Jan 2024 01:50:57 GMT Subject: RFR: 8322583: RISC-V: Enable fast class initialization checks In-Reply-To: References: Message-ID: On Tue, 26 Dec 2023 07:14:22 GMT, Gui Cao wrote: > Hi, Please review this small change enabling fast class initialization checks on linux-riscv. 
As discussed on [1], we noticed that VM_Version::supports_fast_class_init_checks is false on linux-riscv64 platform. But the needed code of this optimization on this platform is there which is supposed to solve a performance issue https://bugs.openjdk.org/browse/JDK-8219233 at that time. I found that this performance issue is still reproducible on linux-riscv64. And the original performance issue reported by JDK-8219233 is resolved when this optimization is enabled (x2.5 improvement for reported case on linux-riscv64). > > Once this is in, I'd like to request approval for backporting to JDK 21u and JDK 17u since corresponding fixes for the other ports are already there, unless the community feels otherwise. > > [1] https://github.com/openjdk/jdk/pull/17006#issuecomment-1865582796 > > OpenJDK23 linux-riscv64 when setting VM_Version::supports_fast_class_init_checks returns false(default): > > > $ clojure -A:user > "Elapsed time: 91597.717028 msecs" > $ clojure -A:user > "Elapsed time: 91851.01435 msecs" > $ clojure -A:user > "Elapsed time: 92149.106378 msecs" > $ clojure foo.clj > "Elapsed time: 35663.36249 msecs" > $ clojure foo.clj > "Elapsed time: 35677.387338 msecs" > $ clojure foo.clj > "Elapsed time: 35253.330701 msecs" > > > OpenJDK23 linux-riscv64 when setting VM_Version::supports_fast_class_init_checks returns true: > > > $ clojure -A:user > "Elapsed time: 37425.063258 msecs" > $ clojure -A:user > "Elapsed time: 36338.252422 msecs" > $ clojure -A:user > "Elapsed time: 37441.73001 msecs" > $ clojure foo.clj > "Elapsed time: 36018.857489 msecs" > $ clojure foo.clj > "Elapsed time: 34750.533297 msecs" > $ clojure foo.clj > "Elapsed time: 35499.890121 msecs" > > > ### Testing: > > - [x] Run tier1-3 tests on qemu 8.1.0 with UseRVV (fastdebug) > - [x] Run tier1-3 tests on qemu 8.1.0 with UseRVV (release) > - [x] Run tier1-3, hotspot:tier4 tests with SiFive unmatched (release) Thanks! Oh, it was only in draft :) ------------- Marked as reviewed by rehn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17192#pullrequestreview-1797100118 From fyang at openjdk.org Tue Jan 2 02:00:48 2024 From: fyang at openjdk.org (Fei Yang) Date: Tue, 2 Jan 2024 02:00:48 GMT Subject: RFR: 8322583: RISC-V: Enable fast class initialization checks In-Reply-To: References: Message-ID: On Tue, 26 Dec 2023 07:14:22 GMT, Gui Cao wrote: > Hi, Please review this small change enabling fast class initialization checks on linux-riscv. As discussed on [1], we noticed that VM_Version::supports_fast_class_init_checks is false on linux-riscv64 platform. But the needed code of this optimization on this platform is there which is supposed to solve a performance issue https://bugs.openjdk.org/browse/JDK-8219233 at that time. I found that this performance issue is still reproducible on linux-riscv64. And the original performance issue reported by JDK-8219233 is resolved when this optimization is enabled (x2.5 improvement for reported case on linux-riscv64). > > Once this is in, I'd like to request approval for backporting to JDK 21u and JDK 17u since corresponding fixes for the other ports are already there, unless the community feels otherwise. 
> > [1] https://github.com/openjdk/jdk/pull/17006#issuecomment-1865582796 > > OpenJDK23 linux-riscv64 when setting VM_Version::supports_fast_class_init_checks returns false(default): > > > $ clojure -A:user > "Elapsed time: 91597.717028 msecs" > $ clojure -A:user > "Elapsed time: 91851.01435 msecs" > $ clojure -A:user > "Elapsed time: 92149.106378 msecs" > $ clojure foo.clj > "Elapsed time: 35663.36249 msecs" > $ clojure foo.clj > "Elapsed time: 35677.387338 msecs" > $ clojure foo.clj > "Elapsed time: 35253.330701 msecs" > > > OpenJDK23 linux-riscv64 when setting VM_Version::supports_fast_class_init_checks returns true: > > > $ clojure -A:user > "Elapsed time: 37425.063258 msecs" > $ clojure -A:user > "Elapsed time: 36338.252422 msecs" > $ clojure -A:user > "Elapsed time: 37441.73001 msecs" > $ clojure foo.clj > "Elapsed time: 36018.857489 msecs" > $ clojure foo.clj > "Elapsed time: 34750.533297 msecs" > $ clojure foo.clj > "Elapsed time: 35499.890121 msecs" > > > ### Testing: > > - [x] Run tier1-3 tests on qemu 8.1.0 with UseRVV (fastdebug) > - [x] Run tier1-3 tests on qemu 8.1.0 with UseRVV (release) > - [x] Run tier1-3, hotspot:tier4 tests with SiFive unmatched (release) Marked as reviewed by fyang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17192#pullrequestreview-1799903545 From kbarrett at openjdk.org Tue Jan 2 02:55:01 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Jan 2024 02:55:01 GMT Subject: RFR: 8322806: Eliminate -Wparentheses warnings in aarch64 code Message-ID: Please review this change to eliminate some -Wparentheses warnings. This involved simply adding a few parentheses to make some implicit operator precedence explicit. Testing: mach5 tier1 Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses and other changes needed to make that work. 
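For readers who have not met this warning before, here is a minimal sketch of the kind of expression that -Wparentheses flags and of the fix described above. The helper names are hypothetical and are not taken from the patch; the pragmas are only there so the "before" form compiles quietly in this example.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these names come from
// the HotSpot patch under review.

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wparentheses"
// Relies on && binding tighter than ||. With -Wparentheses enabled, GCC
// suggests adding parentheses around the && operand.
constexpr bool should_log_implicit(bool forced, bool enabled, bool verbose) {
  return forced || enabled && verbose;
}

// Relies on + binding tighter than <<. GCC suggests parentheses around
// the + operand here as well.
constexpr uint32_t field_bit_implicit(uint32_t base, uint32_t index) {
  return 1u << base + index;
}
#pragma GCC diagnostic pop

// Same behavior with the precedence spelled out, so no warning is issued.
constexpr bool should_log_explicit(bool forced, bool enabled, bool verbose) {
  return forced || (enabled && verbose);
}

constexpr uint32_t field_bit_explicit(uint32_t base, uint32_t index) {
  return 1u << (base + index);
}

// The added parentheses do not change the result.
static_assert(should_log_implicit(false, true, false) ==
              should_log_explicit(false, true, false), "same result");
static_assert(field_bit_implicit(2, 3) == field_bit_explicit(2, 3), "same result");
```

Because the parentheses only restate what the grammar already does, a change of this shape should not alter generated code, which is presumably why it can be considered trivial.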
------------- Commit messages: - fix -Wparentheses warnings in aarch64 code Changes: https://git.openjdk.org/jdk/pull/17210/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17210&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322806 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17210.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17210/head:pull/17210 PR: https://git.openjdk.org/jdk/pull/17210 From kbarrett at openjdk.org Tue Jan 2 03:09:14 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Jan 2024 03:09:14 GMT Subject: RFR: 8322765: Eliminate -Wparentheses warnings in runtime code [v2] In-Reply-To: References: Message-ID: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: mach5 tier1 > > Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses > and other changes needed to make that work. 
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: update copyrights for new year ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17201/files - new: https://git.openjdk.org/jdk/pull/17201/files/d690cd67..da5f0bb0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17201&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17201&range=00-01 Stats: 5 lines in 5 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/17201.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17201/head:pull/17201 PR: https://git.openjdk.org/jdk/pull/17201 From kbarrett at openjdk.org Tue Jan 2 03:09:14 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Jan 2024 03:09:14 GMT Subject: RFR: 8322765: Eliminate -Wparentheses warnings in runtime code [v2] In-Reply-To: <-tBHgM0UgGcY3N-VN6a52Crwe3-FID-Yq884Ck8V39k=.3f57604e-ad74-4611-be43-d85c832ac8f4@github.com> References: <-tBHgM0UgGcY3N-VN6a52Crwe3-FID-Yq884Ck8V39k=.3f57604e-ad74-4611-be43-d85c832ac8f4@github.com> Message-ID: On Tue, 2 Jan 2024 01:07:29 GMT, David Holmes wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> update copyrights for new year > > Looks good and trivial IMO. > > Thanks Thanks @dholmes-ora for review. I waffled about suggesting this is a trivial change, and will accept your suggestion that it is. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17201#issuecomment-1873586023 From kbarrett at openjdk.org Tue Jan 2 03:09:14 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Jan 2024 03:09:14 GMT Subject: Integrated: 8322765: Eliminate -Wparentheses warnings in runtime code In-Reply-To: References: Message-ID: On Fri, 29 Dec 2023 06:27:43 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. 
This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: mach5 tier1 > > Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses > and other changes needed to make that work. This pull request has now been integrated. Changeset: 7c1d481d Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/7c1d481d6ddeb67118abbdc909884f4793343fee Stats: 14 lines in 5 files changed: 0 ins; 0 del; 14 mod 8322765: Eliminate -Wparentheses warnings in runtime code Reviewed-by: dholmes ------------- PR: https://git.openjdk.org/jdk/pull/17201 From kbarrett at openjdk.org Tue Jan 2 05:53:54 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Jan 2024 05:53:54 GMT Subject: RFR: 8322805: Eliminate -Wparentheses warnings in x86 code Message-ID: Please review this change to eliminate some -Wparentheses warnings. This involved simply adding a few parentheses to make some implicit operator precedence explicit. Testing: mach5 tier1 Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses and other changes needed to make that work. ------------- Commit messages: - fix -Wparentheses warnings in x86 code Changes: https://git.openjdk.org/jdk/pull/17211/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17211&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322805 Stats: 17 lines in 6 files changed: 0 ins; 0 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/17211.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17211/head:pull/17211 PR: https://git.openjdk.org/jdk/pull/17211 From rehn at openjdk.org Tue Jan 2 06:49:47 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 2 Jan 2024 06:49:47 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions [v3] In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 09:57:10 GMT, Robbin Ehn wrote: >> Hi, this is the instructions for zcb. 
>> >> Due to our lack of infrastructure, having multiple extension-dependent instructions does not fit well. >> Some of these compressed instructions are also missing a 1-to-1 mapping, e.g. now we have a compressed not, but the corresponding instruction in uncompressed is still xor. >> I think we need to do some rework here. >> >> I also don't like the macro expansion, as it is hopeless in debuggers and 'IDE's (vim+rtags for me). >> (macro stuff was originally done when templates were blacklisted in hotspot) >> >> And I don't want an option for this, as zcb is coming in hwprobe, if you have compressed on you get them if they are supported (may depend on e.g. zbb). >> >> I have done some modifications since it passed tier1, so I'm running stuff over the weekend. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into zcb > - Merge branch 'master' into zcb > - zcb instruction set Passes t1+t2 fastdebug (with some expected timeouts). ------------- PR Comment: https://git.openjdk.org/jdk/pull/17122#issuecomment-1873667517 From fyang at openjdk.org Tue Jan 2 07:08:54 2024 From: fyang at openjdk.org (Fei Yang) Date: Tue, 2 Jan 2024 07:08:54 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v10] In-Reply-To: <-iUbCXbfByEaraLzkSAK34_EewpImIUYRPLztruHOv0=.a6b5cab8-70ab-4bc5-8271-9e64232acfcf@github.com> References: <-iUbCXbfByEaraLzkSAK34_EewpImIUYRPLztruHOv0=.a6b5cab8-70ab-4bc5-8271-9e64232acfcf@github.com> Message-ID: On Fri, 22 Dec 2023 14:10:13 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some tests. 
> > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > fixed lmul Hi, Thanks for the update. Having another look :-) src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3792: > 3790: Register scalarconst, VectorRegister vtemp, VectorRegister vtemp2, VectorRegister v_abef, VectorRegister v_cdgh, > 3791: bool gen_words = true, bool step_const = true) { > 3792: __ vl1reXX_v(vset_sew, vtemp, scalarconst); Shouldn't we use `vleXX_v` to load the constant for each round instead of `vl1reXX_v`? `vl1reXX_v` which delegates work to `vl1re32_v`/vl1re64_v only loads a single vector register and is not aware of the LMUL setting. I see the openssl version is using `vle32_v`/`vle64_v` to load 4 e32/e64 elements of the constants for each round. src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3934: > 3932: // > 3933: // e32/e64: vector of 32b/64b/4B/8B elements > 3934: // m1: LMUL=1 This line of comment needs to be updated to reflect the latest changes. ------------- PR Review: https://git.openjdk.org/jdk/pull/16562#pullrequestreview-1795378708 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1439171039 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1435451464 From dholmes at openjdk.org Tue Jan 2 07:18:37 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 2 Jan 2024 07:18:37 GMT Subject: RFR: 8322805: Eliminate -Wparentheses warnings in x86 code In-Reply-To: References: Message-ID: <_-u5L3BKwMmolyBWsAZ2QIEe_H6bTZVICJFBytUIIfg=.0e574c9a-c375-44e6-a83b-5f77f52474c5@github.com> On Tue, 2 Jan 2024 05:49:06 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: mach5 tier1 > > Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses > and other changes needed to make that work. 
Looks good, and again trivial IMO. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17211#pullrequestreview-1800019347 From kbarrett at openjdk.org Tue Jan 2 08:01:05 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Jan 2024 08:01:05 GMT Subject: RFR: 8322817: RISC-V: Eliminate -Wparentheses warnings in riscv code Message-ID: Please review this change to eliminate some -Wparentheses warnings. This involved simply adding a few parentheses to make some implicit operator precedence explicit. Testing: Local (linux-x64) cross-build for linux-riscv with this change plus -Wparentheses enabled and other changes to allow that to work. Requesting someone from the riscv porters to properly test this. ------------- Commit messages: - fix -Wparentheses warnings in riscv code Changes: https://git.openjdk.org/jdk/pull/17216/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17216&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322817 Stats: 9 lines in 2 files changed: 0 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/17216.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17216/head:pull/17216 PR: https://git.openjdk.org/jdk/pull/17216 From gcao at openjdk.org Tue Jan 2 08:43:47 2024 From: gcao at openjdk.org (Gui Cao) Date: Tue, 2 Jan 2024 08:43:47 GMT Subject: RFR: 8322817: RISC-V: Eliminate -Wparentheses warnings in riscv code In-Reply-To: References: Message-ID: <2ScScXJLJ2Jr8GoRR01rnQXb7nMIDR7w5GKGmf2DPgI=.eec42a0e-21af-4381-8848-8d20d290b9c7@github.com> On Tue, 2 Jan 2024 07:55:59 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: Local (linux-x64) cross-build for linux-riscv with this change plus > -Wparentheses enabled and other changes to allow that to work. 
> > Requesting someone from the riscv porters to properly test this. Hi, I am performing a tier1 test with fastdebug build on linux-riscv64. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17216#issuecomment-1873739697 From fyang at openjdk.org Tue Jan 2 09:13:47 2024 From: fyang at openjdk.org (Fei Yang) Date: Tue, 2 Jan 2024 09:13:47 GMT Subject: RFR: 8322817: RISC-V: Eliminate -Wparentheses warnings in riscv code In-Reply-To: References: Message-ID: On Tue, 2 Jan 2024 07:55:59 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: Local (linux-x64) cross-build for linux-riscv with this change plus > -Wparentheses enabled and other changes to allow that to work. > > Requesting someone from the riscv porters to properly test this. LGTM. Thanks. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17216#pullrequestreview-1800126103 From stefank at openjdk.org Tue Jan 2 09:17:45 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 2 Jan 2024 09:17:45 GMT Subject: RFR: 8322806: Eliminate -Wparentheses warnings in aarch64 code In-Reply-To: References: Message-ID: On Tue, 2 Jan 2024 02:49:09 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: mach5 tier1 > > Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses > and other changes needed to make that work. Seems fine. FWIW, the second usage of `ret` looks redundant. ------------- Marked as reviewed by stefank (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17210#pullrequestreview-1800130145 From kevinw at openjdk.org Tue Jan 2 13:17:46 2024 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 2 Jan 2024 13:17:46 GMT Subject: RFR: 8319948: jcmd man page needs to be updated In-Reply-To: References: Message-ID: <1qdF5nx16Z34Do7b2kX8LPJrIwlhiMZsgBpXtccDqFg=.60daa558-e999-406a-a556-e85f680947f7@github.com> On Tue, 2 Jan 2024 05:51:59 GMT, David Holmes wrote: > Please review these missing updates to the `jcmd` manpage - see JBS issue for details. > > I also fixed the sub-command ordering in a few places and a couple of minor formatting fixes. > > Thanks to @tstuefe for collating the initial information on what was missing. I used the `jcmd help ` option to compare the help text with the manpage text and adjusted the manpage accordingly (and found a few more omissions - some of which will be fixed by follow-up RFEs by the respective owners). Looks good. Trivially can note that here in jcmd.1 VM.flags says "Prints the VM flag..." while diagnosticCommand.hpp says: "Print VM flag...". We could make it consistent, but not important. "Print" may be in the minority compared to "Prints", but we have some of each. 8-) ------------- Marked as reviewed by kevinw (Committer). PR Review: https://git.openjdk.org/jdk/pull/17213#pullrequestreview-1800411408 From stefank at openjdk.org Tue Jan 2 15:36:10 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 2 Jan 2024 15:36:10 GMT Subject: RFR: 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder [v3] In-Reply-To: References: Message-ID: On Mon, 11 Dec 2023 14:06:43 GMT, Stefan Karlsson wrote: >> [JDK-8315097](https://bugs.openjdk.org/browse/JDK-8315097): 'Rename createJavaProcessBuilder' changed the name of the ProcessTools helper functions used to create `ProcessBuilder`s used to spawn new java test processes. >> >> We now have `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. 
The former prepends jvm options from jtreg, while the latter doesn't. >> >> With these functions it is common to see the following pattern in tests: >> >> ProcessBuilder pb = ProcessTools.createTestJavaProcessBuilder(...); >> OutputAnalyzer output = executeProcess(pb); >> >> >> We have a couple of thin wrappers in `ProcessTools` that do exactly this, so that the code can be written as a one-liner: >> >> OutputAnalyzer output = ProcessTools.executeTestJvm(); >> >> >> I propose that we name these functions using the same naming scheme we used for `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. That is, we change `executeTestJvm` to `executeTestJava` and add a new `executeLimitedTestJava` function. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Test cleanup Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17049#issuecomment-1874176578 From stefank at openjdk.org Tue Jan 2 15:36:07 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 2 Jan 2024 15:36:07 GMT Subject: RFR: 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder [v4] In-Reply-To: References: Message-ID: > [JDK-8315097](https://bugs.openjdk.org/browse/JDK-8315097): 'Rename createJavaProcessBuilder' changed the name of the ProcessTools helper functions used to create `ProcessBuilder`s used to spawn new java test processes. > > We now have `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. The former prepends jvm options from jtreg, while the latter doesn't. 
> > With these functions it is common to see the following pattern in tests: > > ProcessBuilder pb = ProcessTools.createTestJavaProcessBuilder(...); > OutputAnalyzer output = executeProcess(pb); > > > We have a couple of thin wrappers in `ProcessTools` that do exactly this, so that the code can be written as a one-liner: > > OutputAnalyzer output = ProcessTools.executeTestJvm(); > > > I propose that we name these functions using the same naming scheme we used for `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. That is, we change `executeTestJvm` to `executeTestJava` and add a new `executeLimitedTestJava` function. Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into rename_executeTestJvm - Test cleanup - Fix impl and add test - 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17049/files - new: https://git.openjdk.org/jdk/pull/17049/files/5d488f42..486dc6d5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17049&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17049&range=02-03 Stats: 5249 lines in 348 files changed: 3069 ins; 973 del; 1207 mod Patch: https://git.openjdk.org/jdk/pull/17049.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17049/head:pull/17049 PR: https://git.openjdk.org/jdk/pull/17049 From pchilanomate at openjdk.org Tue Jan 2 17:14:14 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 2 Jan 2024 17:14:14 GMT Subject: RFR: 8320275: assert(_chunk->bitmap().at(index)) failed: Bit not set at index [v4] In-Reply-To: References: Message-ID: > Please review the following fix. 
The assert fails while verifying the top frame of the stackChunk before returning from a thaw call. The stackChunk is in gc mode but we found a narrow oop for this c2 compiled frame that doesn't have its corresponding bit set. This is because while thawing its callee we cleared the bitmap range associated with the argument area, but this narrow oop happens to land at the very last stack slot of that region. > Loom code assumes the size of the argument area is always a multiple of 2 stack slots, as SharedRuntime::java_calling_convention() shows. But c2 doesn't seem to follow this convention and, knowing the last passed argument only takes one stack slot, it's using the remaining space to store a narrow oop for the caller. There are more details about the specific crash in JBS. > > The initial proposed fix is to just restrict the range of the bitmap we clear by excluding the last stack slot of the argument area, since passed oops are always word aligned. I've also experimented with a patch where I changed SharedRuntime::java_calling_convention() and Fingerprinter::do_type_calling_convention() to not round up the number of stack slots used, and then changed the callers to use a round up value or not depending on the needs [1]. I wasn't convinced it was worthy given we only care about this difference in this Loom code, but I don't mind going with that fix instead. The 3rd alternative would be to just change c2 to not use this stack slot and start spilling at a word aligned offset from the sp. > > I run the patch with the failing test and verified the crash doesn't reproduce anymore. I've also run this patch through loom tiers1-5. > > Thanks, > Patricio > > [1] https://github.com/pchilano/jdk/commit/42ae9269b28beb6f36c502182116545b680e418f Patricio Chilano Mateo has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: - Merge branch 'master' into JDK-8320275 - add comment in clear_bitmap_bits() - add is_aligned assert in stackChunkOopDesc::bit_index_for - remove round up on java_calling_convention - v1 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16837/files - new: https://git.openjdk.org/jdk/pull/16837/files/4f580f5a..917c4b6e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16837&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16837&range=02-03 Stats: 118882 lines in 2468 files changed: 64988 ins; 44550 del; 9344 mod Patch: https://git.openjdk.org/jdk/pull/16837.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16837/head:pull/16837 PR: https://git.openjdk.org/jdk/pull/16837 From kvn at openjdk.org Tue Jan 2 20:23:37 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 2 Jan 2024 20:23:37 GMT Subject: RFR: 8322805: Eliminate -Wparentheses warnings in x86 code In-Reply-To: References: Message-ID: On Tue, 2 Jan 2024 05:49:06 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: mach5 tier1 > > Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses > and other changes needed to make that work. Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17211#pullrequestreview-1800962675 From dholmes at openjdk.org Tue Jan 2 21:51:57 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 2 Jan 2024 21:51:57 GMT Subject: RFR: 8319948: jcmd man page needs to be updated [v2] In-Reply-To: References: Message-ID: > Please review these missing updates to the `jcmd` manpage - see JBS issue for details. > > I also fixed the sub-command ordering in a few places and a couple of minor formatting fixes. 
> > Thanks to @tstuefe for collating the initial information on what was missing. I used the `jcmd help ` option to compare the help text with the manpage text and adjusted the manpage accordingly (and found a few more omissions - some of which will be fixed by follow-up RFEs by the respective owners). David Holmes has updated the pull request incrementally with one additional commit since the last revision: Fix two formatting errors and make VM.flags consistent with help text ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17213/files - new: https://git.openjdk.org/jdk/pull/17213/files/9e47816e..c410c17f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17213&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17213&range=00-01 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/17213.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17213/head:pull/17213 PR: https://git.openjdk.org/jdk/pull/17213 From dholmes at openjdk.org Tue Jan 2 22:01:11 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 2 Jan 2024 22:01:11 GMT Subject: RFR: 8319948: jcmd man page needs to be updated [v3] In-Reply-To: References: Message-ID: > Please review these missing updates to the `jcmd` manpage - see JBS issue for details. > > I also fixed the sub-command ordering in a few places and a couple of minor formatting fixes. > > Thanks to @tstuefe for collating the initial information on what was missing. I used the `jcmd help ` option to compare the help text with the manpage text and adjusted the manpage accordingly (and found a few more omissions - some of which will be fixed by follow-up RFEs by the respective owners). 
David Holmes has updated the pull request incrementally with one additional commit since the last revision: Consistency pass to match manpage text with help text for VM subcommands ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17213/files - new: https://git.openjdk.org/jdk/pull/17213/files/c410c17f..39cbf52a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17213&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17213&range=01-02 Stats: 13 lines in 1 file changed: 0 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/17213.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17213/head:pull/17213 PR: https://git.openjdk.org/jdk/pull/17213 From dholmes at openjdk.org Tue Jan 2 22:01:11 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 2 Jan 2024 22:01:11 GMT Subject: RFR: 8319948: jcmd man page needs to be updated [v3] In-Reply-To: <1qdF5nx16Z34Do7b2kX8LPJrIwlhiMZsgBpXtccDqFg=.60daa558-e999-406a-a556-e85f680947f7@github.com> References: <1qdF5nx16Z34Do7b2kX8LPJrIwlhiMZsgBpXtccDqFg=.60daa558-e999-406a-a556-e85f680947f7@github.com> Message-ID: On Tue, 2 Jan 2024 13:15:22 GMT, Kevin Walls wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Consistency pass to match manpage text with help text for VM subcommands > > Looks good. > > Trivially can note that here in jcmd.1 VM.flags says "Prints the VM flag..." while diagnosticCommand.hpp says: "Print VM flag...". > We could make it consistent, but not important. > "Print" maybe in the minority compared to "Prints", but we have some of each. 8-) Thanks for the review @kevinjwalls . I did a consistency sweep over the VM subcommands and changed a lot of `Prints` to `Print` so the manpage matches the help text. Which form is better grammatically I'm not sure, but if we want to change things we should do that for the help text and manpage at the same time. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17213#issuecomment-1874615668 From pchilanomate at openjdk.org Tue Jan 2 22:24:45 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 2 Jan 2024 22:24:45 GMT Subject: Integrated: 8320275: assert(_chunk->bitmap().at(index)) failed: Bit not set at index In-Reply-To: References: Message-ID: On Tue, 28 Nov 2023 00:09:10 GMT, Patricio Chilano Mateo wrote: > Please review the following fix. The assert fails while verifying the top frame of the stackChunk before returning from a thaw call. The stackChunk is in gc mode but we found a narrow oop for this c2 compiled frame that doesn't have its corresponding bit set. This is because while thawing its callee we cleared the bitmap range associated with the argument area, but this narrow oop happens to land at the very last stack slot of that region. > Loom code assumes the size of the argument area is always a multiple of 2 stack slots, as SharedRuntime::java_calling_convention() shows. But c2 doesn't seem to follow this convention and, knowing the last passed argument only takes one stack slot, it's using the remaining space to store a narrow oop for the caller. There are more details about the specific crash in JBS. > > The initial proposed fix is to just restrict the range of the bitmap we clear by excluding the last stack slot of the argument area, since passed oops are always word aligned. I've also experimented with a patch where I changed SharedRuntime::java_calling_convention() and Fingerprinter::do_type_calling_convention() to not round up the number of stack slots used, and then changed the callers to use a round up value or not depending on the needs [1]. I wasn't convinced it was worthy given we only care about this difference in this Loom code, but I don't mind going with that fix instead. The 3rd alternative would be to just change c2 to not use this stack slot and start spilling at a word aligned offset from the sp. 
> > I run the patch with the failing test and verified the crash doesn't reproduce anymore. I've also run this patch through loom tiers1-5. > > Thanks, > Patricio > > [1] https://github.com/pchilano/jdk/commit/42ae9269b28beb6f36c502182116545b680e418f This pull request has now been integrated. Changeset: e9e694f4 Author: Patricio Chilano Mateo URL: https://git.openjdk.org/jdk/commit/e9e694f4ef7b080d7fe1ad5b2f2daa2fccd0456e Stats: 81 lines in 17 files changed: 38 ins; 9 del; 34 mod 8320275: assert(_chunk->bitmap().at(index)) failed: Bit not set at index Reviewed-by: dlong, fparain ------------- PR: https://git.openjdk.org/jdk/pull/16837 From kbarrett at openjdk.org Tue Jan 2 22:50:06 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Jan 2024 22:50:06 GMT Subject: RFR: 8322805: Eliminate -Wparentheses warnings in x86 code [v2] In-Reply-To: References: Message-ID: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: mach5 tier1 > > Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses > and other changes needed to make that work. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: - Merge branch 'master' into x86-wparentheses - fix -Wparentheses warnings in x86 code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17211/files - new: https://git.openjdk.org/jdk/pull/17211/files/a6aaba0d..78e5afde Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17211&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17211&range=00-01 Stats: 801 lines in 62 files changed: 530 ins; 40 del; 231 mod Patch: https://git.openjdk.org/jdk/pull/17211.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17211/head:pull/17211 PR: https://git.openjdk.org/jdk/pull/17211 From kbarrett at openjdk.org Tue Jan 2 22:50:06 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Jan 2024 22:50:06 GMT Subject: RFR: 8322805: Eliminate -Wparentheses warnings in x86 code [v2] In-Reply-To: <_-u5L3BKwMmolyBWsAZ2QIEe_H6bTZVICJFBytUIIfg=.0e574c9a-c375-44e6-a83b-5f77f52474c5@github.com> References: <_-u5L3BKwMmolyBWsAZ2QIEe_H6bTZVICJFBytUIIfg=.0e574c9a-c375-44e6-a83b-5f77f52474c5@github.com> Message-ID: On Tue, 2 Jan 2024 07:15:41 GMT, David Holmes wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: >> >> - Merge branch 'master' into x86-wparentheses >> - fix -Wparentheses warnings in x86 code > > Looks good, and again trivial IMO. > > Thanks Thanks for reviews @dholmes-ora and @vnkozlov . 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17211#issuecomment-1874654490 From kbarrett at openjdk.org Tue Jan 2 22:50:07 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Jan 2024 22:50:07 GMT Subject: Integrated: 8322805: Eliminate -Wparentheses warnings in x86 code In-Reply-To: References: Message-ID: On Tue, 2 Jan 2024 05:49:06 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: mach5 tier1 > > Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses > and other changes needed to make that work. This pull request has now been integrated. Changeset: a6784169 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/a678416994b4efe6e1e659bd247674bea1350905 Stats: 17 lines in 6 files changed: 0 ins; 0 del; 17 mod 8322805: Eliminate -Wparentheses warnings in x86 code Reviewed-by: dholmes, kvn ------------- PR: https://git.openjdk.org/jdk/pull/17211 From cslucas at openjdk.org Tue Jan 2 23:15:52 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 2 Jan 2024 23:15:52 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 19:23:47 GMT, Vladimir Kozlov wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: >> >> - Merge with origin/master >> - Fix build, copyright dates, m4 files. >> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - Catch up with changes on master >> - Reuse same C2_MacroAssembler object to emit instructions. > > src/hotspot/cpu/x86/x86_32.ad line 1541: > >> 1539: // in the MacroAssembler. 
Should go away once all "instruct" are >> 1540: // patched to emit bytes only using methods in MacroAssembler. >> 1541: enc_class SetInstMark %{ > > Do you have separate RFE for that? Created this one: https://bugs.openjdk.org/browse/JDK-8322876 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16484#discussion_r1439937293 From dholmes at openjdk.org Wed Jan 3 01:41:48 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 3 Jan 2024 01:41:48 GMT Subject: RFR: 8322806: Eliminate -Wparentheses warnings in aarch64 code In-Reply-To: References: Message-ID: On Tue, 2 Jan 2024 02:49:09 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: mach5 tier1 > > Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses > and other changes needed to make that work. LGTM. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17210#pullrequestreview-1801278687 From gcao at openjdk.org Wed Jan 3 01:49:36 2024 From: gcao at openjdk.org (Gui Cao) Date: Wed, 3 Jan 2024 01:49:36 GMT Subject: RFR: 8322817: RISC-V: Eliminate -Wparentheses warnings in riscv code In-Reply-To: References: Message-ID: On Tue, 2 Jan 2024 07:55:59 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: Local (linux-x64) cross-build for linux-riscv with this change plus > -Wparentheses enabled and other changes to allow that to work. > > Requesting someone from the riscv porters to properly test this. 
Hi, tier1 passed on qemu 8.1.0 with UseRVV (fastdebug) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17216#issuecomment-1874758796 From fyang at openjdk.org Wed Jan 3 06:59:41 2024 From: fyang at openjdk.org (Fei Yang) Date: Wed, 3 Jan 2024 06:59:41 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions [v3] In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 09:57:10 GMT, Robbin Ehn wrote: >> Hi, these are the instructions for zcb. >> >> Due to our lack of infrastructure, having multiple extension-dependent instructions does not fit well. >> Some of these compressed instructions are also missing a 1-to-1 mapping, e.g. now we have a compressed not, but the corresponding uncompressed instruction is still xor. >> I think we need to do some rework here. >> >> I also don't like the macro expansion, as it is hopeless in debuggers and 'IDE's (vim+rtags for me). >> (macro stuff was originally done when templates were blacklisted in hotspot) >> >> And I don't want an option for this, as zcb is coming in hwprobe, if you have compressed on you get them if they are supported (may depend on e.g. zbb). >> >> I have done some modifications since it passed tier1, so I'm running stuff over the weekend. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into zcb > - Merge branch 'master' into zcb > - zcb instruction set Seems fine. I only have some minor comments. src/hotspot/cpu/riscv/assembler_riscv.hpp line 542: > 540: INSN(_lbu, 0b0000011, 0b100); // Zcb > 541: INSN(_lh, 0b0000011, 0b001); // Zcb > 542: INSN(_lhu, 0b0000011, 0b101); // Zcb The code comments on these three lines seem a bit misleading. These are normal load/store instructions with a 4-byte encoding, not `Zcb` compressed instructions.
src/hotspot/cpu/riscv/assembler_riscv.hpp line 2962: > 2960: } > 2961: > 2962: // Format CU, c.[sz]ext.*, c.no Nit: s/c.no/c.not/ src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 497: > 495: inline void zext_b(Register Rd, Register Rs) { > 496: if (do_compress_zcb(Rd, Rs) && > 497: (Rd == Rs)) { Nit: Maybe put the two conditions on the same line to be consistent in style with the other two, `notr` and `zext_w`, in the same file. `if (do_compress_zcb(Rd, Rs) && (Rd == Rs)) {` ------------- PR Review: https://git.openjdk.org/jdk/pull/17122#pullrequestreview-1801313328 PR Review Comment: https://git.openjdk.org/jdk/pull/17122#discussion_r1440102490 PR Review Comment: https://git.openjdk.org/jdk/pull/17122#discussion_r1440113088 PR Review Comment: https://git.openjdk.org/jdk/pull/17122#discussion_r1440026254 From alanb at openjdk.org Wed Jan 3 07:13:48 2024 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 3 Jan 2024 07:13:48 GMT Subject: RFR: 8319948: jcmd man page needs to be updated [v3] In-Reply-To: References: Message-ID: On Tue, 2 Jan 2024 22:01:11 GMT, David Holmes wrote: >> Please review these missing updates to the `jcmd` manpage - see JBS issue for details. >> >> I also fixed the sub-command ordering in a few places and a couple of minor formatting fixes. >> >> Thanks to @tstuefe for collating the initial information on what was missing. I used the `jcmd help ` option to compare the help text with the manpage text and adjusted the manpage accordingly (and found a few more omissions - some of which will be fixed by follow-up RFEs by the respective owners). > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Consistency pass to match manpage text with help text for VM subcommands src/jdk.jcmd/share/man/jcmd.1 line 199: > 197: .TP > 198: \f[V]Compiler.directives_add\f[R] \f[I]filename\f[R] \f[I]arguments\f[R] > 199: Adds compiler directives from a file.
I might be misreading this one: it looks like there are two arguments but the help output shows just one; maybe check that one. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17213#discussion_r1440134308 From rehn at openjdk.org Wed Jan 3 07:14:55 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 3 Jan 2024 07:14:55 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v10] In-Reply-To: References: <-iUbCXbfByEaraLzkSAK34_EewpImIUYRPLztruHOv0=.a6b5cab8-70ab-4bc5-8271-9e64232acfcf@github.com> Message-ID: On Tue, 2 Jan 2024 05:58:13 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed lmul > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3792: > >> 3790: Register scalarconst, VectorRegister vtemp, VectorRegister vtemp2, VectorRegister v_abef, VectorRegister v_cdgh, >> 3791: bool gen_words = true, bool step_const = true) { >> 3792: __ vl1reXX_v(vset_sew, vtemp, scalarconst); > > Shouldn't we use `vleXX_v` to load the constant for each round instead of `vl1reXX_v`? `vl1reXX_v`, which delegates work to `vl1re32_v`/`vl1re64_v`, only loads a single vector register and is not aware of the LMUL setting. I see the openssl version is using `vle32_v`/`vle64_v` to load 4 e32/e64 elements of the constants for each round. Thank you, this revealed an issue in my testing. The qemu vlen setting was not set properly, as it should not have passed with 4xe64 with vlen 128. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1440134906 From dholmes at openjdk.org Wed Jan 3 07:19:03 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 3 Jan 2024 07:19:03 GMT Subject: RFR: 8319948: jcmd man page needs to be updated [v4] In-Reply-To: References: Message-ID: > Please review these missing updates to the `jcmd` manpage - see JBS issue for details. > > I also fixed the sub-command ordering in a few places and a couple of minor formatting fixes.
> > Thanks to @tstuefe for collating the initial information on what was missing. I used the `jcmd help ` option to compare the help text with the manpage text and adjusted the manpage accordingly (and found a few more omissions - some of which will be fixed by follow-up RFEs by the respective owners). David Holmes has updated the pull request incrementally with one additional commit since the last revision: Fixed Compiler.directives_add ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17213/files - new: https://git.openjdk.org/jdk/pull/17213/files/39cbf52a..6ac454de Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17213&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17213&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17213.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17213/head:pull/17213 PR: https://git.openjdk.org/jdk/pull/17213 From dholmes at openjdk.org Wed Jan 3 07:19:05 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 3 Jan 2024 07:19:05 GMT Subject: RFR: 8319948: jcmd man page needs to be updated [v3] In-Reply-To: References: Message-ID: On Wed, 3 Jan 2024 07:11:02 GMT, Alan Bateman wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Consistency pass to match manpage text with help text for VM subcommands > > src/jdk.jcmd/share/man/jcmd.1 line 199: > >> 197: .TP >> 198: \f[V]Compiler.directives_add\f[R] \f[I]filename\f[R] \f[I]arguments\f[R] >> 199: Adds compiler directives from a file. > > I might be misreading this one: it looks like there are two arguments but the help output shows just one; maybe check that one. Thanks for catching that!
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17213#discussion_r1440136676 From alanb at openjdk.org Wed Jan 3 07:23:38 2024 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 3 Jan 2024 07:23:38 GMT Subject: RFR: 8319948: jcmd man page needs to be updated [v4] In-Reply-To: References: Message-ID: On Wed, 3 Jan 2024 07:19:03 GMT, David Holmes wrote: >> Please review these missing updates to the `jcmd` manpage - see JBS issue for details. >> >> I also fixed the sub-command ordering in a few places and a couple of minor formatting fixes. >> >> Thanks to @tstuefe for collating the initial information on what was missing. I used the `jcmd help ` option to compare the help text with the manpage text and adjusted the manpage accordingly (and found a few more omissions - some of which will be fixed by follow-up RFEs by the respective owners). > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fixed Compiler.directives_add Marked as reviewed by alanb (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17213#pullrequestreview-1801482143 From alanb at openjdk.org Wed Jan 3 07:23:41 2024 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 3 Jan 2024 07:23:41 GMT Subject: RFR: 8319948: jcmd man page needs to be updated [v3] In-Reply-To: References: Message-ID: On Wed, 3 Jan 2024 07:15:55 GMT, David Holmes wrote: >> src/jdk.jcmd/share/man/jcmd.1 line 199: >> >>> 197: .TP >>> 198: \f[V]Compiler.directives_add\f[R] \f[I]filename\f[R] \f[I]arguments\f[R] >>> 199: Adds compiler directives from a file. >> >> I might be misreading this one: it looks like there are two arguments but the help output shows just one; maybe check that one. > > Thanks for catching that! I didn't spot anything else, the update looks good to me.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17213#discussion_r1440139492 From dholmes at openjdk.org Wed Jan 3 07:30:47 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 3 Jan 2024 07:30:47 GMT Subject: RFR: 8319948: jcmd man page needs to be updated [v4] In-Reply-To: References: Message-ID: On Wed, 3 Jan 2024 07:21:36 GMT, Alan Bateman wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed Compiler.directives_add > > Marked as reviewed by alanb (Reviewer). Thanks for the review @AlanBateman ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17213#issuecomment-1874947985 From stefank at openjdk.org Wed Jan 3 07:55:12 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 Jan 2024 07:55:12 GMT Subject: RFR: 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder [v5] In-Reply-To: References: Message-ID: > [JDK-8315097](https://bugs.openjdk.org/browse/JDK-8315097): 'Rename createJavaProcessBuilder' changed the name of the ProcessTools helper functions used to create `ProcessBuilder`s used to spawn new java test processes. > > We now have `createTestJavaProcessBuilder` and `createLimitedTestJavaProcess`. The former prepends jvm options from jtreg, while the latter doesn't. > > With these functions it is common to see the following pattern in tests: > > ProcessBuilder pb = ProcessTools.createTestJavaProcessBuilder(...); > OutputAnalyzer output = executeProcess(pb); > > > We have a couple of thin wrapper in `ProcessTools` that does exactly this, so that the code can be written as a one-liner: > > OutputAnalyzer output = ProcessTools.executeTestJvm(); > > > I propose that we name this functions using the same naming scheme we used for `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. That is, we change `executeTestJvm` to `executeTestJava` and add a new `executeLimitedTestJava` function. 
Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into rename_executeTestJvm - Merge remote-tracking branch 'upstream/master' into rename_executeTestJvm - Test cleanup - Fix impl and add test - 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17049/files - new: https://git.openjdk.org/jdk/pull/17049/files/486dc6d5..755d925d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17049&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17049&range=03-04 Stats: 875 lines in 70 files changed: 577 ins; 58 del; 240 mod Patch: https://git.openjdk.org/jdk/pull/17049.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17049/head:pull/17049 PR: https://git.openjdk.org/jdk/pull/17049 From stefank at openjdk.org Wed Jan 3 08:55:54 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 Jan 2024 08:55:54 GMT Subject: Integrated: 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder In-Reply-To: References: Message-ID: <97p3loy_9ZZnMenWO0FMfeACOTWUjesg8dVD6fmYCzs=.160baa3e-3709-4ece-a3aa-986206b73148@github.com> On Mon, 11 Dec 2023 09:15:50 GMT, Stefan Karlsson wrote: > [JDK-8315097](https://bugs.openjdk.org/browse/JDK-8315097): 'Rename createJavaProcessBuilder' changed the name of the ProcessTools helper functions used to create `ProcessBuilder`s used to spawn new java test processes. > > We now have `createTestJavaProcessBuilder` and `createLimitedTestJavaProcess`. The former prepends jvm options from jtreg, while the latter doesn't. 
> > With these functions it is common to see the following pattern in tests: > > ProcessBuilder pb = ProcessTools.createTestJavaProcessBuilder(...); > OutputAnalyzer output = executeProcess(pb); > > > We have a couple of thin wrapper in `ProcessTools` that does exactly this, so that the code can be written as a one-liner: > > OutputAnalyzer output = ProcessTools.executeTestJvm(); > > > I propose that we name this functions using the same naming scheme we used for `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. That is, we change `executeTestJvm` to `executeTestJava` and add a new `executeLimitedTestJava` function. This pull request has now been integrated. Changeset: cbe329b9 Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/cbe329b90ac1488836d4852fead79aa26c082114 Stats: 262 lines in 89 files changed: 73 ins; 1 del; 188 mod 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder Reviewed-by: lkorinth, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/17049 From epeter at openjdk.org Wed Jan 3 09:11:54 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 3 Jan 2024 09:11:54 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v11] In-Reply-To: References: Message-ID: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. 
> > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Complications with ttyl** > There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - Merge branch 'master' into JDK-8306767 - removed some ttyl cases, which collided with the extra_data_lock - remove more locking - fix conflicts with tty lock - move a lock to earlier, to have order right with tty lock - missed a case where I need to lock - make lock not safepointing - manual merge with master after JDK-8267532 - more locking, still fails tho - WIP - adding more verification and more locking, WIP - ... 
and 2 more: https://git.openjdk.org/jdk/compare/cbe329b9...2bda9d7c ------------- Changes: https://git.openjdk.org/jdk/pull/16840/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=10 Stats: 222 lines in 22 files changed: 141 ins; 14 del; 67 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From epeter at openjdk.org Wed Jan 3 09:18:59 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 3 Jan 2024 09:18:59 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v12] In-Reply-To: References: Message-ID: <7R5InxbS8QqBuThh_Jlp_nyfOlagULxUL0l-9GgxFLU=.d89478a3-2550-4a4c-a8ec-04950a0e03a5@github.com> > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Complications with ttyl** > There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. 
Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: 2024 copyright ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/2bda9d7c..68ca3bd4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=10-11 Stats: 21 lines in 21 files changed: 0 ins; 0 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From epeter at openjdk.org Wed Jan 3 10:07:06 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 3 Jan 2024 10:07:06 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v13] In-Reply-To: References: Message-ID: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. 
> > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a Java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call sites of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Complications with ttyl** > There were a few places in the printing code where we did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high rank. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread could insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress.
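For readers who want to see the buffering idea from the description in isolation, here is a minimal standalone C++ sketch. It is not HotSpot code: `data_lock`, `output_lock`, `output` and `print_report` are hypothetical stand-ins (for `extra_data_lock`, the tty lock, `tty`, and a printing routine), and `std::ostringstream` plays the role of `stringStream`. Taking a lock around the final write is an assumption of this sketch; the quoted description does not say how the single write is protected.

```cpp
#include <cassert>
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical stand-ins for the locks and the output stream discussed above.
std::mutex data_lock;    // plays the role of extra_data_lock
std::mutex output_lock;  // plays the role of the tty lock (assumption)
std::string output;      // plays the role of tty

// Format all lines into a private buffer while holding only the data lock,
// then emit the buffer in one write while holding only the output lock.
// The two locks are never held at the same time, so no lock-rank ordering
// between them is needed, and the lines cannot be interleaved with other
// output that goes through output_lock.
void print_report(int first, int second) {
  std::ostringstream buffer;
  {
    std::lock_guard<std::mutex> guard(data_lock);
    buffer << "entry " << first << "\n";
    buffer << "entry " << second << "\n";
  }  // data_lock is released here, before output_lock is taken
  std::lock_guard<std::mutex> guard(output_lock);
  output += buffer.str();
}
```

The design point is only the ordering: data is read under one lock, and the already formatted text is written out in a single step afterwards.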
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: remove override marking, so I don not have to add it everywhere because of clang ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/68ca3bd4..1c19953a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=11-12 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From shade at openjdk.org Wed Jan 3 11:59:48 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 3 Jan 2024 11:59:48 GMT Subject: RFR: 8322806: Eliminate -Wparentheses warnings in aarch64 code In-Reply-To: References: Message-ID: On Tue, 2 Jan 2024 02:49:09 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: mach5 tier1 > > Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses > and other changes needed to make that work. src/hotspot/cpu/aarch64/frame_aarch64.inline.hpp line 198: > 196: && fp() == other.fp() > 197: && pc() == other.pc(); > 198: assert(!ret || (ret && cb() == other.cb() && _deopt_state == other._deopt_state), "inconsistent construction"); Well... Since `||` is short-cutting, then on the right side of `||`, we can be sure that `!ret` was `false` (shortcut not taken), which means `ret` was `true`, which means `ret &&` is redundant? 
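The redundancy argument above can be checked mechanically. The following is a small standalone sketch, not code from the patch: `with_redundant_term`, `simplified` and `forms_agree` are hypothetical helper names, and `cond` stands in for `cb() == other.cb() && _deopt_state == other._deopt_state`. It compares the two forms over all four input combinations at compile time.

```cpp
#include <cassert>

// Shape of the condition in the assert under review: !ret || (ret && cond)
constexpr bool with_redundant_term(bool ret, bool cond) {
  return !ret || (ret && cond);
}

// Shape after dropping the repeated `ret &&`: !ret || cond
constexpr bool simplified(bool ret, bool cond) {
  return !ret || cond;
}

// Exhaustive comparison over the full truth table (hypothetical helper).
constexpr bool forms_agree() {
  return with_redundant_term(false, false) == simplified(false, false)
      && with_redundant_term(false, true)  == simplified(false, true)
      && with_redundant_term(true,  false) == simplified(true,  false)
      && with_redundant_term(true,  true)  == simplified(true,  true);
}

static_assert(forms_agree(), "the two forms differ for some input");
```

Since the two forms agree on every input, dropping `ret &&` changes neither the result nor, thanks to short-circuit evaluation, which operands get evaluated.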
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17210#discussion_r1440371143 From dholmes at openjdk.org Wed Jan 3 12:22:37 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 3 Jan 2024 12:22:37 GMT Subject: RFR: 8322920: Some ProcessTools.execute* functions are declared to throw Throwable In-Reply-To: References: Message-ID: On Wed, 3 Jan 2024 09:51:24 GMT, Stefan Karlsson wrote: > Most functions in ProcessTools are declared to throw Exceptions, or one of its subclasses. However, there are a small number of functions that are unnecessarily declared to throw Throwable instead of Exception. I propose that we change them to also be declared to throw Exception. > > This is a trivial patch to make it easier to refactor tests to use the updated functions. > > Tested manually, but will wait for GHA to verify that the change is OK. Seems fine and trivial. As long as this compiles it is correct. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17240#pullrequestreview-1801895740 From shade at openjdk.org Wed Jan 3 12:47:59 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 3 Jan 2024 12:47:59 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v3] In-Reply-To: References: Message-ID: > [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. > > The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. 
> > Additional testing: > - [x] Large build matrix of server/zero builds > - [x] Linux AArch64 server fastdebug, `tier{1,2}` > - [x] Linux x86_64 server fastdebug, `tier{1,2}` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: - Merge branch 'master' into JDK-8237842-cache-line-padding-defs - Better verbiage for *2 adjustment for x86_64 - Merge branch 'master' into JDK-8237842-cache-line-padding-defs - Work ------------- Changes: https://git.openjdk.org/jdk/pull/16973/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16973&range=02 Stats: 99 lines in 25 files changed: 27 ins; 15 del; 57 mod Patch: https://git.openjdk.org/jdk/pull/16973.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16973/head:pull/16973 PR: https://git.openjdk.org/jdk/pull/16973 From shade at openjdk.org Wed Jan 3 12:48:02 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 3 Jan 2024 12:48:02 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 10:27:58 GMT, Aleksey Shipilev wrote: >> [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. >> >> The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. 
>> >> Additional testing: >> - [x] Large build matrix of server/zero builds >> - [x] Linux AArch64 server fastdebug, `tier{1,2}` >> - [x] Linux x86_64 server fastdebug, `tier{1,2}` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Better verbiage for *2 adjustment for x86_64 > - Merge branch 'master' into JDK-8237842-cache-line-padding-defs > - Work I had to resolve some merge conflicts in `g1ConcurrentMark.hpp` due to recent changes. I am planning to integrate this after GHA turn back green. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1875318639 From stefank at openjdk.org Wed Jan 3 13:26:48 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 Jan 2024 13:26:48 GMT Subject: RFR: 8322806: Eliminate -Wparentheses warnings in aarch64 code In-Reply-To: References: Message-ID: <8vCqhYS1D3S--iJxU6ElX7UdLBNc2Xyy9Y-sLLwt2SY=.31ec51fe-6299-41f7-a4d2-cfcc512c1e2f@github.com> On Wed, 3 Jan 2024 11:56:45 GMT, Aleksey Shipilev wrote: >> Please review this change to eliminate some -Wparentheses warnings. This >> involved simply adding a few parentheses to make some implicit operator >> precedence explicit. >> >> Testing: mach5 tier1 >> >> Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses >> and other changes needed to make that work. > > src/hotspot/cpu/aarch64/frame_aarch64.inline.hpp line 198: > >> 196: && fp() == other.fp() >> 197: && pc() == other.pc(); >> 198: assert(!ret || (ret && cb() == other.cb() && _deopt_state == other._deopt_state), "inconsistent construction"); > > Well... Since `||` is short-cutting, then on the right side of `||`, we can be sure that `!ret` was `false` (shortcut not taken), which means `ret` was `true`, which means `ret &&` is redundant? 
I agree: https://github.com/openjdk/jdk/pull/17210#pullrequestreview-1800130145 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17210#discussion_r1440448619 From duke at openjdk.org Wed Jan 3 13:57:59 2024 From: duke at openjdk.org (Thomas Wuerthinger) Date: Wed, 3 Jan 2024 13:57:59 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v8] In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 10:40:23 GMT, Serguei Spitsyn wrote: >>> You can't do this! The Java code knows nothing about JVM TI being enabled/disabled and will call this function unconditionally. >> >> Indeed. I wonder if anyone is testing minimal builds to catch issues like this. > > Good catch, David! > Filed a cleanup bug: https://bugs.openjdk.org/browse/JDK-8322538 Are these new compiler intrinsics required or an optional performance optimization? This PR creates issues for us when updating the JDK build for Graal. CC @davleopo ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1440478960 From alanb at openjdk.org Wed Jan 3 14:06:53 2024 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 3 Jan 2024 14:06:53 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v8] In-Reply-To: References: Message-ID: On Wed, 3 Jan 2024 13:55:24 GMT, Thomas Wuerthinger wrote: > Are these new compiler intrinsics required or an optional performance optimization? Performance. If the intrinsic isn't there then some methods executed on virtual threads, or on a virtual thread as the target for some op, will have to call into the VM. The main concern was Thread.interrupted() as it gets called very frequently in locking and concurrency code. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1440487675 From epeter at openjdk.org Wed Jan 3 14:12:15 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 3 Jan 2024 14:12:15 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v14] In-Reply-To: References: Message-ID: <4dd_LTPDH-entBfAAyf5XzGE6sI4C0uqwP5zporK__I=.9793910d-a67f-4050-b789-49f4ddd3cae1@github.com> > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Complications with ttyl** > There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. 
To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: jfr case with missing lock ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/1c19953a..6647e11e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=12-13 Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From luhenry at openjdk.org Wed Jan 3 17:16:37 2024 From: luhenry at openjdk.org (Ludovic Henry) Date: Wed, 3 Jan 2024 17:16:37 GMT Subject: RFR: 8322817: RISC-V: Eliminate -Wparentheses warnings in riscv code In-Reply-To: References: Message-ID: <5zFH1FuOfJC2VUtDtmKASW47r2479BvGyODj8c1ntF4=.c0922330-66c9-44aa-b9fc-b0bf60da8657@github.com> On Tue, 2 Jan 2024 07:55:59 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: Local (linux-x64) cross-build for linux-riscv with this change plus > -Wparentheses enabled and other changes to allow that to work. > > Requesting someone from the riscv porters to properly test this. Marked as reviewed by luhenry (Committer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/17216#pullrequestreview-1802748825 From fparain at openjdk.org Wed Jan 3 20:08:28 2024 From: fparain at openjdk.org (Frederic Parain) Date: Wed, 3 Jan 2024 20:08:28 GMT Subject: RFR: 8320276: Improve class initialization barrier in TemplateTable::_new [v4] In-Reply-To: References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> Message-ID: On Fri, 22 Dec 2023 05:08:19 GMT, Matias Saavedra Silva wrote: >> The class initialization barrier in TemplateTable::_new fast path check ensures that the class being instantiated is fully initialized. It can be improved by introducing additional fast path check when current thread is initializer thread as MacroAssembler::clinit_barrier() does. It avoids repeated calls into interpreter runtime for classes being initialized. This patch adds the optimization for x86 and aarch64. Verified with tier 1-5 tests. >> >> This change was tested with Spring Petclinic which reported the following startup times: >> >> Clean build: #### Booted and returned in 161941ms >> Patched build: #### Booted and returned in 160657ms > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Added comment to remaining platforms LGTM ------------- Marked as reviewed by fparain (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17006#pullrequestreview-1802997314 From matsaave at openjdk.org Wed Jan 3 20:13:41 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 3 Jan 2024 20:13:41 GMT Subject: RFR: 8320276: Improve class initialization barrier in TemplateTable::_new [v4] In-Reply-To: <8tff1HjtQpBC-XjBpbp05a0gVVsAxc90dT9fPTHAPfE=.33c0d809-919c-4ae7-9eae-1e740683503d@github.com> References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> <8tff1HjtQpBC-XjBpbp05a0gVVsAxc90dT9fPTHAPfE=.33c0d809-919c-4ae7-9eae-1e740683503d@github.com> Message-ID: On Fri, 22 Dec 2023 06:36:55 GMT, David Holmes wrote: >> Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: >> >> Added comment to remaining platforms > > Thanks Thank you for the reviews @dholmes-ora and @fparain! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17006#issuecomment-1875909210 From matsaave at openjdk.org Wed Jan 3 20:13:41 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 3 Jan 2024 20:13:41 GMT Subject: Integrated: 8320276: Improve class initialization barrier in TemplateTable::_new In-Reply-To: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> Message-ID: On Wed, 6 Dec 2023 22:02:19 GMT, Matias Saavedra Silva wrote: > The class initialization barrier in TemplateTable::_new fast path check ensures that the class being instantiated is fully initialized. It can be improved by introducing additional fast path check when current thread is initializer thread as MacroAssembler::clinit_barrier() does. It avoids repeated calls into interpreter runtime for classes being initialized. This patch adds the optimization for x86 and aarch64. Verified with tier 1-5 tests. 
> > This change was tested with Spring Petclinic which reported the following startup times: > > Clean build: #### Booted and returned in 161941ms > Patched build: #### Booted and returned in 160657ms This pull request has now been integrated. Changeset: 409a39ec Author: Matias Saavedra Silva URL: https://git.openjdk.org/jdk/commit/409a39ec8da83d6a0895e7e213604455ebf50485 Stats: 16 lines in 6 files changed: 6 ins; 2 del; 8 mod 8320276: Improve class initialization barrier in TemplateTable::_new Reviewed-by: dholmes, fparain ------------- PR: https://git.openjdk.org/jdk/pull/17006 From dholmes at openjdk.org Wed Jan 3 22:33:30 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 3 Jan 2024 22:33:30 GMT Subject: Integrated: 8319948: jcmd man page needs to be updated In-Reply-To: References: Message-ID: On Tue, 2 Jan 2024 05:51:59 GMT, David Holmes wrote: > Please review these missing updates to the `jcmd` manpage - see JBS issue for details. > > I also fixed the sub-command ordering in a few places and a couple of minor formatting fixes. > > Thanks to @tstuefe for collating the initial information on what was missing. I used to `jcmd help ` option to compare the help text with the manpage text and adjusted the manpage accordingly (and found a few more omissions - some of which will be fixed by follow up RFE by the respective owners). This pull request has now been integrated. 
Changeset: 028ec7e7 Author: David Holmes URL: https://git.openjdk.org/jdk/commit/028ec7e744f06cd8429b7b74d7b6f7020133aa94 Stats: 269 lines in 1 file changed: 171 ins; 52 del; 46 mod 8319948: jcmd man page needs to be updated Co-authored-by: Thomas Stuefe Reviewed-by: kevinw, alanb ------------- PR: https://git.openjdk.org/jdk/pull/17213 From ddong at openjdk.org Thu Jan 4 03:22:32 2024 From: ddong at openjdk.org (Denghui Dong) Date: Thu, 4 Jan 2024 03:22:32 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v3] In-Reply-To: <8GGPQMjfU6YWa1i0yjk7SvrJ-lnZu6TxG8zPcbWN3jE=.1a4bb16e-dfc6-46ed-84e1-f2ed3d911699@github.com> References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> <8GGPQMjfU6YWa1i0yjk7SvrJ-lnZu6TxG8zPcbWN3jE=.1a4bb16e-dfc6-46ed-84e1-f2ed3d911699@github.com> Message-ID: On Sat, 16 Dec 2023 04:36:57 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this patch? >> >> In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full. >> >> This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness. >> >> Best, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > refine description Thanks for the review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16976#issuecomment-1876254452 From ddong at openjdk.org Thu Jan 4 03:22:34 2024 From: ddong at openjdk.org (Denghui Dong) Date: Thu, 4 Jan 2024 03:22:34 GMT Subject: Integrated: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC In-Reply-To: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> Message-ID: On Tue, 5 Dec 2023 16:31:24 GMT, Denghui Dong wrote: > Hi, > > Could I have a review of this patch? > > In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full. > > This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness. > > Best, > Denghui This pull request has now been integrated. Changeset: 1cf9335b Author: Denghui Dong URL: https://git.openjdk.org/jdk/commit/1cf9335b24639938aa64250d6862d9636f8605f8 Stats: 57 lines in 3 files changed: 53 ins; 0 del; 4 mod 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC Reviewed-by: dholmes, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/16976 From fyang at openjdk.org Thu Jan 4 03:25:27 2024 From: fyang at openjdk.org (Fei Yang) Date: Thu, 4 Jan 2024 03:25:27 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v10] In-Reply-To: <-iUbCXbfByEaraLzkSAK34_EewpImIUYRPLztruHOv0=.a6b5cab8-70ab-4bc5-8271-9e64232acfcf@github.com> References: <-iUbCXbfByEaraLzkSAK34_EewpImIUYRPLztruHOv0=.a6b5cab8-70ab-4bc5-8271-9e64232acfcf@github.com> Message-ID: On Fri, 22 Dec 2023 14:10:13 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. 
>> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > fixed lmul src/hotspot/cpu/riscv/vm_version_riscv.cpp line 169: > 167: if (UseRVV) { > 168: if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { > 169: FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); As the code comment in shared code indicates, there should be a dependency between these flags and `UseSHA`. But seems we are still lacking the necessary logic for that and `UseSHA` is always false [2]. We might need similar handling like our aarch64 counterpart [3]. [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/globals.hpp#L342 [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/vm_version_riscv.cpp#L151 [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp#L323-L376 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1441242041 From cslucas at openjdk.org Thu Jan 4 04:00:08 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 4 Jan 2024 04:00:08 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v7] In-Reply-To: References: Message-ID: <9dfnMnNzjdygds7D2lnexBhUSnFGN60q6xJdRWhbHWs=.6a9148ce-4929-4af5-b9d4-46c5c81bc1c7@github.com> > # Description > > Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. > > Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. > > # Help Needed for Testing > > I don't have access to all platforms necessary to test this. 
I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. > > # Tier-1 Testing status > > | | Win | Mac | Linux | > |----------|---------|---------|---------| > | ARM64 | ? | ? | | > | ARM32 | n/a | n/a | | > | x86 | | | ? | > | x64 | ? | ? | ? | > | PPC64 | n/a | n/a | | > | S390x | n/a | n/a | | > | RiscV | n/a | n/a | ? | Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: - Catch up with origin/master - Merge with origin/master - Fix build, copyright dates, m4 files. - Fix merge - Catch up with master branch. Merge remote-tracking branch 'origin/master' into reuse-macroasm - Some inst_mark fixes; Catch up with master. - Catch up with changes on master - Reuse same C2_MacroAssembler object to emit instructions. ------------- Changes: https://git.openjdk.org/jdk/pull/16484/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=06 Stats: 2446 lines in 61 files changed: 106 ins; 434 del; 1906 mod Patch: https://git.openjdk.org/jdk/pull/16484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16484/head:pull/16484 PR: https://git.openjdk.org/jdk/pull/16484 From dholmes at openjdk.org Thu Jan 4 06:04:28 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 4 Jan 2024 06:04:28 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v3] In-Reply-To: References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> <8GGPQMjfU6YWa1i0yjk7SvrJ-lnZu6TxG8zPcbWN3jE=.1a4bb16e-dfc6-46ed-84e1-f2ed3d911699@github.com> Message-ID: On Thu, 4 Jan 2024 03:19:17 GMT, Denghui Dong wrote: >> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: >> >> refine description > > Thanks for the review. 
@D-D-H the new test is failing in our CI in tier3 on all platforms: Error occurred during initialization of VM Multiple garbage collectors selected ------------- PR Comment: https://git.openjdk.org/jdk/pull/16976#issuecomment-1876414177 From dholmes at openjdk.org Thu Jan 4 06:13:38 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 4 Jan 2024 06:13:38 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v3] In-Reply-To: <8GGPQMjfU6YWa1i0yjk7SvrJ-lnZu6TxG8zPcbWN3jE=.1a4bb16e-dfc6-46ed-84e1-f2ed3d911699@github.com> References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> <8GGPQMjfU6YWa1i0yjk7SvrJ-lnZu6TxG8zPcbWN3jE=.1a4bb16e-dfc6-46ed-84e1-f2ed3d911699@github.com> Message-ID: On Sat, 16 Dec 2023 04:36:57 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this patch? >> >> In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full. >> >> This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness. 
>> >> Best, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > refine description Filed JDK-8322989 ------------- PR Comment: https://git.openjdk.org/jdk/pull/16976#issuecomment-1876432026 From ddong at openjdk.org Thu Jan 4 07:32:32 2024 From: ddong at openjdk.org (Denghui Dong) Date: Thu, 4 Jan 2024 07:32:32 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v3] In-Reply-To: References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> <8GGPQMjfU6YWa1i0yjk7SvrJ-lnZu6TxG8zPcbWN3jE=.1a4bb16e-dfc6-46ed-84e1-f2ed3d911699@github.com> Message-ID: On Thu, 4 Jan 2024 03:19:17 GMT, Denghui Dong wrote: >> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: >> >> refine description > > Thanks for the review. > @D-D-H the new test is failing in our CI in tier3 on all platforms: > > Error occurred during initialization of VM Multiple garbage collectors selected Sorry for that, I'll take a look at it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16976#issuecomment-1876600685 From fjiang at openjdk.org Thu Jan 4 07:42:21 2024 From: fjiang at openjdk.org (Feilong Jiang) Date: Thu, 4 Jan 2024 07:42:21 GMT Subject: RFR: 8322583: RISC-V: Enable fast class initialization checks In-Reply-To: References: Message-ID: On Tue, 26 Dec 2023 07:14:22 GMT, Gui Cao wrote: > Hi, Please review this small change enabling fast class initialization checks on linux-riscv. As discussed on [1], we noticed that VM_Version::supports_fast_class_init_checks is false on linux-riscv64 platform. But the needed code of this optimization on this platform is there which is supposed to solve a performance issue https://bugs.openjdk.org/browse/JDK-8219233 at that time. I found that this performance issue is still reproduciable on linux-riscv64. 
And the original performance issue reported by JDK-8219233 is resolved when this optimization is enabled (x2.5 improvement for reported case on linux-riscv64). > > Once this is in, I'd like to request approval for backporting to JDK 21u and JDK 17u since corresponding fixes for the other ports are already there, unless the community feels otherwise. > > [1] https://github.com/openjdk/jdk/pull/17006#issuecomment-1865582796 > > OpenJDK23 linux-riscv64 when setting VM_Version::supports_fast_class_init_checks returns false(default): > > > $ clojure -A:user > "Elapsed time: 91597.717028 msecs" > $ clojure -A:user > "Elapsed time: 91851.01435 msecs" > $ clojure -A:user > "Elapsed time: 92149.106378 msecs" > $ clojure foo.clj > "Elapsed time: 35663.36249 msecs" > $ clojure foo.clj > "Elapsed time: 35677.387338 msecs" > $ clojure foo.clj > "Elapsed time: 35253.330701 msecs" > > > OpenJDK23 linux-riscv64 when setting VM_Version::supports_fast_class_init_checks returns true: > > > $ clojure -A:user > "Elapsed time: 37425.063258 msecs" > $ clojure -A:user > "Elapsed time: 36338.252422 msecs" > $ clojure -A:user > "Elapsed time: 37441.73001 msecs" > $ clojure foo.clj > "Elapsed time: 36018.857489 msecs" > $ clojure foo.clj > "Elapsed time: 34750.533297 msecs" > $ clojure foo.clj > "Elapsed time: 35499.890121 msecs" > > > ### Testing: > > - [x] Run tier1-3 tests on qemu 8.1.0 with UseRVV (fastdebug) > - [x] Run tier1-3 tests on qemu 8.1.0 with UseRVV (release) > - [x] Run tier1-3, hotspot:tier4 tests with SiFive unmatched (release) Looks good, thanks! ------------- Marked as reviewed by fjiang (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/17192#pullrequestreview-1803633662 From shade at openjdk.org Thu Jan 4 08:42:23 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 4 Jan 2024 08:42:23 GMT Subject: Integrated: 8237842: Separate definitions for default cache line and padding sizes In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 11:23:31 GMT, Aleksey Shipilev wrote: > [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. > > The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. > > Additional testing: > - [x] Large build matrix of server/zero builds > - [x] Linux AArch64 server fastdebug, `tier{1,2}` > - [x] Linux x86_64 server fastdebug, `tier{1,2}` This pull request has now been integrated. 
Changeset: dd517c64 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/dd517c64047705d706b095d15d9fd4e0703ab39b Stats: 99 lines in 25 files changed: 27 ins; 15 del; 57 mod 8237842: Separate definitions for default cache line and padding sizes Reviewed-by: stefank, kvn, stuefe, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/16973 From epeter at openjdk.org Thu Jan 4 10:35:23 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 4 Jan 2024 10:35:23 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v14] In-Reply-To: <4dd_LTPDH-entBfAAyf5XzGE6sI4C0uqwP5zporK__I=.9793910d-a67f-4050-b789-49f4ddd3cae1@github.com> References: <4dd_LTPDH-entBfAAyf5XzGE6sI4C0uqwP5zporK__I=.9793910d-a67f-4050-b789-49f4ddd3cae1@github.com> Message-ID: On Wed, 3 Jan 2024 14:12:15 GMT, Emanuel Peter wrote: >> As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. >> >> I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. >> >> I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. >> >> I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. >> >> **Complications with ttyl** >> There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. 
Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. >> >> If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. >> >> **Testing** >> Testing: tier1-3 and stress. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > jfr case with missing lock Just fixed an issue from higher tier testing. And I ran performance testing, I think there is no significant difference, so we can go with the simple locking approach that I have now taken. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16840#issuecomment-1876865555 From bulasevich at openjdk.org Thu Jan 4 11:02:33 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Thu, 4 Jan 2024 11:02:33 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments Message-ID: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> The change simplifies the CodeCache::initialize_heaps segment memory split logic to make it easier to add new segments to the code cache: if (!non_nmethod_set && !profiled_set && !non_profiled_set) { ... } else if (!non_nmethod_set || !profiled_set || !non_profiled_set) { if (non_profiled_set) { if (!profiled_set) { ... } } else if (profiled_set) { ... } else if (non_nmethod_set) { ... } } The existing layout is retained. 
With this change, PrintFlagsFinal always shows the actual segment sizes (not an intermediate value before alignment), and the segments always completely fill the ReservedCodeCacheSize (no wasted page due to final down alignment). ------------- Commit messages: - test update - cleanup2 - cleanup1 - 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments Changes: https://git.openjdk.org/jdk/pull/17244/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311248 Stats: 361 lines in 5 files changed: 213 ins; 103 del; 45 mod Patch: https://git.openjdk.org/jdk/pull/17244.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17244/head:pull/17244 PR: https://git.openjdk.org/jdk/pull/17244 From epeter at openjdk.org Thu Jan 4 12:30:51 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 4 Jan 2024 12:30:51 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v15] In-Reply-To: References: Message-ID: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. 
> > **Complications with ttyl** > There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: refactor MethodData::bci_to_extra_data - remove redundant code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/6647e11e..b0ff5d18 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=13-14 Stats: 26 lines in 2 files changed: 0 ins; 17 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From epeter at openjdk.org Thu Jan 4 12:52:52 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 4 Jan 2024 12:52:52 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v16] In-Reply-To: References: Message-ID: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should 
happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Complications with ttyl** > There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress. 
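The buffering approach described above can be sketched outside of HotSpot as follows. This is only an illustration under stated assumptions: `data_lock` and `out_lock` are hypothetical stand-ins for `extra_data_lock` and the tty lock, `std::ostringstream` stands in for `stringStream`, and taking `out_lock` for the final write is an assumption of the sketch, not something the patch description states. The point it shows is that the lines are formatted under the data lock only, and the finished text is emitted in a single write, so the two locks are never nested.

```cpp
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-ins, not HotSpot code: data_lock plays the role of
// extra_data_lock, out_lock plays the role of the tty lock.
static std::mutex data_lock;
static std::mutex out_lock;

// Format all lines into a private buffer while holding only the data lock.
std::string format_report(int entries) {
  std::ostringstream buf;
  std::lock_guard<std::mutex> guard(data_lock);
  for (int i = 0; i < entries; i++) {
    buf << "entry " << i << '\n';
  }
  return buf.str();
}

// Emit the finished buffer in one write. The data lock has already been
// released here, so the two locks are never held at the same time.
void print_report(std::ostream& out, int entries) {
  const std::string text = format_report(entries);
  std::lock_guard<std::mutex> guard(out_lock);
  out << text;
}
```

Because the output is written as one string, another thread cannot interleave its own lines between the entries, which is the property the `ttyLocker` scope provided before.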
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: fixed typo ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/b0ff5d18..e1e91741 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=14-15 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From dchuyko at openjdk.org Thu Jan 4 13:26:33 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 4 Jan 2024 13:26:33 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v20] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. 
Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it while taking the new directives stack into account. Re-compilation helps to prevent hot methods from executing in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and queues methods that have any active non-default matching compiler directives for re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction as to which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 38 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Deopt osr, cleanups - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - ... and 28 more: https://git.openjdk.org/jdk/compare/27d5f5c2...691ea329 ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=19 Stats: 381 lines in 15 files changed: 348 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From avoitylov at openjdk.org Thu Jan 4 18:22:30 2024 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Thu, 4 Jan 2024 18:22:30 GMT Subject: [jdk22] Integrated: 8321515: ARM32: Move method resolution information out of the cpCache properly In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 11:54:53 GMT, Aleksei Voitylov wrote: > Hi all, > > This pull request contains a backport of commit [f573f6d2](https://github.com/openjdk/jdk/commit/f573f6d233d5ea1657018c3c806fee0fac382ac3) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Aleksei Voitylov on 13 Dec 2023 and was reviewed by Aleksey Shipilev. > > Thanks! This pull request has now been integrated. 
Changeset: a6e35650 Author: Aleksei Voitylov Committer: Dmitry Chuyko URL: https://git.openjdk.org/jdk22/commit/a6e35650f9a643d9a1dabda93656a74fa49cf8dd Stats: 33 lines in 3 files changed: 14 ins; 4 del; 15 mod 8321515: ARM32: Move method resolution information out of the cpCache properly Reviewed-by: shade Backport-of: f573f6d233d5ea1657018c3c806fee0fac382ac3 ------------- PR: https://git.openjdk.org/jdk22/pull/17 From pchilanomate at openjdk.org Thu Jan 4 19:43:34 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 4 Jan 2024 19:43:34 GMT Subject: [jdk22] RFR: 8320275: assert(_chunk->bitmap().at(index)) failed: Bit not set at index Message-ID: Hi all, This pull request contains a backport of commit [e9e694f4](https://github.com/openjdk/jdk/commit/e9e694f4ef7b080d7fe1ad5b2f2daa2fccd0456e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Patricio Chilano Mateo on 2 Jan 2024 and was reviewed by Dean Long and Frederic Parain. Thanks! 
------------- Commit messages: - Backport e9e694f4ef7b080d7fe1ad5b2f2daa2fccd0456e Changes: https://git.openjdk.org/jdk22/pull/29/files Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=29&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8320275 Stats: 81 lines in 17 files changed: 38 ins; 9 del; 34 mod Patch: https://git.openjdk.org/jdk22/pull/29.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/29/head:pull/29 PR: https://git.openjdk.org/jdk22/pull/29 From dholmes at openjdk.org Thu Jan 4 22:44:25 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 4 Jan 2024 22:44:25 GMT Subject: [jdk22] RFR: 8320275: assert(_chunk->bitmap().at(index)) failed: Bit not set at index In-Reply-To: References: Message-ID: <19BvBk3HH83WmxO7f1zRZkYgEXz_9k8C6fX_NnWgqkA=.2c35e2fb-2cdc-4c98-aca6-7c438f85a529@github.com> On Thu, 4 Jan 2024 15:22:16 GMT, Patricio Chilano Mateo wrote: > Hi all, > > This pull request contains a backport of commit [e9e694f4](https://github.com/openjdk/jdk/commit/e9e694f4ef7b080d7fe1ad5b2f2daa2fccd0456e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Patricio Chilano Mateo on 2 Jan 2024 and was reviewed by Dean Long and Frederic Parain. > > Thanks! Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk22/pull/29#pullrequestreview-1805076197 From vlivanov at openjdk.org Thu Jan 4 23:50:29 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Thu, 4 Jan 2024 23:50:29 GMT Subject: RFR: 8320276: Improve class initialization barrier in TemplateTable::_new [v4] In-Reply-To: References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> Message-ID: On Fri, 22 Dec 2023 05:08:19 GMT, Matias Saavedra Silva wrote: >> The class initialization barrier in TemplateTable::_new fast path check ensures that the class being instantiated is fully initialized. 
It can be improved by introducing additional fast path check when current thread is initializer thread as MacroAssembler::clinit_barrier() does. It avoids repeated calls into interpreter runtime for classes being initialized. This patch adds the optimization for x86 and aarch64. Verified with tier 1-5 tests. >> >> This change was tested with Spring Petclinic which reported the following startup times: >> >> Clean build: #### Booted and returned in 161941ms >> Patched build: #### Booted and returned in 160657ms > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Added comment to remaining platforms Thanks for taking care of the enhancement, Matias! I'm late to the party, but have one suggestion for a future cleanup. src/hotspot/cpu/aarch64/templateTable_aarch64.cpp line 3607: > 3605: > 3606: // make sure klass is initialized > 3607: assert(VM_Version::supports_fast_class_init_checks(), "Optimization requires support for fast class initialization checks"); A better place to put the assert would be on callee side (in `MacroAssembler::clinit_barrier()`). ------------- PR Review: https://git.openjdk.org/jdk/pull/17006#pullrequestreview-1805159391 PR Review Comment: https://git.openjdk.org/jdk/pull/17006#discussion_r1442372252 From gcao at openjdk.org Fri Jan 5 03:47:22 2024 From: gcao at openjdk.org (Gui Cao) Date: Fri, 5 Jan 2024 03:47:22 GMT Subject: RFR: 8322583: RISC-V: Enable fast class initialization checks In-Reply-To: References: Message-ID: On Tue, 26 Dec 2023 07:14:22 GMT, Gui Cao wrote: > Hi, Please review this small change enabling fast class initialization checks on linux-riscv. As discussed on [1], we noticed that VM_Version::supports_fast_class_init_checks is false on linux-riscv64 platform. But the needed code of this optimization on this platform is there which is supposed to solve a performance issue https://bugs.openjdk.org/browse/JDK-8219233 at that time. 
I found that this performance issue is still reproducible on linux-riscv64. And the original performance issue reported by JDK-8219233 is resolved when this optimization is enabled (a 2.5x improvement for the reported case on linux-riscv64). > > Once this is in, I'd like to request approval for backporting to JDK 21u and JDK 17u since corresponding fixes for the other ports are already there, unless the community feels otherwise. > > [1] https://github.com/openjdk/jdk/pull/17006#issuecomment-1865582796 > > OpenJDK23 linux-riscv64 when setting VM_Version::supports_fast_class_init_checks returns false(default): > > > $ clojure -A:user > "Elapsed time: 91597.717028 msecs" > $ clojure -A:user > "Elapsed time: 91851.01435 msecs" > $ clojure -A:user > "Elapsed time: 92149.106378 msecs" > $ clojure foo.clj > "Elapsed time: 35663.36249 msecs" > $ clojure foo.clj > "Elapsed time: 35677.387338 msecs" > $ clojure foo.clj > "Elapsed time: 35253.330701 msecs" > > > OpenJDK23 linux-riscv64 when setting VM_Version::supports_fast_class_init_checks returns true: > > > $ clojure -A:user > "Elapsed time: 37425.063258 msecs" > $ clojure -A:user > "Elapsed time: 36338.252422 msecs" > $ clojure -A:user > "Elapsed time: 37441.73001 msecs" > $ clojure foo.clj > "Elapsed time: 36018.857489 msecs" > $ clojure foo.clj > "Elapsed time: 34750.533297 msecs" > $ clojure foo.clj > "Elapsed time: 35499.890121 msecs" > > > ### Testing: > > - [x] Run tier1-3 tests on qemu 8.1.0 with UseRVV (fastdebug) > - [x] Run tier1-3 tests on qemu 8.1.0 with UseRVV (release) > - [x] Run tier1-3, hotspot:tier4 tests with SiFive unmatched (release) Thanks all for the review.
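For readers who want to check the quoted improvement against the timings above, a small sketch of the arithmetic follows. The helper names `mean` and `speedup` are made up for this illustration; the numbers passed to them in the usage note are the `clojure -A:user` elapsed times copied from the message.

```cpp
#include <numeric>
#include <vector>

// Arithmetic mean of a set of elapsed times (msecs).
double mean(const std::vector<double>& xs) {
  return std::accumulate(xs.begin(), xs.end(), 0.0) /
         static_cast<double>(xs.size());
}

// Ratio of the mean time before the change to the mean time after it.
double speedup(const std::vector<double>& before,
               const std::vector<double>& after) {
  return mean(before) / mean(after);
}
```

Calling `speedup({91597.717028, 91851.01435, 92149.106378}, {37425.063258, 36338.252422, 37441.73001})` gives a ratio of roughly 2.5 for the `-A:user` runs, while the same calculation on the `foo.clj` timings comes out close to 1, i.e. essentially unchanged.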
------------- PR Comment: https://git.openjdk.org/jdk/pull/17192#issuecomment-1878082132 From gcao at openjdk.org Fri Jan 5 04:42:30 2024 From: gcao at openjdk.org (Gui Cao) Date: Fri, 5 Jan 2024 04:42:30 GMT Subject: Integrated: 8322583: RISC-V: Enable fast class initialization checks In-Reply-To: References: Message-ID: On Tue, 26 Dec 2023 07:14:22 GMT, Gui Cao wrote: > Hi, Please review this small change enabling fast class initialization checks on linux-riscv. As discussed on [1], we noticed that VM_Version::supports_fast_class_init_checks is false on linux-riscv64 platform. But the needed code of this optimization on this platform is there which is supposed to solve a performance issue https://bugs.openjdk.org/browse/JDK-8219233 at that time. I found that this performance issue is still reproducible on linux-riscv64. And the original performance issue reported by JDK-8219233 is resolved when this optimization is enabled (a 2.5x improvement for the reported case on linux-riscv64). > > Once this is in, I'd like to request approval for backporting to JDK 21u and JDK 17u since corresponding fixes for the other ports are already there, unless the community feels otherwise.
> > [1] https://github.com/openjdk/jdk/pull/17006#issuecomment-1865582796 > > OpenJDK23 linux-riscv64 when setting VM_Version::supports_fast_class_init_checks returns false(default): > > > $ clojure -A:user > "Elapsed time: 91597.717028 msecs" > $ clojure -A:user > "Elapsed time: 91851.01435 msecs" > $ clojure -A:user > "Elapsed time: 92149.106378 msecs" > $ clojure foo.clj > "Elapsed time: 35663.36249 msecs" > $ clojure foo.clj > "Elapsed time: 35677.387338 msecs" > $ clojure foo.clj > "Elapsed time: 35253.330701 msecs" > > > OpenJDK23 linux-riscv64 when setting VM_Version::supports_fast_class_init_checks returns true: > > > $ clojure -A:user > "Elapsed time: 37425.063258 msecs" > $ clojure -A:user > "Elapsed time: 36338.252422 msecs" > $ clojure -A:user > "Elapsed time: 37441.73001 msecs" > $ clojure foo.clj > "Elapsed time: 36018.857489 msecs" > $ clojure foo.clj > "Elapsed time: 34750.533297 msecs" > $ clojure foo.clj > "Elapsed time: 35499.890121 msecs" > > > ### Testing: > > - [x] Run tier1-3 tests on qemu 8.1.0 with UseRVV (fastdebug) > - [x] Run tier1-3 tests on qemu 8.1.0 with UseRVV (release) > - [x] Run tier1-3, hotspot:tier4 tests with SiFive unmatched (release) This pull request has now been integrated. 
Changeset: 5235cc98 Author: Gui Cao Committer: Fei Yang URL: https://git.openjdk.org/jdk/commit/5235cc987d8c4455622acda947bed7321086a385 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod 8322583: RISC-V: Enable fast class initialization checks Reviewed-by: rehn, fyang, fjiang ------------- PR: https://git.openjdk.org/jdk/pull/17192 From lmesnik at openjdk.org Fri Jan 5 07:00:20 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 5 Jan 2024 07:00:20 GMT Subject: RFR: 8322920: Some ProcessTools.execute* functions are declared to throw Throwable In-Reply-To: References: Message-ID: <4rCL43GABcMC2SMvhVGi_31M9mUeA6QODULxeAKSE-k=.3ef2cfab-1d34-4821-a27b-46a34e50d5cc@github.com> On Wed, 3 Jan 2024 09:51:24 GMT, Stefan Karlsson wrote: > Most functions in ProcessTools are declared to throw Exceptions, or one of its subclasses. However, there are a small number of functions that are unnecessarily declared to throw Throwable instead of Exception. I propose that we change them to also be declared to throw Exception. > > This is a trivial patch to make it easier to refactor tests to use the updated functions. > > Tested manually, but will wait for GHA to verify that the change is OK. You need to update copyrights. ------------- Marked as reviewed by lmesnik (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17240#pullrequestreview-1805436752 From stefank at openjdk.org Fri Jan 5 08:22:41 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 Jan 2024 08:22:41 GMT Subject: RFR: 8322920: Some ProcessTools.execute* functions are declared to throw Throwable [v2] In-Reply-To: References: Message-ID: <-LmjrWlv9MLQKE-D_pYAvoON1RcsV_nXZV0FSV9eU6I=.4c035aa5-eb06-45ba-94d4-c23a37e949f4@github.com> > Most functions in ProcessTools are declared to throw Exceptions, or one of its subclasses. However, there are a small number of functions that are unnecessarily declared to throw Throwable instead of Exception. 
I propose that we change them to also be declared to throw Exception. > > This is a trivial patch to make it easier to refactor tests to use the updated functions. > > Tested manually, but will wait for GHA to verify that the change is OK. Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Copyright year ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17240/files - new: https://git.openjdk.org/jdk/pull/17240/files/910a863c..402b6727 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17240&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17240&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17240.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17240/head:pull/17240 PR: https://git.openjdk.org/jdk/pull/17240 From rehn at openjdk.org Fri Jan 5 08:26:10 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 5 Jan 2024 08:26:10 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v11] In-Reply-To: References: Message-ID: > Hi, please consider. > > Main author is @luhenry, I only fixed some minor things and tested it. > > Such as: > test/hotspot/jtreg/compiler/intrinsics/sha/ > test/jdk/java/security/MessageDigest/ > test/jdk/jdk/security/ > tier1 > > And still running some test. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 20 additional commits since the last revision: - Fixed comment - Fixed flags - Fixed vlen 128 - Merge branch 'master' into sha256 - fixed lmul - remove merge, renames - Easier reg layout and 128/m2 - Minor update - index store state back - t2 caller saved, no need to push/pop - ... 
and 10 more: https://git.openjdk.org/jdk/compare/2dbeeb7f...01105251 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16562/files - new: https://git.openjdk.org/jdk/pull/16562/files/eefcd269..01105251 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=09-10 Stats: 1005 lines in 62 files changed: 519 ins; 292 del; 194 mod Patch: https://git.openjdk.org/jdk/pull/16562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16562/head:pull/16562 PR: https://git.openjdk.org/jdk/pull/16562 From rehn at openjdk.org Fri Jan 5 08:26:11 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 5 Jan 2024 08:26:11 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v10] In-Reply-To: References: <-iUbCXbfByEaraLzkSAK34_EewpImIUYRPLztruHOv0=.a6b5cab8-70ab-4bc5-8271-9e64232acfcf@github.com> Message-ID: On Wed, 3 Jan 2024 07:12:15 GMT, Robbin Ehn wrote: >> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3792: >> >>> 3790: Register scalarconst, VectorRegister vtemp, VectorRegister vtemp2, VectorRegister v_abef, VectorRegister v_cdgh, >>> 3791: bool gen_words = true, bool step_const = true) { >>> 3792: __ vl1reXX_v(vset_sew, vtemp, scalarconst); >> >> Shouldn't we use `vleXX_v` to load the constant for each round instead of `vl1reXX_v`? `vl1reXX_v` which delegates work to `vl1re32_v`/vl1re64_v only loads a single vector register and is not aware of the LMUL setting. I see the openssl version is using `vle32_v`/`vle64_v` to load 4 e32/e64 elements of the constants for each round. > > Thank you, this revealed an issue in my testing. qemu vlen setting was not properly set. > As it should not have passed with 4xe64 with vlen 128. 
Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1442604552 From rehn at openjdk.org Fri Jan 5 08:26:16 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 5 Jan 2024 08:26:16 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v10] In-Reply-To: References: <-iUbCXbfByEaraLzkSAK34_EewpImIUYRPLztruHOv0=.a6b5cab8-70ab-4bc5-8271-9e64232acfcf@github.com> Message-ID: On Sat, 23 Dec 2023 03:06:30 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed lmul > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3934: > >> 3932: // >> 3933: // e32/e64: vector of 32b/64b/4B/8B elements >> 3934: // m1: LMUL=1 > > This line of comment needs to be updated to reflect the latest changes. Fixed > src/hotspot/cpu/riscv/vm_version_riscv.cpp line 169: > >> 167: if (UseRVV) { >> 168: if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >> 169: FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); > > As the code comment in shared code indicates [1], there should be a dependency between these flags and `UseSHA`. But seems we are still lacking the necessary logic for that and `UseSHA` is always false [2]. We might need similar handling like our aarch64 counterpart [3]. 
> > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/globals.hpp#L342 > [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/vm_version_riscv.cpp#L151 > [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp#L323-L376 Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1442604686 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1442604860 From rehn at openjdk.org Fri Jan 5 08:29:28 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 5 Jan 2024 08:29:28 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v11] In-Reply-To: References: Message-ID: On Fri, 5 Jan 2024 08:26:10 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 20 additional commits since the last revision: > > - Fixed comment > - Fixed flags > - Fixed vlen 128 > - Merge branch 'master' into sha256 > - fixed lmul > - remove merge, renames > - Easier reg layout and 128/m2 > - Minor update > - index store state back > - t2 caller saved, no need to push/pop > - ... and 10 more: https://git.openjdk.org/jdk/compare/a79d2ea2...01105251 Ran the tests (compiler/intrinsics/sha/) with -XX:+PrintFlagsFinal and verified the MaxVectorSize was 16/32. Manually tested the flags. 
Thanks, Robbin ------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1878297512 From sroy at openjdk.org Fri Jan 5 08:55:28 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Fri, 5 Jan 2024 08:55:28 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v7] In-Reply-To: <2sMyJ8mZ6EIULC67tK1IcI4uNnJMvpCzw1BKEDUaIms=.c90f1101-236e-4a80-869c-feca6abd3dc3@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <_WrW-iHHdU-IgC7Z1b6oe_Qh0dkC6P3KJAdl7J2S1Do=.712dd065-6207-4632-a82f-8e12ad023cd5@github.com> <2sMyJ8mZ6EIULC67tK1IcI4uNnJMvpCzw1BKEDUaIms=.c90f1101-236e-4a80-869c-feca6abd3dc3@github.com> Message-ID: On Thu, 21 Dec 2023 10:01:04 GMT, Thomas Stuefe wrote: >>> > > What happens if we accidentally attempt to load a "real" static library, which is also named *.a? Would dlopen() then crash? What would happen? >>> >>> > I don't think the problem is with *.a . They would load as the default behaviour of the dlopen. It is only when the dlopen fails for *.so , we give another chance to check for .a file with the same name. >>> >>> No, what I meant, and what must be clarified before going forward with this solution, is the following: >>> >>> * is _every_ `*.a` object on AIX loadable with `dlopen`, and will the result be the same as when loading a `*.so` object >>> * or, if we present arbitrary `*.a` files to dlopen, is there a chance for dlopen to crash or misbehave. >>> >>> Reason is that I was under the impression that *.a libraries are static libraries and cannot be loaded dynamically. This is what you now try to do. >>> If we cannot safely answer this question, I would opt for a more narrow solution by hard-wiring known alternative names. So, do the second *.a attempt only for your `ibm_16_am.a` which you know works. That could also be done in a reasonably maintainable manner. >>> >> In AIX, both static and dynamic libraries have *.a extension. 
And AIX also supports *.so files. Basically, shared objects on AIX can have either a *.a or a *.so extension. Hence we need to implement this logic. >> If we try loading a static archive specifically, how dlopen would behave is something @JoKern65 can probably answer? >> >> >>> > > Does this really have to be handled in the OpenJDK? What does J9 on AIX do? Could this be done in a simpler way outside OpenJDK, e.g. by providing an *.so variant of the library in question? Where does this library come from? >>> >>> > I am not sure how J9 handles this. I would have to consult. >>> >>> J9 is Open Source, can't you just look? :) >> >> I did try comparing the file structures, and I do not see a similar file structure over there. >> I am unable to find the jvmTiAgent code or the os_aix file, so I am not sure which functions over there provide the same functionality. Do you have any suggestion on how I can check and correlate? >>> >>> > However, as per current observation, this issue does not show up on Semeru. This issue is only happening on Adoptium. The team that releases these files has always released *.a files, which work fine for Semeru. >>> >>> I don't know what Semeru is. What is the context, is that a different VM? Also OpenJDK? J9 derived? >> >> >> Semeru is J9 derived. > >> > > > What happens if we accidentally attempt to load a "real" static library, which is also named *.a? Would dlopen() then crash? What would happen? >> > >> > >> > > I don't think the problem is with *.a. They would load as the default behaviour of dlopen. It is only when dlopen fails for *.so that we give another chance to check for a .a file with the same name.
>> > >> > >> > No, what I meant, and what must be clarified before going forward with this solution, is the following: >> > >> > * is _every_ `*.a` object on AIX loadable with `dlopen`, and will the result be the same as when loading a `*.so` object >> > * or, if we present arbitrary `*.a` files to dlopen, is there a chance for dlopen to crash or misbehave. >> > >> > Reason is that I was under the impression that *.a libraries are static libraries and cannot be loaded dynamically. This is what you now try to do. >> > If we cannot safely answer this question, I would opt for a more narrow solution by hard-wiring known alternative names. So, do the second *.a attempt only for your `ibm_16_am.a` which you know works. That could also be done in a reasonably maintainable manner. >> >> In AIX, both static and dynamic libraries have *.a extension. And AIX also supports *.so files. Basically, shared objects on AIX can have either a *.a or a *.so extension. Hence we need to implement this logic. If we try loading a static archive specifically, how dlopen would behave is something @JoKern65 can probably answer? > > Rather, this is a question you have to ask your colleagues at IBM who develop the AIX libc. > > Since AIX libc is not open source, we cannot look for ourselves, nor can Joachim (he works at SAP). > >> >> > > > Does this really have to be handled in the OpenJDK? What does J9 on AIX do? Could this be done in a simpler way outside OpenJDK, e.g. by providing an *.so variant of the library in question? Where does this library come from? >> > >> > >> > > I am not sure how J9 handles this. I would have to consult. >> > >> > >> > J9 is Open Source, can't you just look? :) >> >> I did try comparing the file structures, and I do not see a similar file structure over there. I am unable to find the jvmTiAgent code or the os_aix file, so I am not sure which functions over there provide the same functionality. Do you have any suggestion on how I can check and correlate?
> > Someone must implement LoadLibrary. Try looking for places where dlopen() is called. > >> >> > > However as per current observation, this issue does ... Hi @tstuefe Clarifications on your questions. > Hi, > > some requests and questions: > > * Does this really have to be handled in the OpenJDK? What does J9 on AIX do? Could this be done in a simpler way outside OpenJDK, e.g. by providing an *.so variant of the library in question? Where does this library come from? In J9, there was a syshelp native file that appended .a to the filename until Java 14. After that, the appending is handled in the JavaClassLoader. However, when I tried to do the same on OpenJDK, it didn't work, so I had to trace into the hotspot code. This change will affect some Panama changes too. I think @TheRealMDoerr has faced issues with another library as well. Long term, providing a .so variant may not be feasible, as it would be a continuous process, IMO. > * What happens if we accidentally attempt to load a "real" static library, which is also named *.a? Would dlopen() then crash? What would happen? I enquired about this. The dlopen call fails with an error; the program does not crash when trying to load a static library. > * What happens if the original path handed to os::dll_load is already a *.a file? Should the logic then be reversed? Already explained above. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1878327576 From rehn at openjdk.org Fri Jan 5 09:06:47 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 5 Jan 2024 09:06:47 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions [v4] In-Reply-To: References: Message-ID: > Hi, these are the instructions for zcb. > > Due to our lack of infrastructure, having multiple extension-dependent instructions does not fit well. > Some of these compressed instructions are also missing a 1-to-1 mapping, e.g. now we have a compressed not, but the corresponding uncompressed instruction is still xor.
> I think we need to do some rework here. > > I also don't like the macro expansion, as it is hopeless in debuggers and 'IDE's (vim+rtags for me). > (macro stuff was originally done when templates were blacklisted in hotspot) > > And I don't want an option for this, as zcb is coming in hwprobe: if you have compressed on, you get them if they are supported (may depend on e.g. zbb). > > I have done some modifications since it passed tier1, so I'm running stuff over the weekend. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Merge branch 'master' into zcb - Merge branch 'master' into zcb - Merge branch 'master' into zcb - zcb instruction set ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17122/files - new: https://git.openjdk.org/jdk/pull/17122/files/4fa46f3c..c30caa0e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17122&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17122&range=02-03 Stats: 4265 lines in 324 files changed: 2613 ins; 663 del; 989 mod Patch: https://git.openjdk.org/jdk/pull/17122.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17122/head:pull/17122 PR: https://git.openjdk.org/jdk/pull/17122 From rehn at openjdk.org Fri Jan 5 09:05:57 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 5 Jan 2024 09:05:57 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v12] In-Reply-To: References: Message-ID: > Hi, please consider. > > Main author is @luhenry, I only fixed some minor things and tested it. > > Such as: > test/hotspot/jtreg/compiler/intrinsics/sha/ > test/jdk/java/security/MessageDigest/ > test/jdk/jdk/security/ > tier1 > > And still running some tests. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase.
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: - Merge branch 'master' into sha256 - Fixed comment - Fixed flags - Fixed vlen 128 - Merge branch 'master' into sha256 - fixed lmul - remove merge, renames - Easier reg layout and 128/m2 - Minor update - index store state back - ... and 11 more: https://git.openjdk.org/jdk/compare/2838a364...2442b9c6 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16562/files - new: https://git.openjdk.org/jdk/pull/16562/files/01105251..2442b9c6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=10-11 Stats: 3353 lines in 268 files changed: 2127 ins; 412 del; 814 mod Patch: https://git.openjdk.org/jdk/pull/16562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16562/head:pull/16562 PR: https://git.openjdk.org/jdk/pull/16562 From stefank at openjdk.org Fri Jan 5 09:10:34 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 Jan 2024 09:10:34 GMT Subject: RFR: 8322920: Some ProcessTools.execute* functions are declared to throw Throwable [v2] In-Reply-To: <-LmjrWlv9MLQKE-D_pYAvoON1RcsV_nXZV0FSV9eU6I=.4c035aa5-eb06-45ba-94d4-c23a37e949f4@github.com> References: <-LmjrWlv9MLQKE-D_pYAvoON1RcsV_nXZV0FSV9eU6I=.4c035aa5-eb06-45ba-94d4-c23a37e949f4@github.com> Message-ID: <2feO9th9WhvBCIf18liAUgASvP_3i51NHk0dcHsxpdI=.60a3c59d-cec0-43f4-bca5-68c80191596f@github.com> On Fri, 5 Jan 2024 08:22:41 GMT, Stefan Karlsson wrote: >> Most functions in ProcessTools are declared to throw Exceptions, or one of its subclasses. However, there are a small number of functions that are unnecessarily declared to throw Throwable instead of Exception. I propose that we change them to also be declared to throw Exception. >> >> This is a trivial patch to make it easier to refactor tests to use the updated functions. 
>> >> Tested manually, but will wait for GHA to verify that the change is OK. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Copyright year Thanks for the reviews. Testing with Tier1-3 passes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17240#issuecomment-1878341513 From stefank at openjdk.org Fri Jan 5 09:10:36 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 Jan 2024 09:10:36 GMT Subject: Integrated: 8322920: Some ProcessTools.execute* functions are declared to throw Throwable In-Reply-To: References: Message-ID: On Wed, 3 Jan 2024 09:51:24 GMT, Stefan Karlsson wrote: > Most functions in ProcessTools are declared to throw Exceptions, or one of its subclasses. However, there are a small number of functions that are unnecessarily declared to throw Throwable instead of Exception. I propose that we change them to also be declared to throw Exception. > > This is a trivial patch to make it easier to refactor tests to use the updated functions. > > Tested manually, but will wait for GHA to verify that the change is OK. This pull request has now been integrated. Changeset: 868f8745 Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/868f8745faf70c915d8294ae8f85b2d6aa096900 Stats: 6 lines in 1 file changed: 0 ins; 2 del; 4 mod 8322920: Some ProcessTools.execute* functions are declared to throw Throwable Reviewed-by: dholmes, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/17240 From rehn at openjdk.org Fri Jan 5 09:18:24 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 5 Jan 2024 09:18:24 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions [v3] In-Reply-To: References: Message-ID: On Wed, 3 Jan 2024 06:08:26 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into zcb >> - Merge branch 'master' into zcb >> - zcb instruction set > > src/hotspot/cpu/riscv/assembler_riscv.hpp line 542: > >> 540: INSN(_lbu, 0b0000011, 0b100); // Zcb >> 541: INSN(_lh, 0b0000011, 0b001); // Zcb >> 542: INSN(_lhu, 0b0000011, 0b101); // Zcb > > The code comment for these three lines seems a bit misleading. These are normal 4-byte encoding load/store instructions, not `Zcb` compressed instructions. The comment was supposed to mean that these are 'overridden' by zcb, not by C like the others. I'll clarify that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17122#discussion_r1442650110 From rehn at openjdk.org Fri Jan 5 09:33:35 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 5 Jan 2024 09:33:35 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions [v5] In-Reply-To: References: Message-ID: > Hi, these are the instructions for zcb. > > Due to our lack of infrastructure, having multiple extension-dependent instructions does not fit well. > Some of these compressed instructions are also missing a 1-to-1 mapping, e.g. now we have a compressed not, but the corresponding uncompressed instruction is still xor. > I think we need to do some rework here. > > I also don't like the macro expansion, as it is hopeless in debuggers and 'IDE's (vim+rtags for me). > (macro stuff was originally done when templates were blacklisted in hotspot) > > And I don't want an option for this, as zcb is coming in hwprobe: if you have compressed on, you get them if they are supported (may depend on e.g. zbb). > > I have done some modifications since it passed tier1, so I'm running stuff over the weekend.
Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Review fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17122/files - new: https://git.openjdk.org/jdk/pull/17122/files/c30caa0e..f677ca34 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17122&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17122&range=03-04 Stats: 7 lines in 2 files changed: 0 ins; 2 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/17122.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17122/head:pull/17122 PR: https://git.openjdk.org/jdk/pull/17122 From rehn at openjdk.org Fri Jan 5 09:55:26 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 5 Jan 2024 09:55:26 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions [v3] In-Reply-To: References: Message-ID: On Wed, 3 Jan 2024 06:56:21 GMT, Fei Yang wrote: > Seems fine. I only have some minor comments. Thank you! > src/hotspot/cpu/riscv/assembler_riscv.hpp line 2962: > >> 2960: } >> 2961: >> 2962: // Format CU, c.[sz]ext.*, c.no > > Nit: s/c.no/c.not/ Fixed > src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 497: > >> 495: inline void zext_b(Register Rd, Register Rs) { >> 496: if (do_compress_zcb(Rd, Rs) && >> 497: (Rd == Rs)) { > > Nit: Maybe put the two conditions on the same line to be consistent in style with the other two `notr` & `zext_w` in the same file. 
> `if (do_compress_zcb(Rd, Rs) && (Rd == Rs)) {` Fixed ------------- PR Comment: https://git.openjdk.org/jdk/pull/17122#issuecomment-1878400415 PR Review Comment: https://git.openjdk.org/jdk/pull/17122#discussion_r1442683177 PR Review Comment: https://git.openjdk.org/jdk/pull/17122#discussion_r1442682874 From rehn at openjdk.org Fri Jan 5 09:55:30 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 5 Jan 2024 09:55:30 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions [v3] In-Reply-To: References: Message-ID: On Fri, 5 Jan 2024 09:15:52 GMT, Robbin Ehn wrote: >> src/hotspot/cpu/riscv/assembler_riscv.hpp line 542: >> >>> 540: INSN(_lbu, 0b0000011, 0b100); // Zcb >>> 541: INSN(_lh, 0b0000011, 0b001); // Zcb >>> 542: INSN(_lhu, 0b0000011, 0b101); // Zcb >> >> The code comment for these three lines seems a bit misleading. These are normal 4-byte encoding load/store instructions, not `Zcb` compressed instructions. > > The comment was supposed to mean that these are 'overridden' by zcb, not by C like the others. > I'll clarify that. I removed them; fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17122#discussion_r1442683028 From vkempik at openjdk.org Fri Jan 5 10:36:24 2024 From: vkempik at openjdk.org (Vladimir Kempik) Date: Fri, 5 Jan 2024 10:36:24 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions [v5] In-Reply-To: References: Message-ID: <8CEPDnazZKXkn8FmbFw0IzHj6EkORsetG6Nze4Whs3Q=.e2319b5d-fdc6-4cb6-b37e-ebe930e34a33@github.com> On Fri, 5 Jan 2024 09:33:35 GMT, Robbin Ehn wrote: >> Hi, these are the instructions for zcb. >> >> Due to our lack of infrastructure, having multiple extension-dependent instructions does not fit well. >> Some of these compressed instructions are also missing a 1-to-1 mapping, e.g. now we have a compressed not, but the corresponding uncompressed instruction is still xor. >> I think we need to do some rework here.
>> >> I also don't like the macro expansion, as it is hopeless in debuggers and 'IDE's (vim+rtags for me). >> (macro stuff was originally done when templates were blacklisted in hotspot) >> >> And I don't want an option for this, as zcb is coming in hwprobe: if you have compressed on, you get them if they are supported (may depend on e.g. zbb). >> >> I have done some modifications since it passed tier1, so I'm running stuff over the weekend. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Review fixes Marked as reviewed by vkempik (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17122#pullrequestreview-1805709077 From thartmann at openjdk.org Fri Jan 5 10:48:25 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 5 Jan 2024 10:48:25 GMT Subject: RFR: 8320310: CompiledMethod::has_monitors flag can be incorrect [v4] In-Reply-To: References: <4efExybeWDkEbcsckI1Qdz8kpYFqd-Rbmt7oiWz5qlo=.d8d38d0e-affa-48dc-b963-45f958041c4e@github.com> Message-ID: On Mon, 11 Dec 2023 18:38:55 GMT, Jorn Vernee wrote: >> Currently, the `CompiledMethod::has_monitors` flag is set when either a `monitorenter` is parsed by C1, and `monitorexit` is parsed by C1 or C2 during method compilation. However, not necessarily every bytecode of a method is parsed, which means that we could miss all `monitorenter`/`monitorexit` byte codes in a method, while it actually does use monitors.
This can lead to situations where a thread holds a monitor, but `has_monitors` for all frames is set to `false`, leading to an assertion failure in 'freeze_internal' in continuationFreezeThaw.cpp: >> >> assert(monitors_on_stack(current) == ((current->held_monitor_count() - current->jni_monitor_count()) > 0), >> "Held monitor count and locks on stack invariant: " INT64_FORMAT " JNI: " INT64_FORMAT, (int64_t)current->held_monitor_count(), (int64_t)current->jni_monitor_count()); >> >> The proposed fix is to rely on `Method::has_monitor_bytecodes` to set the `has_monitors` flag when compiling, which is immune to issues where not all byte codes of a method are parsed during compilation. We can follow the pattern established for `has_reserved_stack_access`, which is similar. >> >> Note that this PR is based on: https://github.com/openjdk/jdk/pull/16416 which disables the assertion. The goal of this PR is to fix the issue, and then re-enable the assertion. >> >> Testing: Tier 1-4, `java/lang/Thread/virtual/stress/PinALot.java` > > Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision: > > re-enable assert again Looks good to me. ------------- Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16799#pullrequestreview-1805725501 From shade at openjdk.org Fri Jan 5 11:40:41 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 Jan 2024 11:40:41 GMT Subject: RFR: 8321137: Reconsider ICStub alignment Message-ID: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. 
The footprint improvements on some architectures come as a side effect of untying the `ICStub` size from `CodeEntryAlignment` and tying it to the (sometimes lower) cache line size. Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. The current patch affects `ICStub` size in different ways on different platforms, since the current size is effectively 2x`CodeEntryAlignment` and the new size is the cache line size: - AArch64: 128 -> 64 bytes :) - x86_64: 64 -> 64 bytes :| - x86_32: 32 -> 64 bytes :( - PPC64: 512 -> 128 bytes :)) - S390X: 128 -> 256 bytes :( - ARM: 32 -> 64 bytes :( - Zero: Additional testing: - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` ------------- Commit messages: - Work Changes: https://git.openjdk.org/jdk/pull/17277/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17277&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321137 Stats: 66 lines in 6 files changed: 39 ins; 9 del; 18 mod Patch: https://git.openjdk.org/jdk/pull/17277.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17277/head:pull/17277 PR: https://git.openjdk.org/jdk/pull/17277 From mdoerr at openjdk.org Fri Jan 5 12:13:27 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 5 Jan 2024 12:13:27 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v7] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Wed, 20 Dec 2023 11:16:03 GMT, Suchismith Roy wrote: >> J2SE agent does not start and throws an error when it tries to find the shared library ibm_16_am. >> After searching for ibm_16_am.so, the jvm agent throws an error as dll_load fails. It fails to identify the shared library ibm_16_am.a, a shared archive file on AIX.
>> Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > Spaces fix I have tried to build jextract (https://github.com/openjdk/jextract/tree/jdk22) with LLVM (https://github.com/llvm/llvm-project/releases/download/llvmorg-16.0.4/clang+llvm-16.0.4-powerpc64-ibm-aix-7.2.tar.xz). I noticed that llvm mainly consists of .a files. So, I think we need to support that for FFI compatibility with other libraries and open source projects. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1878570142 From thartmann at openjdk.org Fri Jan 5 13:49:33 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 5 Jan 2024 13:49:33 GMT Subject: RFR: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used Message-ID: [JDK-8318562](https://bugs.openjdk.org/browse/JDK-8318562) broke implicit null checking for cvt instructions. See JBS for details. Let's back it out for now because this is a potential showstopper for JDK 17.0.11, 21.0.2 and 22 and needs to be backported asap. The backout applies cleanly. 
Thanks, Tobias ------------- Commit messages: - Revert "8318562: Computational test more than 2x slower when AVX instructions are used" Changes: https://git.openjdk.org/jdk/pull/17279/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17279&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322985 Stats: 247 lines in 4 files changed: 0 ins; 245 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17279.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17279/head:pull/17279 PR: https://git.openjdk.org/jdk/pull/17279 From chagedorn at openjdk.org Fri Jan 5 13:49:34 2024 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Fri, 5 Jan 2024 13:49:34 GMT Subject: RFR: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used In-Reply-To: References: Message-ID: On Fri, 5 Jan 2024 13:41:51 GMT, Tobias Hartmann wrote: > [JDK-8318562](https://bugs.openjdk.org/browse/JDK-8318562) broke implicit null checking for cvt instructions. See JBS for details. Let's back it out for now because this is a potential showstopper for JDK 17.0.11, 21.0.2 and 22 and needs to be backported asap. > > The backout applies cleanly. > > Thanks, > Tobias Sounds reasonable, backout looks good. ------------- Marked as reviewed by chagedorn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17279#pullrequestreview-1805997423 From thartmann at openjdk.org Fri Jan 5 13:52:21 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 5 Jan 2024 13:52:21 GMT Subject: RFR: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used In-Reply-To: References: Message-ID: On Fri, 5 Jan 2024 13:41:51 GMT, Tobias Hartmann wrote: > [JDK-8318562](https://bugs.openjdk.org/browse/JDK-8318562) broke implicit null checking for cvt instructions. See JBS for details. Let's back it out for now because this is a potential showstopper for JDK 17.0.11, 21.0.2 and 22 and needs to be backported asap. 
> > The backout applies cleanly. > > Thanks, > Tobias Thanks, Christian! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17279#issuecomment-1878688463 From shade at openjdk.org Fri Jan 5 14:13:21 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 Jan 2024 14:13:21 GMT Subject: RFR: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used In-Reply-To: References: Message-ID: <-yJho0qdlbWPLhY9FnPEOueCN2q2G52D0Vlbp7fl6vQ=.a8752874-c235-4661-8bc3-7f2c0f5d72ac@github.com> On Fri, 5 Jan 2024 13:41:51 GMT, Tobias Hartmann wrote: > [JDK-8318562](https://bugs.openjdk.org/browse/JDK-8318562) broke implicit null checking for cvt instructions. See JBS for details. Let's back it out for now because this is a potential showstopper for JDK 17.0.11, 21.0.2 and 22 and needs to be backported asap. > > The backout applies cleanly. > > Thanks, > Tobias All right, dang. I would handle 17.0.11 (April 2024) backout. For 21.0.2, we need to coordinate with Rob. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17279#pullrequestreview-1806036267 From thartmann at openjdk.org Fri Jan 5 14:35:22 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 5 Jan 2024 14:35:22 GMT Subject: RFR: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used In-Reply-To: References: Message-ID: On Fri, 5 Jan 2024 13:41:51 GMT, Tobias Hartmann wrote: > [JDK-8318562](https://bugs.openjdk.org/browse/JDK-8318562) broke implicit null checking for cvt instructions. See JBS for details. Let's back it out for now because this is a potential showstopper for JDK 17.0.11, 21.0.2 and 22 and needs to be backported asap. > > The backout applies cleanly. > > Thanks, > Tobias Thanks Aleksey, I'll reach out to Rob by email. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17279#issuecomment-1878773662 From thartmann at openjdk.org Fri Jan 5 15:43:30 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 5 Jan 2024 15:43:30 GMT Subject: RFR: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used In-Reply-To: References: Message-ID: <7TjzO97yGrcUmaHN20VubqCzIOKNz6Zb0GqVsSBd5ms=.8ac9c519-3ea7-419b-88db-ab7d150ee8cb@github.com> On Fri, 5 Jan 2024 13:41:51 GMT, Tobias Hartmann wrote: > [JDK-8318562](https://bugs.openjdk.org/browse/JDK-8318562) broke implicit null checking for cvt instructions. See JBS for details. Let's back it out for now because this is a potential showstopper for JDK 17.0.11, 21.0.2 and 22 and needs to be backported asap. > > The backout applies cleanly. > > Thanks, > Tobias Testing is clean. I'm integrating this so that we can prepare the backports asap. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17279#issuecomment-1878866058 From thartmann at openjdk.org Fri Jan 5 15:43:31 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 5 Jan 2024 15:43:31 GMT Subject: Integrated: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used In-Reply-To: References: Message-ID: On Fri, 5 Jan 2024 13:41:51 GMT, Tobias Hartmann wrote: > [JDK-8318562](https://bugs.openjdk.org/browse/JDK-8318562) broke implicit null checking for cvt instructions. See JBS for details. Let's back it out for now because this is a potential showstopper for JDK 17.0.11, 21.0.2 and 22 and needs to be backported asap. > > The backout applies cleanly. > > Thanks, > Tobias This pull request has now been integrated. 
Changeset: ed9f3243 Author: Tobias Hartmann URL: https://git.openjdk.org/jdk/commit/ed9f3243f04718a50bbdc589437872f7215c0e08 Stats: 247 lines in 4 files changed: 0 ins; 245 del; 2 mod 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used Reviewed-by: chagedorn, shade ------------- PR: https://git.openjdk.org/jdk/pull/17279 From pchilanomate at openjdk.org Fri Jan 5 15:55:23 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 5 Jan 2024 15:55:23 GMT Subject: [jdk22] RFR: 8320275: assert(_chunk->bitmap().at(index)) failed: Bit not set at index In-Reply-To: References: Message-ID: On Thu, 4 Jan 2024 15:22:16 GMT, Patricio Chilano Mateo wrote: > Hi all, > > This pull request contains a backport of commit [e9e694f4](https://github.com/openjdk/jdk/commit/e9e694f4ef7b080d7fe1ad5b2f2daa2fccd0456e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Patricio Chilano Mateo on 2 Jan 2024 and was reviewed by Dean Long and Frederic Parain. > > Thanks! Thanks David! ------------- PR Comment: https://git.openjdk.org/jdk22/pull/29#issuecomment-1878882407 From pchilanomate at openjdk.org Fri Jan 5 15:55:24 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 5 Jan 2024 15:55:24 GMT Subject: [jdk22] Integrated: 8320275: assert(_chunk->bitmap().at(index)) failed: Bit not set at index In-Reply-To: References: Message-ID: On Thu, 4 Jan 2024 15:22:16 GMT, Patricio Chilano Mateo wrote: > Hi all, > > This pull request contains a backport of commit [e9e694f4](https://github.com/openjdk/jdk/commit/e9e694f4ef7b080d7fe1ad5b2f2daa2fccd0456e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Patricio Chilano Mateo on 2 Jan 2024 and was reviewed by Dean Long and Frederic Parain. > > Thanks! This pull request has now been integrated. 
Changeset: 01cb043b Author: Patricio Chilano Mateo URL: https://git.openjdk.org/jdk22/commit/01cb043b293e7bec822fb034008ad42b1cb8b481 Stats: 81 lines in 17 files changed: 38 ins; 9 del; 34 mod 8320275: assert(_chunk->bitmap().at(index)) failed: Bit not set at index Reviewed-by: dholmes Backport-of: e9e694f4ef7b080d7fe1ad5b2f2daa2fccd0456e ------------- PR: https://git.openjdk.org/jdk22/pull/29 From thartmann at openjdk.org Fri Jan 5 15:56:41 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 5 Jan 2024 15:56:41 GMT Subject: [jdk22] RFR: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used Message-ID: Hi all, This pull request contains a backport of commit [ed9f3243](https://github.com/openjdk/jdk/commit/ed9f3243f04718a50bbdc589437872f7215c0e08) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Tobias Hartmann on 5 Jan 2024 and was reviewed by Christian Hagedorn and Aleksey Shipilev. Thanks! 
------------- Commit messages: - Backport ed9f3243f04718a50bbdc589437872f7215c0e08 Changes: https://git.openjdk.org/jdk22/pull/33/files Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=33&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322985 Stats: 247 lines in 4 files changed: 0 ins; 245 del; 2 mod Patch: https://git.openjdk.org/jdk22/pull/33.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/33/head:pull/33 PR: https://git.openjdk.org/jdk22/pull/33 From shade at openjdk.org Fri Jan 5 16:04:27 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 Jan 2024 16:04:27 GMT Subject: [jdk22] RFR: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used In-Reply-To: References: Message-ID: On Fri, 5 Jan 2024 15:50:39 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [ed9f3243](https://github.com/openjdk/jdk/commit/ed9f3243f04718a50bbdc589437872f7215c0e08) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 5 Jan 2024 and was reviewed by Christian Hagedorn and Aleksey Shipilev. > > Thanks! Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk22/pull/33#pullrequestreview-1806229440 From kvn at openjdk.org Fri Jan 5 17:10:27 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 5 Jan 2024 17:10:27 GMT Subject: [jdk22] RFR: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used In-Reply-To: References: Message-ID: On Fri, 5 Jan 2024 15:50:39 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [ed9f3243](https://github.com/openjdk/jdk/commit/ed9f3243f04718a50bbdc589437872f7215c0e08) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. 
> > The commit being backported was authored by Tobias Hartmann on 5 Jan 2024 and was reviewed by Christian Hagedorn and Aleksey Shipilev. > > Thanks! Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk22/pull/33#pullrequestreview-1806420238 From thartmann at openjdk.org Fri Jan 5 17:25:23 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 5 Jan 2024 17:25:23 GMT Subject: [jdk22] RFR: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used In-Reply-To: References: Message-ID: On Fri, 5 Jan 2024 15:50:39 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [ed9f3243](https://github.com/openjdk/jdk/commit/ed9f3243f04718a50bbdc589437872f7215c0e08) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 5 Jan 2024 and was reviewed by Christian Hagedorn and Aleksey Shipilev. > > Thanks! Thanks for the reviews, Vladimir and Aleksey. ------------- PR Comment: https://git.openjdk.org/jdk22/pull/33#issuecomment-1879016188 From thartmann at openjdk.org Fri Jan 5 17:34:39 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 5 Jan 2024 17:34:39 GMT Subject: [jdk22] Integrated: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used In-Reply-To: References: Message-ID: <3ggtnSKAiLXQC1cx-NbTL8lxEP_Jb_yg56VWIcVMpA8=.34de9319-16a2-4fd5-8bbb-d139e4612d1f@github.com> On Fri, 5 Jan 2024 15:50:39 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [ed9f3243](https://github.com/openjdk/jdk/commit/ed9f3243f04718a50bbdc589437872f7215c0e08) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 5 Jan 2024 and was reviewed by Christian Hagedorn and Aleksey Shipilev. > > Thanks! 
This pull request has now been integrated. Changeset: 28279ee6 Author: Tobias Hartmann URL: https://git.openjdk.org/jdk22/commit/28279ee615b3b6df7428da84365d2310ba814f60 Stats: 247 lines in 4 files changed: 0 ins; 245 del; 2 mod 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used Reviewed-by: shade, kvn Backport-of: ed9f3243f04718a50bbdc589437872f7215c0e08 ------------- PR: https://git.openjdk.org/jdk22/pull/33 From dlong at openjdk.org Sat Jan 6 01:03:25 2024 From: dlong at openjdk.org (Dean Long) Date: Sat, 6 Jan 2024 01:03:25 GMT Subject: RFR: 8321137: Reconsider ICStub alignment In-Reply-To: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> Message-ID: On Fri, 5 Jan 2024 11:32:03 GMT, Aleksey Shipilev wrote: > This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. > > Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. 
> > Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: > - AArch64: 128 -> 64 bytes :) > - x86_64: 64 -> 64 bytes :| > - x86_32: 32 -> 64 bytes :( > - PPC64: 512 -> 128 bytes :)) > - S390X: 128 -> 256 bytes :( > - ARM: 32 -> 64 bytes :( > - Zero: > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` > - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` src/hotspot/share/code/icBuffer.cpp line 229: > 227: p2i(ic_stub), p2i(ic_stub->code_begin()), p2i(rev_stub)); > 228: } > 229: #endif I think this sanity check would fit better in `new_ic_stub`(). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17277#discussion_r1443549531 From dlong at openjdk.org Sat Jan 6 01:16:25 2024 From: dlong at openjdk.org (Dean Long) Date: Sat, 6 Jan 2024 01:16:25 GMT Subject: RFR: 8321137: Reconsider ICStub alignment In-Reply-To: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> Message-ID: On Fri, 5 Jan 2024 11:32:03 GMT, Aleksey Shipilev wrote: > This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. > > Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. 
> > Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: > - AArch64: 128 -> 64 bytes :) > - x86_64: 64 -> 64 bytes :| > - x86_32: 32 -> 64 bytes :( > - PPC64: 512 -> 128 bytes :)) > - S390X: 128 -> 256 bytes :( > - ARM: 32 -> 64 bytes :( > - Zero: > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` > - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` Nice job minimizing the changes. ------------- Marked as reviewed by dlong (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17277#pullrequestreview-1807134218 From bulasevich at openjdk.org Sat Jan 6 14:05:33 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Sat, 6 Jan 2024 14:05:33 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v2] In-Reply-To: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: > The change simplifies the CodeCache::initialize_heaps segment memory split logic while preserving the existing layout: > > if (!non_nmethod_set && !profiled_set && !non_profiled_set) { > ... > } else if (!non_nmethod_set || !profiled_set || !non_profiled_set) { > if (non_profiled_set) { > if (!profiled_set) { > ... > } > } else if (profiled_set) { > ... > } else if (non_nmethod_set) { > ... > } > } > > --> > > if (!profiled.set && !non_profiled.set) { > .. > } > if (profiled.set && !non_profiled.set) { > .. > } > if (!profiled.set && non_profiled.set) { > .. > } > if (!non_nmethod.set && profiled.set && non_profiled.set) { > .. 
> } > > > With this change, PrintFlagsFinal shows the actual segment sizes (not an intermediate value before alignment), and the segments completely fill the ReservedCodeCacheSize (no wasted page due to final down alignment). Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: cleanup & test udpdate ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17244/files - new: https://git.openjdk.org/jdk/pull/17244/files/588f9820..d1415359 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=00-01 Stats: 21 lines in 1 file changed: 9 ins; 8 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/17244.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17244/head:pull/17244 PR: https://git.openjdk.org/jdk/pull/17244 From fyang at openjdk.org Sun Jan 7 08:36:27 2024 From: fyang at openjdk.org (Fei Yang) Date: Sun, 7 Jan 2024 08:36:27 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions [v5] In-Reply-To: References: Message-ID: On Fri, 5 Jan 2024 09:33:35 GMT, Robbin Ehn wrote: >> Hi, these are the instructions for Zcb. >> >> Due to our lack of infrastructure, having multiple extension-dependent instructions does not fit well. >> Some of these compressed instructions also lack a 1-to-1 mapping, e.g. we now have a compressed not, but the corresponding uncompressed instruction is still xor. >> I think we need to do some rework here. >> >> I also don't like the macro expansion, as it is hopeless in debuggers and 'IDE's (vim+rtags for me). >> (The macro stuff was originally done when templates were blacklisted in HotSpot.) >> >> And I don't want an option for this, as zcb is coming in hwprobe: if you have compressed on, you get them if they are supported (may depend on e.g. zbb). 
>> >> I have done some modification since it passed tier1, so I'm running stuff over the weekend. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Review fixes Updated change looks good. Thanks. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17122#pullrequestreview-1807762927 From mli at openjdk.org Sun Jan 7 17:32:31 2024 From: mli at openjdk.org (Hamlin Li) Date: Sun, 7 Jan 2024 17:32:31 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v12] In-Reply-To: References: Message-ID: On Fri, 5 Jan 2024 09:05:57 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: > > - Merge branch 'master' into sha256 > - Fixed comment > - Fixed flags > - Fixed vlen 128 > - Merge branch 'master' into sha256 > - fixed lmul > - remove merge, renames > - Easier reg layout and 128/m2 > - Minor update > - index store state back > - ... and 11 more: https://git.openjdk.org/jdk/compare/eedb8f98...2442b9c6 Thanks for updating, looks good to me. ------------- Marked as reviewed by mli (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16562#pullrequestreview-1807843750 From kbarrett at openjdk.org Mon Jan 8 01:41:47 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 8 Jan 2024 01:41:47 GMT Subject: RFR: 8322817: RISC-V: Eliminate -Wparentheses warnings in riscv code [v2] In-Reply-To: References: Message-ID: > Please review this change to eliminate some -Wparentheses warnings. 
This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: Local (linux-x64) cross-build for linux-riscv with this change plus > -Wparentheses enabled and other changes to allow that to work. > > Requesting someone from the riscv porters to properly test this. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge branch 'master' into riscv-wparentheses - simplify frame::equal assert - fix -Wparentheses warnings in riscv code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17216/files - new: https://git.openjdk.org/jdk/pull/17216/files/ef64e2e1..aefd232c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17216&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17216&range=00-01 Stats: 4892 lines in 422 files changed: 2586 ins; 929 del; 1377 mod Patch: https://git.openjdk.org/jdk/pull/17216.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17216/head:pull/17216 PR: https://git.openjdk.org/jdk/pull/17216 From kbarrett at openjdk.org Mon Jan 8 01:41:47 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 8 Jan 2024 01:41:47 GMT Subject: RFR: 8322817: RISC-V: Eliminate -Wparentheses warnings in riscv code [v2] In-Reply-To: References: Message-ID: On Mon, 8 Jan 2024 01:38:27 GMT, Kim Barrett wrote: >> Please review this change to eliminate some -Wparentheses warnings. This >> involved simply adding a few parentheses to make some implicit operator >> precedence explicit. >> >> Testing: Local (linux-x64) cross-build for linux-riscv with this change plus >> -Wparentheses enabled and other changes to allow that to work. >> >> Requesting someone from the riscv porters to properly test this. 
> > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into riscv-wparentheses > - simplify frame::equal assert > - fix -Wparentheses warnings in riscv code src/hotspot/cpu/riscv/frame_riscv.inline.hpp line 189: > 187: fp() == other.fp() && > 188: pc() == other.pc(); > 189: assert(!ret || (cb() == other.cb() && _deopt_state == other._deopt_state), "inconsistent construction"); Per comments in reviews for similar changes for other ports, I've removed the unnecessary "ret &&". Builds fine, but should probably be (re)tested by a port maintainer. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17216#discussion_r1444116349 From kbarrett at openjdk.org Mon Jan 8 03:34:33 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 8 Jan 2024 03:34:33 GMT Subject: RFR: 8323110: Eliminate -Wparentheses warnings in ppc code Message-ID: Please review this trivial change to eliminate a -Wparentheses warning. This involved simply adding parentheses to make the implicit operator precedence explicit. Testing: Locally (linux-x64) cross-compiled for linux-ppc64le. Also ran GHA with -Wparentheses enabled along with this and other changes needed to make that work. 
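[Editorial sketch, not part of the original messages.] The two ideas running through the -Wparentheses changes in this thread can be checked at compile time: dropping the redundant "ret &&" from the `frame::equal` assert does not change its truth table, and adding parentheses around `b && c` in `a || b && c` only spells out the precedence the compiler already applies. The function and parameter names below are hypothetical stand-ins, not HotSpot code.

```cpp
// Hypothetical stand-ins for the operands of the frame::equal assert;
// the names are invented for illustration only.

// Shape before the simplification: the inner "ret &&" is redundant.
constexpr bool implies_verbose(bool ret, bool consistent) {
  return !ret || (ret && consistent);
}

// Shape after the simplification: plain "ret implies consistent".
constexpr bool implies_simple(bool ret, bool consistent) {
  return !ret || consistent;
}

// && binds tighter than ||, so "a || b && c" already means "a || (b && c)".
// The parentheses only make that precedence explicit.
constexpr bool explicit_precedence(bool a, bool b, bool c) {
  return a || (b && c);
}

// The other possible grouping, which computes a different function.
constexpr bool other_grouping(bool a, bool b, bool c) {
  return (a || b) && c;
}

// Both assert shapes agree on all four inputs.
static_assert(implies_verbose(false, false) == implies_simple(false, false), "ff");
static_assert(implies_verbose(false, true)  == implies_simple(false, true),  "ft");
static_assert(implies_verbose(true,  false) == implies_simple(true,  false), "tf");
static_assert(implies_verbose(true,  true)  == implies_simple(true,  true),  "tt");

// The two groupings differ for a = true, b = false, c = false.
static_assert(explicit_precedence(true, false, false), "a || (b && c) is true");
static_assert(!other_grouping(true, false, false), "(a || b) && c is false");
```

The checks are `static_assert`s, so the fragment can be pasted into any C++11 translation unit and is verified by compiling it.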
------------- Commit messages: - fix -Wparentheses warnings in ppc code Changes: https://git.openjdk.org/jdk/pull/17293/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17293&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323110 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17293.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17293/head:pull/17293 PR: https://git.openjdk.org/jdk/pull/17293 From dholmes at openjdk.org Mon Jan 8 04:33:20 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Jan 2024 04:33:20 GMT Subject: RFR: 8323110: Eliminate -Wparentheses warnings in ppc code In-Reply-To: References: Message-ID: On Mon, 8 Jan 2024 03:29:36 GMT, Kim Barrett wrote: > Please review this trivial change to eliminate a -Wparentheses warning. > This involved simply adding parentheses to make the implicit operator > precedence explicit. > > Testing: Locally (linux-x64) cross-compiled for linux-ppc64le. Also ran GHA with > -Wparentheses enabled along with this and other changes needed to make that > work. Looks fine and trivial. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17293#pullrequestreview-1808010993 From fyang at openjdk.org Mon Jan 8 07:05:31 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 8 Jan 2024 07:05:31 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v12] In-Reply-To: References: Message-ID: On Fri, 5 Jan 2024 09:05:57 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 21 additional commits since the last revision: > > - Merge branch 'master' into sha256 > - Fixed comment > - Fixed flags > - Fixed vlen 128 > - Merge branch 'master' into sha256 > - fixed lmul > - remove merge, renames > - Easier reg layout and 128/m2 > - Minor update > - index store state back > - ... and 11 more: https://git.openjdk.org/jdk/compare/3cb4c349...2442b9c6 Two minor comments remain, otherwise looks good. src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4029: > 4027: if (multi_block) { > 4028: int total_adds = vset_sew == Assembler::e32 ? 240 : 608; > 4029: __ addi(consts, consts, -total_adds); Maybe leave a TODO about future investigation of preloading of constants in vector registers for SHA256? src/hotspot/cpu/riscv/vm_version_riscv.cpp line 269: > 267: FLAG_SET_DEFAULT(UseChaCha20Intrinsics, true); > 268: } > 269: } if (UseChaCha20Intrinsics) { What's the purpose of this change? ------------- PR Review: https://git.openjdk.org/jdk/pull/16562#pullrequestreview-1807963701 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1444225566 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1444133690 From rehn at openjdk.org Mon Jan 8 07:26:45 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 8 Jan 2024 07:26:45 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v13] In-Reply-To: References: Message-ID: > Hi, please consider. > > Main author is @luhenry, I only fixed some minor things and tested it. > > Such as: > test/hotspot/jtreg/compiler/intrinsics/sha/ > test/jdk/java/security/MessageDigest/ > test/jdk/jdk/security/ > tier1 > > And still running some test. 
Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Reverted accidental change ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16562/files - new: https://git.openjdk.org/jdk/pull/16562/files/2442b9c6..fad8fa1d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=11-12 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16562/head:pull/16562 PR: https://git.openjdk.org/jdk/pull/16562 From rehn at openjdk.org Mon Jan 8 07:26:50 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 8 Jan 2024 07:26:50 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v12] In-Reply-To: References: Message-ID: On Mon, 8 Jan 2024 07:00:49 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: >> >> - Merge branch 'master' into sha256 >> - Fixed comment >> - Fixed flags >> - Fixed vlen 128 >> - Merge branch 'master' into sha256 >> - fixed lmul >> - remove merge, renames >> - Easier reg layout and 128/m2 >> - Minor update >> - index store state back >> - ... and 11 more: https://git.openjdk.org/jdk/compare/dfc5673d...2442b9c6 > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4029: > >> 4027: if (multi_block) { >> 4028: int total_adds = vset_sew == Assembler::e32 ? 240 : 608; >> 4029: __ addi(consts, consts, -total_adds); > > Maybe leave a TODO about future investigation of preloading of constants in vector registers for SHA256? It's recorded here: https://bugs.openjdk.org/browse/JDK-8322177 Note that we can preload SHA512 for vlen 256, requires 20 vregs, also. 
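[Editorial sketch, not part of the original messages.] One way to arrive at the "20 vregs" figure above, assuming it counts the registers needed to hold every SHA-512 round constant at once: FIPS 180-4 defines 80 round constants of 64 bits each for SHA-512 (and 64 constants of 32 bits each for SHA-256), and an RVV implementation has 32 vector registers of VLEN bits. The helper name below is hypothetical.

```cpp
// Number of VLEN-bit vector registers needed to hold all round constants.
// Hypothetical helper for illustration; not taken from the PR.
constexpr int vregs_for_constants(int num_constants, int bits_per_constant,
                                  int vlen_bits) {
  return (num_constants * bits_per_constant + vlen_bits - 1) / vlen_bits;
}

// SHA-512: 80 constants x 64 bits = 5120 bits.
static_assert(vregs_for_constants(80, 64, 256) == 20, "SHA-512, VLEN=256");
static_assert(vregs_for_constants(80, 64, 128) == 40, "SHA-512, VLEN=128");

// SHA-256: 64 constants x 32 bits = 2048 bits.
static_assert(vregs_for_constants(64, 32, 256) == 8,  "SHA-256, VLEN=256");
static_assert(vregs_for_constants(64, 32, 128) == 16, "SHA-256, VLEN=128");
```

Under this reading, SHA-512 at VLEN=128 would need 40 registers, more than the 32 available, which would be consistent with the preload remark being made for vlen 256 only.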
> src/hotspot/cpu/riscv/vm_version_riscv.cpp line 269: > >> 267: FLAG_SET_DEFAULT(UseChaCha20Intrinsics, true); >> 268: } >> 269: } if (UseChaCha20Intrinsics) { > > What's the purpose of this change? Accidental, fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1444237280 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1444236670 From rehn at openjdk.org Mon Jan 8 07:27:23 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 8 Jan 2024 07:27:23 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions In-Reply-To: <36AfaIsKlwYLkXYHg4QFA7c-aSP3Tvvy4amp9Ayg5PQ=.caf6e7bd-fdcb-40ef-a542-76258f646cb7@github.com> References: <4s2uV2ELJ2B28JUN1lIepQVVg9Gbg3DfY5g37sNyuuM=.222b5dd0-3029-41b4-815e-d5dd883e71c0@github.com> <36AfaIsKlwYLkXYHg4QFA7c-aSP3Tvvy4amp9Ayg5PQ=.caf6e7bd-fdcb-40ef-a542-76258f646cb7@github.com> Message-ID: On Tue, 19 Dec 2023 14:25:27 GMT, Vladimir Kempik wrote: >>> > > We already have "macroses" for load and stores in macroAssembler_riscv.hpp, what's the reason to do compression decision in assembler_riscv.hpp instead ( not saying it's wrong) ? >>> > > https://github.com/openjdk/jdk/blob/38d94725a1a85156e30b72b325886b0e25d4db03/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp#L880 >>> > >>> > >>> > No, you are correct I also think this is not optimal. I don't know the background, but it seems like this is the easiest way to add compressed transparently. But to fully utilize C instruction we should favor the x8->x15, we often don't get C due to e.g. BCP is in x22. I think to be able to better utilize C we can't have it so transparent. >>> > So here I just try to follow the current code, see how lw is changed to c_lw. >>> >>> Not exactly related to this PR, but I also saw a strange behaviour from MacroAssembler's lwu. it was generating lw + and ( a kind of lwu emulation) instead of lwu >>> >>> an example >>> >>> ``` >>> 0.44% ? 0x0000003fa46a86c8: slli t3,t3,0x20 >>> 0.48% ? 
0x0000003fa46a86ca: addi t3,t3,-1 >>> .... >>> 3.11% ? 0x0000003fa46a86dc: lw a0,0(t1) >>> 5.34% ? 0x0000003fa46a86e0: and a0,a0,t3 >>> ``` >>> >>> Using Assembler::lwu directly resulted in a correctly generated lwu >> >> Interesting. This does not seem to reflect on the code of `MacroAssembler's lwu`. I wonder how could that happen. > >> > > > We already have "macroses" for load and stores in macroAssembler_riscv.hpp, what's the reason to do compression decision in assembler_riscv.hpp instead ( not saying it's wrong) ? >> > > > https://github.com/openjdk/jdk/blob/38d94725a1a85156e30b72b325886b0e25d4db03/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp#L880 >> > > >> > > >> > > No, you are correct I also think this is not optimal. I don't know the background, but it seems like this is the easiest way to add compressed transparently. But to fully utilize C instruction we should favor the x8->x15, we often don't get C due to e.g. BCP is in x22. I think to be able to better utilize C we can't have it so transparent. >> > > So here I just try to follow the current code, see how lw is changed to c_lw. >> > >> > >> > Not exactly related to this PR, but I also saw a strange behaviour from MacroAssembler's lwu. it was generating lw + and ( a kind of lwu emulation) instead of lwu >> > an example >> > ``` >> > 0.44% ? 0x0000003fa46a86c8: slli t3,t3,0x20 >> > 0.48% ? 0x0000003fa46a86ca: addi t3,t3,-1 >> > .... >> > 3.11% ? 0x0000003fa46a86dc: lw a0,0(t1) >> > 5.34% ? 0x0000003fa46a86e0: and a0,a0,t3 >> > ``` >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > Using Assembler::lwu directly resulted in a correctly generated lwu >> >> Interesting. This does not seem to reflect on the code of `MacroAssembler's lwu`. I wonder how could that happen. 
> > If you take this PR https://github.com/openjdk/jdk/pull/17046/files#diff-7a5c3ed05b6f3f06ed1c59f5fc2a14ec566a6a5bd1d09606115767daa99115bdR3717 and change explicit Assembler::lwu() to lwu() then you are likely to see this issue Thank you @VladimirKempik @RealFYang ------------- PR Comment: https://git.openjdk.org/jdk/pull/17122#issuecomment-1880494200 From kbarrett at openjdk.org Mon Jan 8 07:33:51 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 8 Jan 2024 07:33:51 GMT Subject: RFR: 8322806: Eliminate -Wparentheses warnings in aarch64 code [v2] In-Reply-To: References: Message-ID: <58LotIwUWIbqes1AQR6buJvOr2hxgmYlGUE64jQX6Rk=.72f7f8d2-1b21-46f0-9cfa-e66f1f47f03d@github.com> > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: mach5 tier1 > > Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses > and other changes needed to make that work. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - simplify assert per review comments - Merge branch 'master' into aarch64-wparentheses - fix -Wparentheses warnings in aarch64 code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17210/files - new: https://git.openjdk.org/jdk/pull/17210/files/82e75c86..63ce2307 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17210&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17210&range=00-01 Stats: 5001 lines in 430 files changed: 2678 ins; 929 del; 1394 mod Patch: https://git.openjdk.org/jdk/pull/17210.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17210/head:pull/17210 PR: https://git.openjdk.org/jdk/pull/17210 From kbarrett at openjdk.org Mon Jan 8 07:33:51 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 8 Jan 2024 07:33:51 GMT Subject: RFR: 8322806: Eliminate -Wparentheses warnings in aarch64 code [v2] In-Reply-To: References: Message-ID: On Wed, 3 Jan 2024 01:39:11 GMT, David Holmes wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - simplify assert per review comments >> - Merge branch 'master' into aarch64-wparentheses >> - fix -Wparentheses warnings in aarch64 code > > LGTM. Thanks Thanks for reviews @dholmes-ora , @stefank , and comment from @shipilev . 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17210#issuecomment-1880497443 From kbarrett at openjdk.org Mon Jan 8 07:33:52 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 8 Jan 2024 07:33:52 GMT Subject: RFR: 8322806: Eliminate -Wparentheses warnings in aarch64 code [v2] In-Reply-To: <8vCqhYS1D3S--iJxU6ElX7UdLBNc2Xyy9Y-sLLwt2SY=.31ec51fe-6299-41f7-a4d2-cfcc512c1e2f@github.com> References: <8vCqhYS1D3S--iJxU6ElX7UdLBNc2Xyy9Y-sLLwt2SY=.31ec51fe-6299-41f7-a4d2-cfcc512c1e2f@github.com> Message-ID: On Wed, 3 Jan 2024 13:23:56 GMT, Stefan Karlsson wrote: >> src/hotspot/cpu/aarch64/frame_aarch64.inline.hpp line 198: >> >>> 196: && fp() == other.fp() >>> 197: && pc() == other.pc(); >>> 198: assert(!ret || (ret && cb() == other.cb() && _deopt_state == other._deopt_state), "inconsistent construction"); >> >> Well... Since `||` is short-cutting, then on the right side of `||`, we can be sure that `!ret` was `false` (shortcut not taken), which means `ret` was `true`, which means `ret &&` is redundant? > > I agree: > https://github.com/openjdk/jdk/pull/17210#pullrequestreview-1800130145 Agreed, and updated. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17210#discussion_r1444239594 From kbarrett at openjdk.org Mon Jan 8 07:33:53 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 8 Jan 2024 07:33:53 GMT Subject: Integrated: 8322806: Eliminate -Wparentheses warnings in aarch64 code In-Reply-To: References: Message-ID: On Tue, 2 Jan 2024 02:49:09 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: mach5 tier1 > > Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses > and other changes needed to make that work. This pull request has now been integrated. 
Changeset: d75d876e Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/d75d876eddfd2e59d9d28c2860fdab4ef3ec3c6b Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8322806: Eliminate -Wparentheses warnings in aarch64 code Reviewed-by: stefank, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/17210 From fyang at openjdk.org Mon Jan 8 07:37:26 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 8 Jan 2024 07:37:26 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v13] In-Reply-To: References: Message-ID: On Mon, 8 Jan 2024 07:26:45 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Reverted accidental change LGTM. Thanks for your patience. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16562#pullrequestreview-1808173586 From sspitsyn at openjdk.org Mon Jan 8 08:03:28 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 8 Jan 2024 08:03:28 GMT Subject: RFR: 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static Message-ID: The notification method `VirtualThread.notifyJvmtiDisableSuspend` should be static. The method disables/enables suspend of the current virtual thread, and is a no-op if the current thread is a platform thread. It is confusing for this to be an instance method; it should be static to make it clearer that it doesn't change the target thread. The notification method `VirtualThread.notifyJvmtiHideFrames` also has to be static as it does not use/need the virtual thread `this` argument. One detail to underline is that the intrinsic implementation needs to use the argument #0 instead of #1. 
Testing: - The mach5 tiers 1-6 show no regressions ------------- Commit messages: - 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static Changes: https://git.openjdk.org/jdk/pull/17298/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17298&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322744 Stats: 15 lines in 5 files changed: 0 ins; 0 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/17298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17298/head:pull/17298 PR: https://git.openjdk.org/jdk/pull/17298 From rehn at openjdk.org Mon Jan 8 08:15:27 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 8 Jan 2024 08:15:27 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v13] In-Reply-To: References: Message-ID: On Mon, 8 Jan 2024 07:34:47 GMT, Fei Yang wrote: > LGTM. Thanks for your patience. My fault screwing up the vlen 128 testing.... Thank you! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1880541082 From rehn at openjdk.org Mon Jan 8 08:15:29 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 8 Jan 2024 08:15:29 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v12] In-Reply-To: References: Message-ID: On Sun, 7 Jan 2024 17:29:24 GMT, Hamlin Li wrote: > Thanks for updating, looks good to me. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1880541551 From kbarrett at openjdk.org Mon Jan 8 09:04:30 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 8 Jan 2024 09:04:30 GMT Subject: RFR: 8323110: Eliminate -Wparentheses warnings in ppc code In-Reply-To: References: Message-ID: On Mon, 8 Jan 2024 04:30:50 GMT, David Holmes wrote: >> Please review this trivial change to eliminate a -Wparentheses warning. >> This involved simply adding parentheses to make the implicit operator >> precedence explicit. >> >> Testing: Locally (linux-x64) cross-compiled for linux-ppc64le. 
Also ran GHA with >> -Wparentheses enabled along with this and other changes needed to make that >> work. > > Looks fine and trivial. > > Thanks Thanks for the review @dholmes-ora . ------------- PR Comment: https://git.openjdk.org/jdk/pull/17293#issuecomment-1880603790 From kbarrett at openjdk.org Mon Jan 8 09:04:32 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 8 Jan 2024 09:04:32 GMT Subject: Integrated: 8323110: Eliminate -Wparentheses warnings in ppc code In-Reply-To: References: Message-ID: <9YkvLiRYcHlgpAPv8iEkHEYp37T0xqLrJNpeDgw24RE=.121818a6-d84e-4e4d-9caa-0e850c1d2fe3@github.com> On Mon, 8 Jan 2024 03:29:36 GMT, Kim Barrett wrote: > Please review this trivial change to eliminate a -Wparentheses warning. > This involved simply adding parentheses to make the implicit operator > precedence explicit. > > Testing: Locally (linux-x64) cross-compiled for linux-ppc64le. Also ran GHA with > -Wparentheses enabled along with this and other changes needed to make that > work. This pull request has now been integrated. Changeset: a40d397d Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/a40d397d5d785d29a2d5e848f872d11dab3bf80c Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8323110: Eliminate -Wparentheses warnings in ppc code Reviewed-by: dholmes ------------- PR: https://git.openjdk.org/jdk/pull/17293 From kbarrett at openjdk.org Mon Jan 8 09:34:31 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 8 Jan 2024 09:34:31 GMT Subject: RFR: 8322880: Eliminate -Wparentheses warnings in arm32 code Message-ID: Please review this change to eliminate some -Wparentheses warnings. In most cases, this involved simply adding a few parentheses to make some implicit operator precedence explicit. Exceptions are: In the clear_array instruct, removed extraneous parens in a declaration: `Label(loop);` => `Label loop;` In NativeMovConstReg::set_data, changed `&` => `&&`. This is conceptually a bug fix, but the old code "accidentally" worked. 
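A small illustration of why swapping `&` for `&&` can be "conceptually a bug fix" while the old code still produced the right answer. The arm32 code in question is C++ and is not reproduced here; the Java sketch below is only an analogy, and every name in it (`AndVersusAndAnd`, `check`, `eager`, `lazy`) is made up for the example. On boolean operands, `&` and `&&` yield the same truth value; they differ only in whether the right-hand operand is evaluated when the left one is false.

```java
public class AndVersusAndAnd {
    // Counts how many operands were evaluated; illustrative only.
    static int calls = 0;

    // Hypothetical side-effecting check, used only to count evaluations.
    static boolean check(boolean value) {
        calls++;
        return value;
    }

    // Non-short-circuit AND: both operands are always evaluated.
    static boolean eager(boolean a, boolean b) {
        return check(a) & check(b);
    }

    // Short-circuit AND: the right operand is skipped when the left is false.
    static boolean lazy(boolean a, boolean b) {
        return check(a) && check(b);
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean a : values) {
            for (boolean b : values) {
                calls = 0;
                boolean eagerResult = eager(a, b);
                int eagerCalls = calls;
                calls = 0;
                boolean lazyResult = lazy(a, b);
                int lazyCalls = calls;
                System.out.println(a + ", " + b
                        + ": & -> " + eagerResult + " (" + eagerCalls + " evaluations)"
                        + ", && -> " + lazyResult + " (" + lazyCalls + " evaluations)");
            }
        }
    }
}
```

The results agree for all four input combinations, which is the sense in which such code can work by accident; the difference only shows up when the right-hand operand has side effects or is unsafe to evaluate.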
Testing: Local (linux-x64) cross-build for linux-arm32. Also ran GHA with -Wparentheses enabled along with this and other changes needed to make that work. ------------- Commit messages: - Fix -Wparentheses warnings in arm32 code Changes: https://git.openjdk.org/jdk/pull/17300/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17300&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322880 Stats: 13 lines in 6 files changed: 0 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/17300.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17300/head:pull/17300 PR: https://git.openjdk.org/jdk/pull/17300 From shade at openjdk.org Mon Jan 8 09:47:36 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 8 Jan 2024 09:47:36 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v2] In-Reply-To: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> Message-ID: <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> > This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. > > Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. 
> > Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: > - AArch64: 128 -> 64 bytes :) > - x86_64: 64 -> 64 bytes :| > - x86_32: 32 -> 64 bytes :( > - PPC64: 512 -> 128 bytes :)) > - S390X: 128 -> 256 bytes :( > - ARM: 32 -> 64 bytes :( > - Zero: > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` > - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Inline new_ic_stub ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17277/files - new: https://git.openjdk.org/jdk/pull/17277/files/c5c6398d..b0875b50 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17277&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17277&range=00-01 Stats: 8 lines in 2 files changed: 0 ins; 7 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17277.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17277/head:pull/17277 PR: https://git.openjdk.org/jdk/pull/17277 From shade at openjdk.org Mon Jan 8 09:47:40 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 8 Jan 2024 09:47:40 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v2] In-Reply-To: References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> Message-ID: On Sat, 6 Jan 2024 01:00:58 GMT, Dean Long wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Inline new_ic_stub > > src/hotspot/share/code/icBuffer.cpp line 229: > >> 227: p2i(ic_stub), p2i(ic_stub->code_begin()), p2i(rev_stub)); >> 228: } >> 229: #endif > > I think this sanity check would fit better in `new_ic_stub`(). Problem is, `new_ic_stub` can return null on out of memory, so we would need to check that. 
But I think `new_ic_stub` does not carry its weight, so I just inlined it in new commit, which looks like a good middle ground? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17277#discussion_r1444357650 From shade at openjdk.org Mon Jan 8 09:55:21 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 8 Jan 2024 09:55:21 GMT Subject: RFR: 8322880: Eliminate -Wparentheses warnings in arm32 code In-Reply-To: References: Message-ID: On Mon, 8 Jan 2024 09:29:38 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. In most > cases, this involved simply adding a few parentheses to make some implicit > operator precedence explicit. Exceptions are: > > In the clear_array instruct, removed extraneous parens in a declaration: > `Label(loop);` => `Label loop;` > > In NativeMovConstReg::set_data, changed `&` => `&&`. This is conceptually a > bug fix, but the old code "accidentally" worked. > > Testing: Local (linux-x64) cross-build for linux-arm32. Also ran GHA with > -Wparentheses enabled along with this and other changes needed to make that > work. Looks reasonable. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17300#pullrequestreview-1808559977 From lkorinth at openjdk.org Mon Jan 8 13:04:24 2024 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 8 Jan 2024 13:04:24 GMT Subject: RFR: 8320750: Allow a testcase to run with muliple -Xlog In-Reply-To: References: Message-ID: <-COAVWQBC7cpv-FJfMf9DJIOL8eVP-ClAXGR1uZF2d4=.0bfef5c2-e150-49d3-9eab-b40eb0ec57c5@github.com> On Mon, 27 Nov 2023 13:32:52 GMT, Leo Korinth wrote: > Running a testcase with muliple -Xlog crashes JTREG test cases. This is because `Collector.toMap` is not given a merge strategy. > > When the same argument is passed multiple times, I have added a merge strategy to use the latter value. This is similar to how it is implemented for `vm.opt.*` in JTREG. 
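For readers unfamiliar with the `Collectors.toMap` behaviour described above, here is a minimal sketch. It is not the test-library code from the PR: the class `DuplicateFlagMerge`, the toy `split` parser and the flag handling are assumptions made for illustration. It shows that the two-argument `toMap` throws `IllegalStateException` on a duplicate key, while the three-argument form with a "keep the later value" merge function accepts repeated flags.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DuplicateFlagMerge {
    // Toy parser (hypothetical): turns "-Xms3g" into {"Xms", "3g"} and
    // "-Xlog:gc*" into {"Xlog", "gc*"}. It only knows these two flags.
    static String[] split(String flag) {
        if (flag.startsWith("-Xlog:")) {
            return new String[] {"Xlog", flag.substring("-Xlog:".length())};
        }
        if (flag.startsWith("-Xms")) {
            return new String[] {"Xms", flag.substring("-Xms".length())};
        }
        throw new IllegalArgumentException("unsupported flag in this sketch: " + flag);
    }

    // No merge function: a repeated key makes toMap throw IllegalStateException.
    static Map<String, String> withoutMerge(List<String> flags) {
        return flags.stream()
                .map(DuplicateFlagMerge::split)
                .collect(Collectors.toMap(kv -> kv[0], kv -> kv[1]));
    }

    // Merge function keeps the later value, so repeated flags are accepted.
    static Map<String, String> lastWins(List<String> flags) {
        return flags.stream()
                .map(DuplicateFlagMerge::split)
                .collect(Collectors.toMap(kv -> kv[0], kv -> kv[1], (earlier, later) -> later));
    }

    public static void main(String[] args) {
        System.out.println(lastWins(List.of("-Xms3g", "-Xms4g")));
        try {
            withoutMerge(List.of("-Xlog:gc*", "-Xlog:gc*"));
        } catch (IllegalStateException e) {
            System.out.println("duplicate key rejected: " + e.getMessage());
        }
    }
}
```

On a sequential stream the merge function receives the earlier value first and the later value second, so returning the second argument gives the "latter value wins" behaviour.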
> > If the flag tested is `-Xlog`, replace the value part with a dummy value "NONEMPTY_TEST_SENTINEL". This is because in the case of multiple `-Xlog` all values are used, and JTREG does not give a satisfactory way to represent them. This dummy value should make it hard to try to `@require` on specific values by mistake. > > Tested with: > > @requires vm.opt.x.Xlog == "NONEMPTY_TEST_SENTINEL" > @requires vm.opt.x.Xlog == "NONEMPTY_TEST_SENTINELXXX" > @requires vm.opt.x.Xms == "3g" > > and > > JAVA_OPTIONS=-Xms3g -Xms4g > JAVA_OPTIONS=-Xms4g -Xms3g > JAVA_OPTIONS=-Xlog:gc* -Xlog:gc* > ``` > > Running tier1 Hi again, I would like to resolve this issue in some way, as I am responsible for introducing this problem. I think the proposed fix is alright and gives us a way to `@require` test against non `-XX` flags. If you strongly feel that the feature to test against flags that are not supported by JTREG is unnecessary, I will remove the feature. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16824#issuecomment-1880964703 From stefank at openjdk.org Mon Jan 8 13:14:23 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 8 Jan 2024 13:14:23 GMT Subject: RFR: 8320750: Allow a testcase to run with muliple -Xlog In-Reply-To: References: Message-ID: On Mon, 27 Nov 2023 13:32:52 GMT, Leo Korinth wrote: > Running a testcase with muliple -Xlog crashes JTREG test cases. This is because `Collector.toMap` is not given a merge strategy. > > When the same argument is passed multiple times, I have added a merge strategy to use the latter value. This is similar to how it is implemented for `vm.opt.*` in JTREG. > > If the flag tested is `-Xlog`, replace the value part with a dummy value "NONEMPTY_TEST_SENTINEL". This is because in the case of multiple `-Xlog` all values are used, and JTREG does not give a satisfactory way to represent them. This dummy value should make it hard to try to `@require` on specific values by mistake. 
> > Tested with: > > @requires vm.opt.x.Xlog == "NONEMPTY_TEST_SENTINEL" > @requires vm.opt.x.Xlog == "NONEMPTY_TEST_SENTINELXXX" > @requires vm.opt.x.Xms == "3g" > > and > > JAVA_OPTIONS=-Xms3g -Xms4g > JAVA_OPTIONS=-Xms4g -Xms3g > JAVA_OPTIONS=-Xlog:gc* -Xlog:gc* > ``` > > Running tier1 Looks OK. It was a little bit awkward to read the code at first, but I think I understand what it does. There are probably ways to structure the code to make it easier to read. ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16824#pullrequestreview-1809073084 From gcao at openjdk.org Mon Jan 8 13:35:25 2024 From: gcao at openjdk.org (Gui Cao) Date: Mon, 8 Jan 2024 13:35:25 GMT Subject: RFR: 8322817: RISC-V: Eliminate -Wparentheses warnings in riscv code [v2] In-Reply-To: References: Message-ID: <3tjvmEvOecXVN2maLLipQvgJ7rquGvi1Arf3KpbWsxo=.e2f3b790-ebb8-4c81-8301-cbc423f76f62@github.com> On Mon, 8 Jan 2024 01:41:47 GMT, Kim Barrett wrote: >> Please review this change to eliminate some -Wparentheses warnings. This >> involved simply adding a few parentheses to make some implicit operator >> precedence explicit. >> >> Testing: Local (linux-x64) cross-build for linux-riscv with this change plus >> -Wparentheses enabled and other changes to allow that to work. >> >> Requesting someone from the riscv porters to properly test this. > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into riscv-wparentheses > - simplify frame::equal assert > - fix -Wparentheses warnings in riscv code Hi, the latest change still tests good with a fastdebug build.
------------- PR Comment: https://git.openjdk.org/jdk/pull/17216#issuecomment-1881014349 From jvernee at openjdk.org Mon Jan 8 13:45:31 2024 From: jvernee at openjdk.org (Jorn Vernee) Date: Mon, 8 Jan 2024 13:45:31 GMT Subject: RFR: 8320310: CompiledMethod::has_monitors flag can be incorrect [v4] In-Reply-To: References: <4efExybeWDkEbcsckI1Qdz8kpYFqd-Rbmt7oiWz5qlo=.d8d38d0e-affa-48dc-b963-45f958041c4e@github.com> Message-ID: On Mon, 11 Dec 2023 18:38:55 GMT, Jorn Vernee wrote: >> Currently, the `CompiledMethod::has_monitors` flag is set when either a `monitorenter` is parsed by C1, and `monitorexit` is parsed by C1 or C2 during method compilation. However, not necessarily every bytecode of a method is parsed, which means that we could miss all `monitorenter`/`monitorexit` byte codes in a method, while it actually does use monitors. This can lead to situations where a thread holds a monitor, but `has_monitors` for all frames is set to `false`, leading to an assertion failure in 'freeze_internal' in continuationFreezeThaw.cpp: >> >> assert(monitors_on_stack(current) == ((current->held_monitor_count() - current->jni_monitor_count()) > 0), >> "Held monitor count and locks on stack invariant: " INT64_FORMAT " JNI: " INT64_FORMAT, (int64_t)current->held_monitor_count(), (int64_t)current->jni_monitor_count()); >> >> The proposed fix is to rely on `Method::has_monitor_bytecodes` to set the `has_monitors` flag when compiling, which is immune to issues where not all byte codes of a method are parsed during compilation. We can follow the pattern established for `has_reserved_stack_access`, which is similar. >> >> Note that this PR is based on: https://github.com/openjdk/jdk/pull/16416 which disables the assertion. The goal of this PR is to fix the issue, and then re-enable the assertion. 
>> >> Testing: Tier 1-4, `java/lang/Thread/virtual/stress/PinALot.java` > > Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision: > > re-enable assert again Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16799#issuecomment-1881031735 From dcubed at openjdk.org Mon Jan 8 14:41:30 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Mon, 8 Jan 2024 14:41:30 GMT Subject: RFR: 8320750: Allow a testcase to run with muliple -Xlog In-Reply-To: <-COAVWQBC7cpv-FJfMf9DJIOL8eVP-ClAXGR1uZF2d4=.0bfef5c2-e150-49d3-9eab-b40eb0ec57c5@github.com> References: <-COAVWQBC7cpv-FJfMf9DJIOL8eVP-ClAXGR1uZF2d4=.0bfef5c2-e150-49d3-9eab-b40eb0ec57c5@github.com> Message-ID: On Mon, 8 Jan 2024 13:01:30 GMT, Leo Korinth wrote: >> Running a testcase with muliple -Xlog crashes JTREG test cases. This is because `Collector.toMap` is not given a merge strategy. >> >> When the same argument is passed multiple times, I have added a merge strategy to use the latter value. This is similar to how it is implemented for `vm.opt.*` in JTREG. >> >> If the flag tested is `-Xlog`, replace the value part with a dummy value "NONEMPTY_TEST_SENTINEL". This is because in the case of multiple `-Xlog` all values are used, and JTREG does not give a satisfactory way to represent them. This dummy value should make it hard to try to `@require` on specific values by mistake. >> >> Tested with: >> >> @requires vm.opt.x.Xlog == "NONEMPTY_TEST_SENTINEL" >> @requires vm.opt.x.Xlog == "NONEMPTY_TEST_SENTINELXXX" >> @requires vm.opt.x.Xms == "3g" >> >> and >> >> JAVA_OPTIONS=-Xms3g -Xms4g >> JAVA_OPTIONS=-Xms4g -Xms3g >> JAVA_OPTIONS=-Xlog:gc* -Xlog:gc* >> ``` >> >> Running tier1 > > Hi again, I would like to resolve this issue in some way, as I am responsible for introducing this problem. I think the proposed fix is alright and gives us a way to `@require` test against non `-XX` flags. 
If you strongly feel that the feature to test against flags that are not supported by JTREG is unnecessary, I will remove the feature. @lkorinth - I fixed the typo in the bug's synopsis. You'll need to adjust the PR's title. The easiest way is to use "/issue JDK-8320750". ------------- PR Comment: https://git.openjdk.org/jdk/pull/16824#issuecomment-1881135144 From jvernee at openjdk.org Mon Jan 8 14:58:33 2024 From: jvernee at openjdk.org (Jorn Vernee) Date: Mon, 8 Jan 2024 14:58:33 GMT Subject: Integrated: 8320310: CompiledMethod::has_monitors flag can be incorrect In-Reply-To: <4efExybeWDkEbcsckI1Qdz8kpYFqd-Rbmt7oiWz5qlo=.d8d38d0e-affa-48dc-b963-45f958041c4e@github.com> References: <4efExybeWDkEbcsckI1Qdz8kpYFqd-Rbmt7oiWz5qlo=.d8d38d0e-affa-48dc-b963-45f958041c4e@github.com> Message-ID: On Thu, 23 Nov 2023 15:55:07 GMT, Jorn Vernee wrote: > Currently, the `CompiledMethod::has_monitors` flag is set when either a `monitorenter` is parsed by C1, and `monitorexit` is parsed by C1 or C2 during method compilation. However, not necessarily every bytecode of a method is parsed, which means that we could miss all `monitorenter`/`monitorexit` byte codes in a method, while it actually does use monitors. This can lead to situations where a thread holds a monitor, but `has_monitors` for all frames is set to `false`, leading to an assertion failure in 'freeze_internal' in continuationFreezeThaw.cpp: > > assert(monitors_on_stack(current) == ((current->held_monitor_count() - current->jni_monitor_count()) > 0), > "Held monitor count and locks on stack invariant: " INT64_FORMAT " JNI: " INT64_FORMAT, (int64_t)current->held_monitor_count(), (int64_t)current->jni_monitor_count()); > > The proposed fix is to rely on `Method::has_monitor_bytecodes` to set the `has_monitors` flag when compiling, which is immune to issues where not all byte codes of a method are parsed during compilation. We can follow the pattern established for `has_reserved_stack_access`, which is similar. 
> > Note that this PR is based on: https://github.com/openjdk/jdk/pull/16416 which disables the assertion. The goal of this PR is to fix the issue, and then re-enable the assertion. > > Testing: Tier 1-4, `java/lang/Thread/virtual/stress/PinALot.java` This pull request has now been integrated. Changeset: c8fa3e21 Author: Jorn Vernee URL: https://git.openjdk.org/jdk/commit/c8fa3e21e6a4fd7846932b545a1748cc1dc6d9f1 Stats: 48 lines in 5 files changed: 9 ins; 17 del; 22 mod 8320310: CompiledMethod::has_monitors flag can be incorrect Reviewed-by: vlivanov, thartmann ------------- PR: https://git.openjdk.org/jdk/pull/16799 From ayang at openjdk.org Mon Jan 8 16:44:23 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 8 Jan 2024 16:44:23 GMT Subject: RFR: 8322383: G1: Only preserve marks on objects that are actually moved In-Reply-To: References: Message-ID: <9L5S-mj1VlGMSrAvwoI2H8pLiFIfPwVEXZRzOm538sQ=.3a015f57-b7d1-4c6e-b1d9-df7c9f83033a@github.com> On Tue, 19 Dec 2023 16:20:07 GMT, Roman Kennke wrote: > The G1 full-GC preserves marks during marking, for all live objects in compaction region. However, not all live objects do actually move. In particular, the start of a compaction chain may have a sediment of all-live objects which would not move, and thus don't need to have their marks preserved. > The problem can easily be solved by preserving marks during forwarding. That also seems a more natural place to do that. > > Testing: > - [x] hotspot_gc > - [x] tier1 > - [ ] tier2 src/hotspot/share/gc/g1/g1FullGCCompactionPoint.cpp line 106: > 104: // Store a forwarding pointer if the object should be moved. > 105: if (cast_from_oop(object) != _compaction_top) { > 106: preserved_stack()->push_if_necessary(object, object->mark()); Can this be made conditionally on whether the markword is NOT marked/forwarded? (IOW, move the added predicate in `markWord` here.) The rationale is to minimize changes to the shared code for G1 specific usages. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17159#discussion_r1444987070 From shade at openjdk.org Mon Jan 8 19:32:47 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 8 Jan 2024 19:32:47 GMT Subject: RFR: 8316180: Thread-local backoff for secondary_super_cache updates [v14] In-Reply-To: References: Message-ID: > See more details in the bug and related issues. > > This is the attempt to mitigate [JDK-8180450](https://bugs.openjdk.org/browse/JDK-8180450), while the more complex fix that would obviate the need for `secondary_super_cache` is being worked out. The goal for this fix is to improve performance in pathological cases, while keeping non-pathological cases out of extra risk, *and* staying simple enough and reliable for backports to currently supported JDK releases. > > This implements mitigation on most current architectures: > - x86_64: implemented > - x86_32: considered, abandoned; cannot be easily done without blowing up code size > - AArch64: implemented > - ARM32: considered, abandoned; needs cleanups and testing; see [JDK-8318414](https://bugs.openjdk.org/browse/JDK-8318414) > - PPC64: implemented, thanks @TheRealMDoerr > - S390: implemented, thanks @offamitkumar > - RISC-V: implemented, thanks @RealFYang > - Zero: does not need implementation > > Note that the code is supposed to be rather compact, because it is inlined in generated code. That is why, for example, we cannot easily do x86_32 version: we need a thread, so the easiest way would be to call into VM. But we cannot do that easily: the code blowout would make some forward branches in external code non-short. I think if we cannot implement this mitigation on some architectures, so be it, it would be a sensible tradeoff for simplicity. > > Setting backoff at `0` effectively disables the mitigation, and gives us a safety hatch if something goes wrong.
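To make the idea of a thread-local backoff concrete, here is a rough Java sketch. It is not the HotSpot implementation (which is emitted as inlined machine code), and the exact counting scheme is an assumption: the names `BackoffSketch`, `onMiss` and `countUpdates` are hypothetical. Each thread keeps a private counter; after it writes the shared cache, it skips the next `backoff` misses before writing again. With a backoff of `0` every miss updates the cache, so the mitigation is effectively off.

```java
public class BackoffSketch {
    private final int backoff;                         // 0 disables the backoff entirely
    private final ThreadLocal<int[]> remaining =
            ThreadLocal.withInitial(() -> new int[1]); // per-thread count of misses still to skip
    private volatile Object cache;                     // shared one-element cache, written by all threads

    BackoffSketch(int backoff) {
        this.backoff = backoff;
    }

    // Called on a cache miss. Returns true if this miss was allowed to update the cache.
    boolean onMiss(Object value) {
        int[] left = remaining.get();
        if (left[0] > 0) {
            left[0]--;           // still backing off: skip the shared write
            return false;
        }
        cache = value;           // the potentially contended shared write
        left[0] = backoff;       // skip the next `backoff` misses on this thread
        return true;
    }

    // Runs `misses` consecutive misses on the calling thread and counts cache updates.
    static int countUpdates(int backoff, int misses) {
        BackoffSketch sketch = new BackoffSketch(backoff);
        int updates = 0;
        for (int i = 0; i < misses; i++) {
            if (sketch.onMiss("value-" + i)) {
                updates++;
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        System.out.println("backoff=0: " + countUpdates(0, 8) + " updates out of 8 misses");
        System.out.println("backoff=3: " + countUpdates(3, 8) + " updates out of 8 misses");
    }
}
```

Because the counter is private to each thread, the backoff itself adds no shared writes; only the occasional cache update touches shared state.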
> > I believe we can go in with `1000` as the default, given the experimental results mentioned in this PR. > > Additional testing: > - [x] Linux x86_64 fastdebug, `tier1 tier2 tier3` > - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 31 commits: - Merge branch 'master' into JDK-8316180-backoff-secondary-super - AArch64: Trying to backoff only for actual back-to-back updates - Merge branch 'master' into JDK-8316180-backoff-secondary-super - Improve benchmarks - Merge branch 'master' into JDK-8316180-backoff-secondary-super - Editorial cleanups - RISC-V implementation - Mention ARM32 bug - Make sure benchmark runs with C1 - Merge branch 'master' into JDK-8316180-backoff-secondary-super - ... and 21 more: https://git.openjdk.org/jdk/compare/387828a3...d4254b67 ------------- Changes: https://git.openjdk.org/jdk/pull/15718/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15718&range=13 Stats: 411 lines in 18 files changed: 400 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/15718.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15718/head:pull/15718 PR: https://git.openjdk.org/jdk/pull/15718 From dlong at openjdk.org Mon Jan 8 20:00:23 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 8 Jan 2024 20:00:23 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v2] In-Reply-To: <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> Message-ID: On Mon, 8 Jan 2024 09:47:36 GMT, Aleksey Shipilev wrote: >> This continues from #16911. 
It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. >> >> Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. >> >> Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: >> - AArch64: 128 -> 64 bytes :) >> - x86_64: 64 -> 64 bytes :| >> - x86_32: 32 -> 64 bytes :( >> - PPC64: 512 -> 128 bytes :)) >> - S390X: 128 -> 256 bytes :( >> - ARM: 32 -> 64 bytes :( >> - Zero: >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Inline new_ic_stub Marked as reviewed by dlong (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17277#pullrequestreview-1809924454 From dlong at openjdk.org Mon Jan 8 20:00:25 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 8 Jan 2024 20:00:25 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v2] In-Reply-To: References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> Message-ID: On Mon, 8 Jan 2024 09:43:56 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/code/icBuffer.cpp line 229: >> >>> 227: p2i(ic_stub), p2i(ic_stub->code_begin()), p2i(rev_stub)); >>> 228: } >>> 229: #endif >> >> I think this sanity check would fit better in `new_ic_stub`(). 
> > Problem is, `new_ic_stub` can return null on out of memory, so we would need to check that. But I think `new_ic_stub` does not carry its weight, so I just inlined it in new commit, which looks like a good middle ground? Sure, that's fine. I still feel like from_destination_address sanity checks should not be in the caller, but should be done at ICStub creation time or when code_begin() is called, but it's not a big deal I guess if ICStub can go away soon. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17277#discussion_r1445262227 From rkennke at openjdk.org Mon Jan 8 20:42:34 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 8 Jan 2024 20:42:34 GMT Subject: RFR: 8322383: G1: Only preserve marks on objects that are actually moved [v2] In-Reply-To: References: Message-ID: > The G1 full-GC preserves marks during marking, for all live objects in compaction region. However, not all live objects do actually move. In particular, the start of a compaction chain may have a sediment of all-live objects which would not move, and thus don't need to have their marks preserved. > The problem can easily be solved by preserving marks during forwarding. That also seems a more natural place to do that. 
> > Testing: > - [x] hotspot_gc > - [x] tier1 > - [ ] tier2 Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Move assert to GC-specific code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17159/files - new: https://git.openjdk.org/jdk/pull/17159/files/51e89e9c..32625570 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17159&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17159&range=00-01 Stats: 2 lines in 2 files changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17159.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17159/head:pull/17159 PR: https://git.openjdk.org/jdk/pull/17159 From sgibbons at openjdk.org Mon Jan 8 20:48:39 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 8 Jan 2024 20:48:39 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v6] In-Reply-To: References: Message-ID: > Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. 
The benchmark numbers:
>
>
> Benchmark                                 Score        Latest       Latest/Score
> StringIndexOf.advancedWithMediumSub       343.573      317.934      0.925375393x
> StringIndexOf.advancedWithShortSub1       1039.081     1053.96      1.014319384x
> StringIndexOf.advancedWithShortSub2       55.828       110.541      1.980027943x
> StringIndexOf.constantPattern             9.361        11.906       1.271872663x
> StringIndexOf.searchCharLongSuccess       4.216        4.218        1.000474383x
> StringIndexOf.searchCharMediumSuccess     3.133        3.216        1.02649218x
> StringIndexOf.searchCharShortSuccess      3.76         3.761        1.000265957x
> StringIndexOf.success                     9.186        9.713        1.057369911x
> StringIndexOf.successBig                  14.341       46.343       3.231504079x
> StringIndexOfChar.latin1_AVX2_String      6220.918     12154.52     1.953814533x
> StringIndexOfChar.latin1_AVX2_char        5503.556     5540.044     1.006629895x
> StringIndexOfChar.latin1_SSE4_String      6978.854     6818.689     0.977049957x
> StringIndexOfChar.latin1_SSE4_char        5657.499     5474.624     0.967675646x
> StringIndexOfChar.latin1_Short_String     7132.541     6863.359     0.962260014x
> StringIndexOfChar.latin1_Short_char       16013.389    16162.437    1.009307711x
> StringIndexOfChar.latin1_mixed_String     7386.123     14771.622    1.999915517x
> StringIndexOfChar.latin1_mixed_char       9901.671     9782.245     0.987938803x

Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: - Merge branch 'openjdk:master' into indexof - Addressing review comments. - Fix for JDK-8321599 - Support UU IndexOf - Only use optimization when EnableX86ECoreOpts is true - Fix whitespace - Merge branch 'openjdk:master' into indexof - Comments; added exhaustive-ish test - Subtracting 0x10 twice. - Stomped on r13 in switch branch calculation - ...
and 11 more: https://git.openjdk.org/jdk/compare/8a4dc79e...600377b0 ------------- Changes: https://git.openjdk.org/jdk/pull/16753/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=05 Stats: 3060 lines in 14 files changed: 2918 ins; 7 del; 135 mod Patch: https://git.openjdk.org/jdk/pull/16753.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16753/head:pull/16753 PR: https://git.openjdk.org/jdk/pull/16753 From duke at openjdk.org Mon Jan 8 21:27:42 2024 From: duke at openjdk.org (duke) Date: Mon, 8 Jan 2024 21:27:42 GMT Subject: Withdrawn: 8319117: GrowableArray: Allow for custom initializer instead of copy constructor In-Reply-To: <7640OMFYd1jbL0RFjUqQvWPekCmULEv5fQS4zHS099k=.be32fc60-8b78-4e0a-bfc3-2de75b6769f1@github.com> References: <7640OMFYd1jbL0RFjUqQvWPekCmULEv5fQS4zHS099k=.be32fc60-8b78-4e0a-bfc3-2de75b6769f1@github.com> Message-ID: On Sun, 29 Oct 2023 14:00:25 GMT, Johan Sjölen wrote: > Hi, > > When using at_put and at_put_grow you can provide a value which will be supplied to the constructor of each element. In other words, you can initialize each element through a copy constructor. > > I suggest that we also provide a function equivalent where the function is provided a pointer to the memory to be initialized. This can be used for `NONCOPYABLE` classes, for example. > > This is implemented using a SFINAE pattern because `nullptr` introduces ambiguity if you use static overload. > > Currently running tier1-tier4. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/16409 From ysr at openjdk.org Tue Jan 9 02:26:37 2024 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 9 Jan 2024 02:26:37 GMT Subject: RFR: 8309136: [JVMCI] add -XX:+UseGraalJIT flag [v3] In-Reply-To: References: Message-ID: <8wscUSj6QlV5NJ8AGUS5d3uoDu8yOnyUJixTfaWBymQ=.46df0c47-8a04-4d45-babf-46e4ce8b1236@github.com> On Thu, 1 Jun 2023 08:15:44 GMT, Doug Simon wrote: > Thanks for input David.
I agree that it's best to open a new JBS issue to discuss concerns about lazy JVMCI compiler initialization. @dougxc : was the "new JBS issue" opened? If so, can it be linked here and in https://bugs.openjdk.org/browse/JDK-8309136. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14231#issuecomment-1882208593 From dnsimon at openjdk.org Tue Jan 9 04:14:33 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 9 Jan 2024 04:14:33 GMT Subject: RFR: 8309136: [JVMCI] add -XX:+UseGraalJIT flag [v3] In-Reply-To: <8wscUSj6QlV5NJ8AGUS5d3uoDu8yOnyUJixTfaWBymQ=.46df0c47-8a04-4d45-babf-46e4ce8b1236@github.com> References: <8wscUSj6QlV5NJ8AGUS5d3uoDu8yOnyUJixTfaWBymQ=.46df0c47-8a04-4d45-babf-46e4ce8b1236@github.com> Message-ID: On Tue, 9 Jan 2024 02:23:36 GMT, Y. Srinivas Ramakrishna wrote: >> Thanks for input David. I agree that it's best to open a new JBS issue to discuss concerns about lazy JVMCI compiler initialization. > >> Thanks for input David. I agree that it's best to open a new JBS issue to discuss concerns about lazy JVMCI compiler initialization. > > @dougxc : was the "new JBS issue" opened? If so, can it be linked here and in https://bugs.openjdk.org/browse/JDK-8309136. @ysramakrishna I'm not aware of an issue being opened. I'd prefer to avoid opening an issue until we better understand where the concern results in a real problem in practice. If you have something along these lines, feel free to open the issue. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14231#issuecomment-1882384637 From duke at openjdk.org Tue Jan 9 06:12:50 2024 From: duke at openjdk.org (xtf2009) Date: Tue, 9 Jan 2024 06:12:50 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v11] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Wed, 5 Jul 2023 17:28:15 GMT, Thomas Stuefe wrote: >> (*Updated 2023-07-05 to reflect the current state of the patch*) >> >> This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless of whether from JVM- or user code), this could recover significant amounts of memory. >> >> We discussed this a year ago [1], but the item got pushed to the bottom of my work pile; therefore, it took longer than I thought. >> >> ### Motivation >> >> The Glibc is reluctant to return memory to the OS, more so than other allocators. Temporary malloc spikes often carry over as a permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching, and a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap (the typical native application). The JVM, however, clusters allocations and for a lot of use cases rolls its own memory management via mmap. An app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. >> >> To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. >> >> #### Is this even a problem? >> >> Yes. >> >> The JVM clusters most native memory allocations and satisfies them with mmap.
But there are enough C-heap allocations left to cause malloc spikes that are subject to memory retention. Note that one example is the hotspot arenas themselves. >> >> But many of the cases of high memory retention in Glibc that I have seen were in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. >> >> ### How trimming works >> >> Trimming is done via `malloc_trim(3)`. `malloc_trim` will iterate over all arenas and trim each one in turn. While doing that, it will lock the arena, which may cause some (but not all) subsequent actions on the same arenas to block. glibc also trims automatically on free, but that is very limited (see https://github.com/openjdk/jdk/pull/10085#issuecomment-1619638641 for details). >> >> `malloc_trim` offers almost no way to control its behavior; in particular, no way to limit its runtime. Its run... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 41 commits: > > - fix windows build > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - Remove adaptive stepdown coding > - Merge master > - wip > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - ... and 31 more: https://git.openjdk.org/jdk/compare/22e17c29...162b880a Any chance this feature can be backported to jdk11 and 17? ------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1882344574 From rehn at openjdk.org Tue Jan 9 07:29:35 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 9 Jan 2024 07:29:35 GMT Subject: Integrated: 8319716: RISC-V: Add SHA-2 In-Reply-To: References: Message-ID: On Wed, 8 Nov 2023 14:47:07 GMT, Robbin Ehn wrote: > Hi, please consider.
> > Main author is @luhenry, I only fixed some minor things and tested it. > > Such as: > test/hotspot/jtreg/compiler/intrinsics/sha/ > test/jdk/java/security/MessageDigest/ > test/jdk/jdk/security/ > tier1 > > And still running some tests. This pull request has now been integrated. Changeset: 4cf131a1 Author: Ludovic Henry Committer: Robbin Ehn URL: https://git.openjdk.org/jdk/commit/4cf131a101d13699b1bf017895798c9bda87f551 Stats: 532 lines in 5 files changed: 513 ins; 16 del; 3 mod 8319716: RISC-V: Add SHA-2 Co-authored-by: Robbin Ehn Reviewed-by: fyang, mli, luhenry ------------- PR: https://git.openjdk.org/jdk/pull/16562 From stuefe at openjdk.org Tue Jan 9 07:30:43 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 9 Jan 2024 07:30:43 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v11] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Tue, 9 Jan 2024 03:33:39 GMT, xtf2009 wrote: > any chance this feature can backport to jdk11 and 17? This feature (different PR, see https://github.com/openjdk/jdk/pull/14781) has been backported to jdk 17 already. For 11, it's possible, but not a priority and needs to be negotiated with the maintainers. ------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1882537086 From rehn at openjdk.org Tue Jan 9 07:37:27 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 9 Jan 2024 07:37:27 GMT Subject: Integrated: 8320069: RISC-V: Add Zcb instructions In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 13:50:14 GMT, Robbin Ehn wrote: > Hi, these are the instructions for Zcb. > > Due to our lack of infrastructure, having multiple extension-dependent instructions does not fit well. > Some of these compressed instructions are also missing a 1-to-1 mapping, e.g. now we have a compressed not, but the corresponding uncompressed instruction is still xor. > I think we need to do some rework here.
> > I also don't like the macro expansion as it is hopeless in debuggers and 'IDE's (vim+rtags for me). > (macro stuff was originally done when templates were blacklisted in hotspot) > > And I don't want an option for this, as zcb is coming in hwprobe; if you have compressed on, you get them if they are supported (may depend on e.g. zbb). > > I have done some modifications since it passed tier1, so I'm running stuff over the weekend. This pull request has now been integrated. Changeset: 30f93a29 Author: Robbin Ehn URL: https://git.openjdk.org/jdk/commit/30f93a29c2f677d0279176b89edf2ecdc06b42ca Stats: 316 lines in 5 files changed: 275 ins; 0 del; 41 mod 8320069: RISC-V: Add Zcb instructions Reviewed-by: fyang, vkempik ------------- PR: https://git.openjdk.org/jdk/pull/17122 From dholmes at openjdk.org Tue Jan 9 08:11:25 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 9 Jan 2024 08:11:25 GMT Subject: RFR: 8320750: Allow a testcase to run with muliple -Xlog In-Reply-To: References: Message-ID: On Mon, 27 Nov 2023 13:32:52 GMT, Leo Korinth wrote: > Running a testcase with multiple -Xlog crashes JTREG test cases. This is because `Collectors.toMap` is not given a merge strategy. > > When the same argument is passed multiple times, I have added a merge strategy to use the latter value. This is similar to how it is implemented for `vm.opt.*` in JTREG. > > If the flag tested is `-Xlog`, replace the value part with a dummy value "NONEMPTY_TEST_SENTINEL". This is because in the case of multiple `-Xlog` all values are used, and JTREG does not give a satisfactory way to represent them. This dummy value should make it hard to try to `@require` on specific values by mistake.
> > Tested with: > > @requires vm.opt.x.Xlog == "NONEMPTY_TEST_SENTINEL" > @requires vm.opt.x.Xlog == "NONEMPTY_TEST_SENTINELXXX" > @requires vm.opt.x.Xms == "3g" > > and > > JAVA_OPTIONS=-Xms3g -Xms4g > JAVA_OPTIONS=-Xms4g -Xms3g > JAVA_OPTIONS=-Xlog:gc* -Xlog:gc* > ``` > > Running tier1 I'm okay with fixing the bug that was introduced, just so we don't have this crash potential, though I dislike the special handling of `-Xlog` in the code. But overall I don't think this `vm.opt.x.flag` is really necessary as per earlier comments. As I stated earlier I won't actually hit the Approve button because I don't understand the Java code involved. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16824#issuecomment-1882583600 From dholmes at openjdk.org Tue Jan 9 08:23:25 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 9 Jan 2024 08:23:25 GMT Subject: RFR: 8322880: Eliminate -Wparentheses warnings in arm32 code In-Reply-To: References: Message-ID: On Mon, 8 Jan 2024 09:29:38 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. In most > cases, this involved simply adding a few parentheses to make some implicit > operator precedence explicit. Exceptions are: > > In the clear_array instruct, removed extraneous parens in a declaration: > `Label(loop);` => `Label loop;` > > In NativeMovConstReg::set_data, changed `&` => `&&`. This is conceptually a > bug fix, but the old code "accidentally" worked. > > Testing: Local (linux-x64) cross-build for linux-arm32. Also ran GHA with > -Wparentheses enabled along with this and other changes needed to make that > work. Looks good. Thanks ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17300#pullrequestreview-1810617098 From qpzhang at openjdk.org Tue Jan 9 10:29:31 2024 From: qpzhang at openjdk.org (Patrick Zhang) Date: Tue, 9 Jan 2024 10:29:31 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v17] In-Reply-To: References: <-yGcrNxBa91rrdyLb4zNbgz_VRuht7MXBpnel_-WWxg=.6eec01fb-03e7-42d4-b07c-d5617f34bdc2@github.com> Message-ID: On Thu, 28 Dec 2023 20:17:26 GMT, Kim Barrett wrote: >> test/hotspot/gtest/runtime/test_os_linux.cpp line 377: >> >>> 375: EXPECT_TRUE(os::release_memory(heap, 1 * G)); >>> 376: UseTransparentHugePages = useThp; >>> 377: } >> >> This seems like it's concurrently running `madvise(..., MADV_POPULATE_WRITE)`, correct? This is not what I meant. >> >> What I meant was having at least 2 threads, where one thread is running `os::pretouch_memory` and another using the memory for something. For example, 1 thread pretouching, the other thread filling the memory with an incrementing integer array `[0,1,2,3,4,...]`. I think this is what Kim meant also, or am I the one misunderstanding him? > > [Sorry, I lost track of this and didn't respond to the earlier comment from > @jdksjolen.] > > Yes, that's correct. The reason for adding the safe-for-concurrent-use > pretouch mechanism was https://bugs.openjdk.org/browse/JDK-8260332. > > The idea is that presently, when a thread needs to expand the oldgen, it > pretouches while holding the expansion lock. Any other threads that also > need the oldgen to be expanded have to wait until the holder of that lock > completes. Most of the work involved in expansion is quick and short, but not > so much for pretouching. So it was found that we're sometimes blocking a > bunch of threads for a long-ish time. > > The original proposal there was to allow the otherwise waiting threads to > cooperate in the pretouch. But the protocol involved was complicated and > messy.
A simpler approach was suggested; allow other threads to use the newly > expanded memory concurrently with the expanding thread doing the pretouch. > There's obviously some racing there, with the using threads possibly touching > pages before the pretouching reaches them, but the thinking is that the > pretouched wave-front will likely surge ahead of the using threads. And if > not, then the using threads are effectively cooperating in the "pretouch". > > That approach needed https://bugs.openjdk.org/browse/JDK-8272807 as a building > block. > > But I discovered there were a bunch of places with similar problems, > suggesting the need for some more general mechanism. I did a bit of > prototyping in that direction, but got distracted by other work and haven't > gotten back to it. (The idea is to record needed pretouching, deferring it up > the call chain, to a point where other threads are not being blocked waiting > for the expansion operation. A complicating factor is that some of those > places may have multiple distinct memory ranges being allocated and needing > pretouch, all within the same expansion operation.) > > But that approach may interact poorly with the madvise approach. It might be > that the madvise _should_ be done down inside the expansion operation where > the pretouches currently happen, rather than being deferred up the call chain > and permitting the madvise to be concurrent with using threads that might > introduce the same "shredding" problem the madvise is attempting to fix. That > would be yet another complicating factor that my prototyping didn't address at > all. @limingliu-ampere 's original test was with JVM flags like: `-Xmx24g -Xms24g -Xmn22g -XX:+UseParallelGC -XX:+AlwaysPreTouch -XX:+TransparentHugePages -XX:-UseAdaptiveSizePolicy` etc. 
Having `-XX:-UseAdaptiveSizePolicy` ensures that `heap->resize_old_gen(size_policy->calculated_old_free_size_in_bytes());` inside `PSParallelCompact::invoke_no_policy` is not called in this test (see [psParallelCompact.cpp#L1855](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/parallel/psParallelCompact.cpp#L1855)), so the test does **not** run into the allocate/expand/pretouch cooperation concern on the oldgen mentioned above by @kimbarrett. With regard to `that approach may interact poorly with the madvise approach`: all pretouching triggered by `PSOldGen::expand(size_t bytes)` is currently wrapped by `MutexLocker x(PSOldGenExpand_lock)`. From this viewpoint, the _madvise_ approach does the pretouching work in the same situation as the original _atomic-add-0_ approach, so the proposed patch does not make the potential "shredding" problem on expansion of the oldgen worse. Furthermore, going back to the table @limingliu-ampere attached at the start of this PR, on Kernel 6.1 with `-XX:+TransparentHugePages` the _madvise_ approach speeds up the pretouching operation from _atomic-add-0_'s **3.54s** to **0.33s**, which is a clear improvement. The initial purpose of this patch was to solve an outstanding performance issue on some commercial benchmarks, especially when running with huge heaps, for example, `-Xms200g` or `-Xms400g`, together with `-XX:+UseParallelGC -XX:+AlwaysPreTouch -XX:+TransparentHugePages -XX:-UseAdaptiveSizePolicy`. The performance regression was resolved by the patch, and the improvement vs baseline was up to 30%. All in all, I think this 4-month-old PR is a positive change that solves a practical performance problem effectively. @limingliu-ampere will soon have an update to answer @jdksjolen's question: `2 threads, where one thread is running os::pretouch_memory and another using the memory for something`.
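For readers who want to see the two pretouch strategies side by side, here is a minimal Linux-only sketch. It is not HotSpot's `os::pretouch_memory`; `pretouch_sketch` and `demo_pretouch_preserves_contents` are hypothetical names, the fallback uses the GCC/Clang `__atomic_fetch_add` builtin, and the numeric fallback definition of `MADV_POPULATE_WRITE` is only there for older libc headers.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// MADV_POPULATE_WRITE exists since Linux 5.14; older libc headers may not define it.
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

// Hypothetical helper for illustration only (not the HotSpot implementation).
// Returns true if the kernel populated the range via madvise, false if the
// per-page fallback ran instead. `start` must be page-aligned.
inline bool pretouch_sketch(void* start, size_t bytes) {
  // madvise approach: ask the kernel to fault in the whole range as writable.
  if (madvise(start, bytes, MADV_POPULATE_WRITE) == 0) {
    return true;
  }
  // atomic-add-0 approach: touch one word per page. Adding 0 atomically does
  // not change the contents, so it is safe while other threads use the memory.
  const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  char* base = static_cast<char*>(start);
  for (size_t off = 0; off < bytes; off += page) {
    __atomic_fetch_add(reinterpret_cast<int*>(base + off), 0, __ATOMIC_SEQ_CST);
  }
  return false;
}

// Small self-check: pretouching must leave existing data and zero pages intact.
inline bool demo_pretouch_preserves_contents() {
  const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  const size_t bytes = 8 * page;
  void* mem = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (mem == MAP_FAILED) {
    return false;
  }
  int* words = static_cast<int*>(mem);
  words[0] = 42;                       // data written before the pretouch
  pretouch_sketch(mem, bytes);
  const bool ok = (words[0] == 42) && (words[page / sizeof(int)] == 0);
  munmap(mem, bytes);
  return ok;
}
```

The sketch only shows that both paths populate pages without altering contents; it says nothing about the huge-page fragmentation behaviour or the timings discussed in this thread.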
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1445905192 From adinn at openjdk.org Tue Jan 9 12:11:56 2024 From: adinn at openjdk.org (Andrew Dinn) Date: Tue, 9 Jan 2024 12:11:56 GMT Subject: RFR: 8320709: AArch64: Vectorized Poly1305 intrinsics [v5] In-Reply-To: <4ro4TX0p_iovP5lOIRrutPRy7OjpBPqk1j7SgqP-WZg=.e23b4bd3-a033-45fb-bd5b-507f275255b7@github.com> References: <4ro4TX0p_iovP5lOIRrutPRy7OjpBPqk1j7SgqP-WZg=.e23b4bd3-a033-45fb-bd5b-507f275255b7@github.com> Message-ID: On Mon, 4 Dec 2023 17:33:00 GMT, Andrew Haley wrote: >> Vectorizing Poly1305 is quite tricky. We already have a highly- >> efficient scalar Poly1305 implementation that runs on the core integer >> unit, but it's highly serialized, so it does not make good use of >> the parallelism available. >> >> The scalar implementation takes advantage of some particular features >> of the Poly1305 keys. In particular, certain bits of r, the secret >> key, are required to be 0. These make it possible to use a full >> 64-bit-wide multiply-accumulate operation without needing to process >> carries between partial products. >> >> While this works well for a serial implementation, a parallel >> implementation cannot do this because rather than multiplying by r, >> each step multiplies by some integer power of r, modulo >> 2^130-5. >> >> In order to avoid processing carries between partial products we use a >> redundant representation, in which each 130-bit integer is encoded >> either as a 5-digit integer in base 2^26 or as a 3-digit integer in >> base 2^52, depending on whether we are using a 32- or 64-bit >> multiply-accumulate. >> >> In AArch64 Advanced SIMD, there is no 64-bit multiply-accumulate >> operation available to us, so we must use 32*32 -> 64-bit operations. >> >> In order to achieve maximum performance we'd like to get close to the >> processor's decode bandwidth, so that every clock cycle does something >> useful.
In a typical high-end AArch64 implementation, the core integer >> unit has a fast 64-bit multiplier pipeline and the ASIMD unit has a >> fast(ish) two-way 32-bit multiplier, which may be slower than the >> core integer unit's. It is not at all obvious whether it's best to use >> ASIMD or core instructions. >> >> Fortunately, if we have a wide-bandwidth instruction decode, we can do >> both at the same time, by feeding alternating instructions to the core >> and the ASIMD units. This also allows us to make good use of all of >> the available core and ASIMD registers, in parallel. >> >> To do this we use generators, which here are a kind of iterator that >> emits a group of instructions each time it is called. In this case we >> use 4 parallel generators, and by calling them alternately we interleave >> the ASIMD and the core instructions. We also take care to ensure that >> each generator finishes at about the same time, to maximize the >> distance between instructions which generate and consume data. >> >> The results are pretty good, ranging from 2* - 3* speedup. It is >> possible that a pure in-order processor (Raspberry Pi?) migh... > > Andrew Haley has updated the pull request incrementally with two additional commits since the last revision: > > - Whitespace > - Whitespace This is an outstanding piece of work which, setting aside the pleasure afforded by its beauty, achieves some highly important goals, both immediate and long-term. The base achievement is to implement the poly1305 algorithm with an intrinsic which drives both the vector and integer units in combination, maximising pipeline parallelism while also profiting from vector (2-way) SIMD parallelism. This design enables high-end, out-of-order processors like Apple's M-series to attain close to 6 instructions per cycle. However, the implementation achieves some much more important goals.
A further achievement is to generate this highly efficient code using a generation strategy that renders its correctness ascertainable by direct review of the methods employed in the generator code, rather than by resort to eyeballing the highly complex, interleaved streams of parallel instructions that they generate. The third and, perhaps, most significant achievement is to achieve that goal by implementing the generator using a toolkit which simplifies handling of the many complexities involved in structuring and interleaving the generated instruction sequences and managing the independent and shared register sets those instructions employ. This last goal is arguably the most important one as it presents a paradigm for how to generate highly efficient, correct parallel code. Its importance is that the same technique and technology might be retrofitted to other intrinsics with great benefits for maintainability, reliability and confidence in the correctness of the code. The generator toolkit and generation code appear to be entirely correct and are mostly very clean, suffering only from a few format details and one or two now redundant methods that appear to be hangovers from earlier versions. However, the code is missing documentation comments that will be critical to ensure that maintainers can quickly understand what the generated code is doing. The generator code itself needs some commenting to clarify how it is used and how it operates. I have made a few suggestions to that end. The largest omission is commenting of 1) the data layouts used to manage 130-bit data values and 2) the purpose and operation of the various macro-functions that generate smaller and larger instruction sequences. I have likewise suggested comments to clarify these parts of the patch.
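As an aid to reading the layout comments, here is a small scalar sketch of the 5-digit base 2^26 encoding of a 130-bit value mentioned in the PR description. It is only an illustration under stated assumptions: `U130`, `to_limbs26`, `from_limbs26` and `roundtrips` are hypothetical names, the input is held as two 64-bit words plus two top bits, and the sketch does not reproduce the register formats used in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical container for a 130-bit value (not a type from the patch).
struct U130 {
  uint64_t lo;   // bits 0..63
  uint64_t hi;   // bits 64..127
  uint32_t top;  // bits 128..129, must be < 4
};

// Split the value into five 26-bit limbs; limb i holds bits 26*i .. 26*i+25.
inline std::array<uint32_t, 5> to_limbs26(const U130& v) {
  const uint64_t mask = (static_cast<uint64_t>(1) << 26) - 1;
  std::array<uint32_t, 5> l{};
  l[0] = static_cast<uint32_t>(v.lo & mask);                           // bits 0..25
  l[1] = static_cast<uint32_t>((v.lo >> 26) & mask);                   // bits 26..51
  l[2] = static_cast<uint32_t>(((v.lo >> 52) | (v.hi << 12)) & mask);  // bits 52..77
  l[3] = static_cast<uint32_t>((v.hi >> 14) & mask);                   // bits 78..103
  l[4] = static_cast<uint32_t>((v.hi >> 40) |
                               (static_cast<uint64_t>(v.top) << 24));  // bits 104..129
  return l;
}

// Inverse of to_limbs26, assuming every limb is already below 2^26
// (i.e. no pending carries are stored in the upper bits of a limb).
inline U130 from_limbs26(const std::array<uint32_t, 5>& l) {
  U130 v;
  v.lo = static_cast<uint64_t>(l[0]) |
         (static_cast<uint64_t>(l[1]) << 26) |
         (static_cast<uint64_t>(l[2]) << 52);   // low 12 bits of limb 2
  v.hi = (static_cast<uint64_t>(l[2]) >> 12) |  // high 14 bits of limb 2
         (static_cast<uint64_t>(l[3]) << 14) |
         (static_cast<uint64_t>(l[4]) << 40);   // low 24 bits of limb 4
  v.top = l[4] >> 24;                           // bits 128..129
  return v;
}

// True if splitting and re-joining gives back the same 130-bit value.
inline bool roundtrips(const U130& v) {
  const U130 w = from_limbs26(to_limbs26(v));
  return w.lo == v.lo && w.hi == v.hi && w.top == v.top;
}
```

In this layout bit 128 of the value lands at bit 24 of the top limb, because 128 - 4*26 = 24. Keeping each 26-bit digit in a wider lane leaves spare upper bits, which is what lets partial sums accumulate before any carry has to be propagated.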
src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1745: > 1743: poly1305_multiply(acc, u, s, r, RR2, scratch); > 1744: acc.gen(); > 1745: } Comment /* * Appends instructions to the current code buffer implementing * a vector parallel 2-way SIMD widening 130-bit multiply * u <--(s*r) mod 2^130-5 by calling * poly1305_multiply_vec(acc, u, m, r, rr, scratch) * acc.gen(); */ Also, change m[] to s[] in the method signature src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1752: > 1750: acc.gen(); > 1751: } > 1752: Comment /* * Appends instructions to the generator which: * * load two 128-bit values from input_start into s[0], s[1] and * s[2] in vec_4s3_26 format. * * set bit 24 of each DWORD in s[2] to 1. * * add each of the 26 bit limbs of the two values passed in u[] in * vec_2d5_26 format to the corresponding limbs and values of s[] * in vec_4s3_26 format. * * shuffle the values in s from vec_4s3_26 format to vec_4s3_26_I * format. */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1755: > 1753: void poly1305_step_vec(AsmGenerator &acc, > 1754: const FloatRegister s[], const FloatRegister u[], > 1755: const FloatRegister zero, Register input_start); The method below appears to be redundant? src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1761: > 1759: const FloatRegister s_v[], > 1760: const FloatRegister r_v[], > 1761: const FloatRegister rr_v[]); Comment /* * Appends instructions to acc that perform a modulo 2^130-5 * Goll-Guerin reduction on the pair of cross-multiplied 130-bit * products presented in u[]. * * u[] inputs the five 26-bit 'digits' and associated 'carry' bits * for two 130-bit cross-products in vec_2d5_26 format (as output * by poly1305_multiply_vec). On return the reduced 130-bit values * are output in u[] in the same format. * * zero is used to zero out bits 26 to 63 of the low and high * DWORDS in u[]. Both the low and high DWORDs of this input * argument must be set to 0 by the caller. 
* * scratch provides at least two scratch registers that can * be used by the generated code. */ Argument upper_bits should be renamed to zero. src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1765: > 1763: const FloatRegister u[], > 1764: const FloatRegister upper_bits, > 1765: AbstractRegSet scratch); Comment /* * Appends instructions to acc that perform a 130-bit * cross-multiply and reduction by calling * poly1305_multiply_vec(acc, u, s, r, rr) and * poly1305_reduce_vec(acc, u, zero, scratch). */ Also, it would flag what is happening more clearly if this method were renamed poly1305_field_multiply_vec. Obviously, the suffix is redundant -- because the signature clarifies which variant of the two methods with this name is being called. However, it does no harm to ensure that apples are clearly labelled apples and pears clearly labelled pears. src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1776: > 1774: poly1305_reduce_vec(acc, u, zero, scratch); > 1775: } > 1776: Comment /* * Appends instructions to acc which load a 128-bit value from * input_start, store it as a 130-bit value in s[] in gpr_d3_56 * format and set bit 24 of s[2] to 1.
*/ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1778: > 1776: > 1777: void poly1305_load(AsmGenerator &acc, const Register s[], > 1778: const Register input_start); Comment /* * Appends instructions to the current code buffer to load a * 128-bit value by calling * * poly1305_load(acc, s, input_start); * acc.gen(); */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1783: > 1781: poly1305_load(acc, s, input_start); > 1782: acc.gen(); > 1783: } Comment /* * Appends instructions to acc which load a 128-bit value from * input_start into s and then add it to the value in u[] by * calling * * poly1305_load(acc, s, input_start); * _ { poly1305_add(s, u)); } */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1784: > 1782: acc.gen(); > 1783: } > 1784: void poly1305_step(AsmGenerator &acc, const Register s[], const RegPair u[], const Register input_start); Comment /* * Appends instructions to the current code buffer which load a * 128-bit value from input_start into s and then add it to the * value in u[] by calling * * poly1305_load(acc, s, u, input_start); * acc.gen(); */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1789: > 1787: poly1305_step(acc, s, u, input_start); > 1788: acc.gen(); > 1789: } Comment /* * Appends instructions to the current code buffer which add the * 130-bit value in src to the 130-bit value in dest by calling * * poly1305_add(acc, dest, src); * acc.gen(); */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1790: > 1788: acc.gen(); > 1789: } > 1790: void poly1305_add(const Register dest[], const RegPair src[]); Comment /* * Appends instructions to the current code buffer which add each * of the 3 56-bit limbs in src to the corresponding 56 bit limb * in dest. * * src is a 130-bit value in gpr_d3_56 format. * * dest is a 130-bit value in gpr_d3_56 format. Carry bits may * accumulate in the higher bits of each limb as a result of * successive additions. 
*/ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1793: > 1791: void poly1305_add(AsmGenerator &acc, > 1792: const Register dest[], const RegPair src[]); > 1793: Method mov26 appears to be redundant src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1794: > 1792: const Register dest[], const RegPair src[]); > 1793: > 1794: void mov26(FloatRegister d, Register s, int lsb); Comment Add comment /* * Split the 56 bit digit passed in r into two low and high 26-bit * digits and insert them, respectively, into the lower and upper * 32-bit half-words of d. */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1795: > 1793: > 1794: void mov26(FloatRegister d, Register s, int lsb); > 1795: void expand26(Register d, Register r); Add comment /* * Split the 56 bit digit passed in r into two low and high 26-bit * digits and insert them into the low DWORD of, respectively, * d[0] and d[1]. */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1796: > 1794: void mov26(FloatRegister d, Register s, int lsb); > 1795: void expand26(Register d, Register r); > 1796: void split26(const FloatRegister d[], Register s); Add comment /* * Copy a 130-bit value from general purpose registers s0, s1, s2 into * the vector register array d[5]. * * s0, s1 and s2 input a 130-bit value in gpr_d3_56 format. * * d outputs a 130-bit value in vec_d5_26 format. */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1798: > 1796: void split26(const FloatRegister d[], Register s); > 1797: void copy_3_to_5_regs(const FloatRegister d[], > 1798: const Register s0, const Register s1, const Register s2); Add comment /* * Copy a 130-bit value from general purpose registers s0, s1, s2 into * the vector register array d[2]. * * s0, s1 and s2 input a 130-bit value in gpr_d3_56 format. * * d outputs a 130-bit value in vec_4s2_26 format. 
*/ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1801: > 1799: void copy_3_regs_to_5_elements(const FloatRegister d[], > 1800: const Register s0, const Register s1, const Register s2); > 1801: Add comment /* * Appends instructions to acc that perform a modulo 2^130-5 * Goll-Guerin reduction on the cross-multiplied 130-bit presented * in u[]. * * u[] inputs 3 56-bit 'digits' and associated 'carry' bits for a * 130-bit cross-product in gpr_d3_26 format (as output by * poly1305_multiply). On return the reduced 130-bit value is * output via u[] in the same format. */ Also, argument 's' seems to be redundant. Can it be removed? src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1802: > 1800: const Register s0, const Register s1, const Register s2); > 1801: > 1802: void poly1305_reduce(AsmGenerator &acc, const RegPair u[], const char *s = nullptr); Add comment /* * Append instructions to the current code buffer that perform a * modulo 2^130-5 Goll-Guerin reduction on the 130-bit passed in * u[] by calling: * * poly1305_reduce(acc, u, "redc"); * acc.gen(); */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1807: > 1805: poly1305_reduce(acc, u, "redc"); > 1806: acc.gen(); > 1807: } Add comment /* * Appends instructions to acc that add carry bits from s to * d and then zero out the carry bits in s. * * s stores a single limb of a 130-bit value in reg_d5_26 format * comprising a 26-bit 'digit' combined with up to 30 higher * 'carry' bits. * * d stores a single limb of a 130-bit value in reg_d3_52 format * comprising a 26-bit 'digit' combined with up to 30 higher * 'carry' bits. 
*/ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1809: > 1807: } > 1808: void poly1305_reduce_step(AsmGenerator &acc, > 1809: FloatRegister d, FloatRegister s, FloatRegister upper_bits, FloatRegister scratch); Add comment /* * Appends instructions to acc that reformat a 130-bit value * stored in the low words of u[] in reg_d5_26 format into the * registers passed in dest[] in reg_d3_52 format and clamp the * result to range [0, 2^130-5]. */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1810: > 1808: void poly1305_reduce_step(AsmGenerator &acc, > 1809: FloatRegister d, FloatRegister s, FloatRegister upper_bits, FloatRegister scratch); > 1810: void poly1305_fully_reduce(Register dest[], const RegPair u[]); Add comment /* * Appends instructions to acc that transfer five 26-bit 'digits' * of a 130-bit value input in s[] in vec_d5_26 format into three * 52-bit 'digits' output in the low words of u[] in gpr_d3_56 * format. */ Also, the first argument in the declaration is d[] but it is u0[] in the definition. It should probably be u[] but d[] would do. src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1812: > 1810: void poly1305_fully_reduce(Register dest[], const RegPair u[]); > 1811: void poly1305_transfer(const RegPair d[], const FloatRegister s[], > 1812: int lane, FloatRegister vscratch); Add comment /* * Copies a 64-bit value from each of the 3 low registers of src[] * to the corresponding register in dest[]. */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1813: > 1811: void poly1305_transfer(const RegPair d[], const FloatRegister s[], > 1812: int lane, FloatRegister vscratch); > 1813: void copy_3_regs(const Register dest[], const Register src[]); Add comment /* * Adds the 64-bit value in each of the 3 low registers of src[] * to the corresponding register in dest[].
*/ src/hotspot/cpu/aarch64/macroAssembler_aarch64_poly1305.cpp line 48: > 46: mul(prod._lo, n, m); > 47: umulh(prod._hi, n, m); > 48: } nit: new line needed here src/hotspot/cpu/aarch64/macroAssembler_aarch64_poly1305.cpp line 55: > 53: } > 54: > 55: void MacroAssembler::poly1305_transfer(const RegPair u0[], This argument should be named u (or possibly d) not u0. src/hotspot/cpu/aarch64/macroAssembler_aarch64_poly1305.cpp line 105: > 103: ubfx(rscratch1, s, lsb, 26); > 104: mov(d, S, 0, rscratch1); > 105: } nit: new line needed here src/hotspot/cpu/aarch64/macroAssembler_aarch64_poly1305.cpp line 249: > 247: const FloatRegister s[], > 248: const FloatRegister r[], > 249: const FloatRegister rr[]) { The comment I suggested for the declaration already defines the layout of r and RR. So, this comment might be redundant. src/hotspot/cpu/aarch64/macroAssembler_aarch64_poly1305.cpp line 252: > 250: // Five limbs of r and rr (5*r) are packed as 32-bit integers into > 251: // two 128-bit vectors. > 252: I'm not sure what the next line is meant to explain. Is it needed? src/hotspot/cpu/aarch64/macroAssembler_aarch64_poly1305.cpp line 253: > 251: // two 128-bit vectors. > 252: > 253: // // (h + c) * r, without carry propagation The comment below needs to refer to s0, s1 etc rather than m0, m1, etc. src/hotspot/cpu/aarch64/macroAssembler_aarch64_poly1305.cpp line 300: > 298: trn1(u[1], T4S, u[2], u[3]); > 299: > 300: // The incoming sum is packed into u[0], u[1], u[4] Better to explain the layout change and include full stops. // The incoming sum is packed into u[0], u[1], u[4] in // vecd_4s3_26 format. u[2] and u[3] are now free.
src/hotspot/cpu/aarch64/macroAssembler_aarch64_poly1305.cpp line 322: > 320: sli(s[0], T4S, zero, 26); > 321: }; > 322: Comment // set bit 129 of each value src/hotspot/cpu/aarch64/macroAssembler_aarch64_poly1305.cpp line 326: > 324: _ { addv(s[2], T2D, s[2], scratch1); }; > 325: _ { sli(s[2], T2D, zero, 32); }; > 326: Comment // add the current sum into the next input src/hotspot/cpu/aarch64/macroAssembler_aarch64_poly1305.cpp line 330: > 328: _ { addv(s[1], T4S, s[1], u[1]); }; > 329: _ { addv(s[2], T4S, s[2], u[4]); }; > 330: Add comment // Interleave the lower and upper pairs of SWORD lanes so // that paired values are now in even and odd SWORD lanes // i.e. reformat from vec_2d3_26 to vec_2d3_26_I I know this could be left as an exercise for the reader but it's better to spell it out as a reminder for anyone who might need to fix the code so they don't have to spend too much effort (re-)familiarising. src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 7300: > 7298: // the ASIMD and the core instructions. We also take care to ensure that > 7299: // each generator finishes at about the same time, to maximize the > 7300: // distance between instructions which generate and consume data. We ought to mention here that the parallelism is six-way because the 2 vector instruction streams use 2-way data (SIMD) parallelism. src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 7331: > 7329: public: > 7330: RegPair _reg_pairs[3]; > 7331: RegPairs(RegSetIterator &it, int n) { Not sure we need n here as it is always passed as 3 in the latest version of the patch. src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 7366: > 7364: __ pack_26(R[0], R[1], R[2], r_start); > 7365: > 7366: // Sn is to be the sum of Un and the next block of data Should this say // Sn is to be the sum of Un * r and the next block of data src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 7392: > 7390: } > 7391: > 7392: // We're going to use R**6 I think this comment is a tad ... minimal!
The following would be more helpful // The following instructions implement 6 parallel streams of // computation. Each stream processes input elements separated // by a distance of 6. Hence each stream needs to multiply its // accumulated sum by R**6 before adding the next input value. // Once all 6 partial sums are computed they constitute a // subsequence which is combined using successive multiply // by R and add operations. Likewise, any remaining tail (up to // 5 extra values) is folded in using multiply by R and // add operations. ------------- Changes requested by adinn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16812#pullrequestreview-1750837151 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1442998908 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1444906166 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1444928221 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445009971 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1444981051 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445051387 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445057606 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445065854 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445068969 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445072860 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445082701 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445085266 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445092662 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445095710 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445868262 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445871377 PR Review Comment: 
https://git.openjdk.org/jdk/pull/16812#discussion_r1445885641 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445886003 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445897957 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445928677 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445938026 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445941212 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445942640 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1406413151 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1442032172 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1406413966 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445947233 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445946608 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445946018 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1409086567 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1444448111 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445949487 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445953540 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1442773290 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1442040097 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445964681 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1445973376 From adinn at openjdk.org Tue Jan 9 12:12:00 2024 From: adinn at openjdk.org (Andrew Dinn) Date: Tue, 9 Jan 2024 12:12:00 GMT Subject: RFR: 8320709: AArch64: Vectorized Poly1305 intrinsics [v2] In-Reply-To: References: Message-ID: On Tue, 28 Nov 2023 13:12:32 GMT, Andrew Haley wrote: >> Vectorizing Poly1305 is quite 
tricky. We already have a highly- >> efficient scalar Poly1305 implementation that runs on the core integer >> unit, but it's highly serialized, so it does not make make good use of >> the parallelism available. >> >> The scalar implementation takes advantage of some particular features >> of the Poly1305 keys. In particular, certain bits of r, the secret >> key, are required to be 0. These make it possible to use a full >> 64-bit-wide multiply-accumulate operation without needing to process >> carries between partial products, >> >> While this works well for a serial implementation, a parallel >> implementation cannot do this because rather than multiplying by r, >> each step multiplies by some integer power of r, modulo >> 2^130-5. >> >> In order to avoid processing carries between partial products we use a >> redundant representation, in which each 130-bit integer is encoded >> either as a 5-digit integer in base 2^26 or as a 3-digit integer in >> base 2^52, depending on whether we are using a 64- or 32-bit >> multiply-accumulate. >> >> In AArch64 Advanced SIMD, there is no 64-bit multiply-accumulate >> operation available to us, so we must use 32*32 -> 64-bit operations. >> >> In order to achieve maximum performance we'd like to get close to the >> processor's decode bandwidth, so that every clock cycle does something >> useful. In a typical high-end AArch64 implementation, the core integer >> unit has a fast 64-bit multiplier pipeline and the ASIMD unit has a >> fast(ish) two-way 32-bit multiplier, which may be slower than than the >> core integer unit's. It is not at all obvious whether it's best to use >> ASIMD or core instructions. >> >> Fortunately, if we have a wide-bandwidth instruction decode, we can do >> both at the same time, by feeding alternating instructions to the core >> and the ASIMD units. This also allows us to make good use of all of >> the available core and ASIMD registers, in parallel. 
>> >> To do this we use generators, which here are a kind of iterator that >> emits a group of instructions each time it is called. In this case we >> 4 parallel generators, and by calling them alternately we interleave >> the ASIMD and the core instructions. We also take care to ensure that >> each generator finishes at about the same time, to maximize the >> distance between instructions which generate and consume data. >> >> The results are pretty good, ranging from 2* - 3* speedup. It is >> possible that a pure in-order processor (Raspberry Pi?) migh... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > remove debug code src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 105: > 103: } > 104: }; > 105: Comment /* * Return an iterator which allows each of the generator blocks (lambdas) * appended to the generator to be individually invoked. Clients which * employ multiple generators can use this method to interleave * instructions belonging to independent instruction streams in the * target code buffer. */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 109: > 107: return Iterator(this); > 108: } > 109: Comment /* * Invoke all the generator blocks (lambdas) previously appended to * the generator in order of append, inserting the associated instructions * into the target code buffer. */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 118: > 116: > 117: class OopMap; > 118: Comment /* * A RegPair is used to identify a pair of registers which hold the * lower and upper halves of a 128 bit value. */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1712: > 1710: void lightweight_unlock(Register obj, Register hdr, Register t1, Register t2, Label& slow); > 1711: > 1712: // Poly1305 I think we need a general comment identifying the various different 130-bit data representations that are employed. 
The terms introduced here can be used later to describe arguments and intermediary data values employed in the implementation. /* * Poly1305 * * The Poly1305 implementation operates on 130-bit integer values. * These can be encoded in memory or in a sequence of registers * using several different representations: * * Memory layouts (single 130-bit word) * * mem_d5_26: In this representation 130 bits is stored as five * 26-bit 'limbs' in 5 successive memory DWORDs (type long[]). * * This can be considered to be a base 2^26 representation of * the 130 bit value with 5 digits. * * General Purpose Register Data Representations * * gpr_d5_26: This is an equivalent representation to mem_d5_26 * with the minor variation that the data is stored in five * registers, R0, ... R4. The register sequence may be presented as * five explicit arguments or as a register array (Register[]) of * length five. * * As with the memory representation this can be considered to be a * base 2^26 representation of the 130 bit value with 5 digits. * However, during product cross-multiplication the registers R0, * ... R4 may transiently store a value larger than 26 bits which * represents a 26-bit 'digit' in the low 26-bits and a 'carry' in * the higher 30 bits. Carry bits are eventually propagated up to * higher registers ('digits') or, when performing modulo 2^130-5 * reduction, back into lower registers. * * gpr_d3_52: In this representation 130 bits is stored as 3 52-bit * 'limbs' in 3 registers, R0, ... R2. The register sequence may be * presented as 3 explicit arguments, or as a register array (type * Register[]) of length 3. * * This can be considered to be a base 2^52 representation of the * 130 bit value with 3 digits. Note that R2 normally only contains * a 26 bit 'digit'. * * gpr_2d3_52: In this representation 130 bits is stored as 3 52-bit * 'limbs' in the low elements of 3 register pairs (type RegPair, * field _lo).
The register pairs are presented as an array (type * RegPair[3]), possibly embedded in a wrapper management class * (type RegPairs). * * This can be considered to be a base 2^52 representation of the * 130 bit value with the 3 digits stored in successive low * registers. However, during product cross-multiplication the * combined 128 bit value derived from concatenating the lower and upper * registers may transiently store a 108 bit value with a 52-bit * 'digit' in the low 52-bits and a 56 bit 'carry' in the high 12 * bits of the lower register and low 44 bits of the upper * register. Carry bits are eventually propagated through to higher * registers or, in the case of modulo 2^130-5 reduction, back into * lower registers. * * Vector Register Single Value Representations * * vec_d5_26: This is an equivalent representation to mem_d5_26 * with the minor variation that each 26 bit limb is stored in the * lower DWORDs of five vector registers, V0.d[0], ..., V4.d[0]. The * register sequence may be presented as five explicit arguments or * as a vector register array (type FloatRegister[]) of length five. * * As with the memory representation this can be considered to be a * base 2^26 representation of the 130 bit value with 5 digits. * However, during product cross-multiplication individual DWORDs * may transiently store a value larger than 26 bits which comprises * a 26-bit 'digit' in the low 26-bits and a 'carry' in the higher * 30 bits. * * vec_2s3_26: In this representation the five 26-bit limbs of a * 130-bit value are packed into the lower SWORD pairs of three * vector registers, V0.s[0], V0.s[1], V1.s[0], V1.s[1], V2.s[0]. * * This can also be considered to be a base 2^26 representation of * the 130 bit value with 5 digits. * * vec_4s2_26: In this representation the five 26-bit limbs of a * 130-bit value are packed into five SWORDs of two vector * registers, V0.s[0], V0.s[1], V0.s[2], V0.s[3], V1.s[0].
* * This can also be considered to be a base 2^26 representation of * the 130 bit value with 5 digits. * * Vector Register Dual Value Representations * * Vector register algorithms frequently benefit from use of * instructions that employ 2 lane vector parallelism. This requires * input, intermediate or output data in the following formats that * double up elements defined in the single word value vector * register representations. * * vec_2d5_26 In this representation five limbs of a 130-bit value * are stored in the lower DWORDs of five vector registers, V0.d[0], * ..., V4.d[0] as per the vec_d5_26 representation. Five limbs of a * second 130-bit value are also stored in the upper DWORDs of the * same five vector registers, V0.d[1], ..., V4.d[1]. * * The register sequence may be presented as five explicit arguments * or as a vector register array (type FloatRegister[]) of length * five. * * vec_4s3_26: In this representation the five 26-bit limbs of a * 130-bit value are packed into the lower SWORD pairs of three * vector registers, V0.s[0], V0.s[1], V1.s[0], V1.s[1], * V2.s[0]. Five limbs of a second 130-bit value are also stored in * the upper SWORD pairs of the same three vector registers, * V0.s[2], V0.s[3], V1.s[2], V1.s[3], V2.s[2]. * * The register sequence may be presented as three explicit * arguments or as a vector register array (type FloatRegister[]) of * length three. * * vec_4s3_26_I: In this variation on vec_4s3_26 the 5 26-bit limbs * of a 130-bit value are packed into the even SWORD pairs of 3 * vector registers, V0.s[0], V0.s[2], V1.s[0], V1.s[2], V2.s[0]. 5 * limbs of a second 130-bit value are stored in the odd SWORD pairs * of the same 3 vector registers, V0.s[1], V0.s[3], V1.s[1], * V1.s[3], V2.s[1]. * * The register sequence may be presented as three explicit * arguments or as a vector register array (type FloatRegister[]) of * length three. 
*/ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1713: > 1711: > 1712: // Poly1305 > 1713: Comment /* * Loads five 26-bit limbs located at the address in src in * mem_d5_26 format into registers dest0, ... dest3 in gpr_d3_52 * format. */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1720: > 1718: void shifted_add128(const RegPair d, const RegPair s, unsigned int shift, > 1719: Register scratch = rscratch1); > 1720: // Widening multiply s * r -> u Comment /* * Appends instructions to the generator which implement a * widening 130-bit multiply u <-- s * r mod 2^130-5 * * The three elements of s encode a 130-bit value in gpr_d3_52 format. * * The three elements of r encode a 130-bit key in gpr_d3_52 format. * * The three elements of s are cross-multiplied with the three * elements of r and the results are accumulated as 128 bit * products in the three register pairs u in gpr_2d3_52 format. * * RR2, which must be pre-initialized to 5*(r[2] << 26), is used in * place of r for some of the cross-multiplies in order to enable * correct mod-130 reduction. * * scratch is a temporary register which is overwritten during * computation of the product. 
*/ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1724: > 1722: const RegPair u[], const Register s[], const Register r[], > 1723: Register RR2, RegSetIterator scratch); > 1724: // Multiply mod 2**130-5 Comment /* * Appends instructions to the current code buffer which implement a * 130-bit widening multiply u <-- (s * r) mod 2^130-5 by calling * * poly1305_multiply(acc, u, r, s, RR2, scratch) * acc.gen(); */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1728: > 1726: const RegPair u[], const Register s[], > 1727: const Register r[], > 1728: Register RR2, RegSetIterator scratch); Comment /* * Appends instructions to the current code buffer which implement a * 130-bit multiply (s * r) mod 2^130-5 -> u by calling * poly1305_multiply(acc, u, r, s, RR2, scratch) * acc.gen(); */ src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1736: > 1734: acc.gen(); > 1735: } > 1736: Comment /* * Appends instructions to the generator which implement a vector * parallel 2-way SIMD widening 130-bit multiply u <-- s * r mod * 2^130-5. * * The three elements of s encode two 130-bit values in * vec_4s3_26_I format. * * The two elements of r encode a 130-bit key in vec_4d_26 format. * * The two elements of rr must be pre-initialized by the caller in * vec_4d_26 format to (5 * r). * * Five SWORD elements in the even lanes of s are paired with five * corresponding SWORD elements in the odd lanes of s and * cross-multiplied in parallel with each of the five SWORD * elements of r. The pairs of DWORDs which result are * accumulated, respectively, in the lower and upper DWORD lanes * of u in vec_2d5_26 format. * * The SWORD elements of rr, which must be initialised to 5 * r, * are used in place of r for some of the cross-multiplies in * order to ensure correct reduction modulo 2^130 - 5. */ Also, change m[] to s[] in the method declaration.
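To make the role of rr = 5 * r concrete, here is a scalar C++ model of a base 2^26 multiply modulo 2^130-5. It is a sketch of the arithmetic only, in the form commonly used by 32-bit scalar Poly1305 implementations, and not the vector code under review. The names Limbs and mul_mod_p are hypothetical. Any partial product whose weight reaches 2^130 wraps around to weight 2^0 multiplied by 5, which is why those terms use the pre-multiplied digits.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Five base-2^26 digits of a 130-bit value, least significant first.
// Hypothetical type and function names, used for illustration only.
using Limbs = std::array<uint64_t, 5>;

// Returns h * r mod 2^130-5, partially reduced: the digits are carried back
// to roughly 26 bits each, but no final subtraction of the modulus is done.
Limbs mul_mod_p(const Limbs& h, const Limbs& r) {
  constexpr uint64_t mask = (uint64_t{1} << 26) - 1;
  for (int i = 0; i < 5; ++i) {
    assert(h[i] <= mask && r[i] <= mask);  // keeps every sum below 2^64
  }
  // Pre-multiplied digits, playing the role of rr in the review comment.
  const uint64_t s1 = 5 * r[1], s2 = 5 * r[2], s3 = 5 * r[3], s4 = 5 * r[4];

  // Cross-multiply without carry propagation. Terms with i + j >= 5 have
  // weight 2^130 or more, so they use the 5*r digits instead of r.
  uint64_t d0 = h[0]*r[0] + h[1]*s4   + h[2]*s3   + h[3]*s2   + h[4]*s1;
  uint64_t d1 = h[0]*r[1] + h[1]*r[0] + h[2]*s4   + h[3]*s3   + h[4]*s2;
  uint64_t d2 = h[0]*r[2] + h[1]*r[1] + h[2]*r[0] + h[3]*s4   + h[4]*s3;
  uint64_t d3 = h[0]*r[3] + h[1]*r[2] + h[2]*r[1] + h[3]*r[0] + h[4]*s4;
  uint64_t d4 = h[0]*r[4] + h[1]*r[3] + h[2]*r[2] + h[3]*r[1] + h[4]*r[0];

  // Propagate carries upward; the carry out of the top digit wraps to the
  // bottom digit multiplied by 5.
  uint64_t c;
  c = d0 >> 26; d0 &= mask; d1 += c;
  c = d1 >> 26; d1 &= mask; d2 += c;
  c = d2 >> 26; d2 &= mask; d3 += c;
  c = d3 >> 26; d3 &= mask; d4 += c;
  c = d4 >> 26; d4 &= mask; d0 += c * 5;
  c = d0 >> 26; d0 &= mask; d1 += c;
  return {d0, d1, d2, d3, d4};
}
```

With 26-bit inputs each product is below 2^52, each pre-multiplied product below 2^55, and each five-term sum below 2^58, so nothing overflows 64 bits before the carries are resolved.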
src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1739: > 1737: void poly1305_multiply_vec(AsmGenerator &acc, > 1738: const FloatRegister u[], const FloatRegister m[], > 1739: const FloatRegister r[], const FloatRegister rr[]); Comment /* * Appends instructions to the generator which implement a widening * 130-bit multiply u <-- s * r mod 2^130-5 by calling * poly1305_multiply(acc, u, r, s, RR2, scratch) * acc.gen(); */ ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1409606380 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1409606444 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1409613796 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1410549423 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1410536242 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1410583483 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1410712922 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1412366611 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1412382710 PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1412370579 From ayang at openjdk.org Tue Jan 9 13:07:30 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 9 Jan 2024 13:07:30 GMT Subject: RFR: 8323284: Remove unused FilteringClosure declaration Message-ID: Trivial removing dead code. 
------------- Commit messages: - trivial Changes: https://git.openjdk.org/jdk/pull/17323/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17323&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323284 Stats: 3 lines in 2 files changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17323.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17323/head:pull/17323 PR: https://git.openjdk.org/jdk/pull/17323 From stefank at openjdk.org Tue Jan 9 13:24:26 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 9 Jan 2024 13:24:26 GMT Subject: RFR: 8323284: Remove unused FilteringClosure declaration In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 12:58:24 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17323#pullrequestreview-1811141254 From ayang at openjdk.org Tue Jan 9 13:34:22 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 9 Jan 2024 13:34:22 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap In-Reply-To: References: Message-ID: On Sat, 2 Dec 2023 05:31:51 GMT, Lei Zaakjyu wrote: >> 8234502: Merge GenCollectedHeap and SerialHeap > > #16928 found this. @LizBing Now that https://github.com/openjdk/jdk/pull/16842 is merged, could you sync this PR to master? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16927#issuecomment-1883057876 From tschatzl at openjdk.org Tue Jan 9 13:40:22 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 9 Jan 2024 13:40:22 GMT Subject: RFR: 8323284: Remove unused FilteringClosure declaration In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 12:58:24 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Lgtm and trivial. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17323#pullrequestreview-1811168005 From adinn at openjdk.org Tue Jan 9 14:04:30 2024 From: adinn at openjdk.org (Andrew Dinn) Date: Tue, 9 Jan 2024 14:04:30 GMT Subject: RFR: 8320709: AArch64: Vectorized Poly1305 intrinsics [v5] In-Reply-To: <4ro4TX0p_iovP5lOIRrutPRy7OjpBPqk1j7SgqP-WZg=.e23b4bd3-a033-45fb-bd5b-507f275255b7@github.com> References: <4ro4TX0p_iovP5lOIRrutPRy7OjpBPqk1j7SgqP-WZg=.e23b4bd3-a033-45fb-bd5b-507f275255b7@github.com> Message-ID: On Mon, 4 Dec 2023 17:33:00 GMT, Andrew Haley wrote: >> Vectorizing Poly1305 is quite tricky. We already have a highly- >> efficient scalar Poly1305 implementation that runs on the core integer >> unit, but it's highly serialized, so it does not make make good use of >> the parallelism available. >> >> The scalar implementation takes advantage of some particular features >> of the Poly1305 keys. In particular, certain bits of r, the secret >> key, are required to be 0. These make it possible to use a full >> 64-bit-wide multiply-accumulate operation without needing to process >> carries between partial products, >> >> While this works well for a serial implementation, a parallel >> implementation cannot do this because rather than multiplying by r, >> each step multiplies by some integer power of r, modulo >> 2^130-5. >> >> In order to avoid processing carries between partial products we use a >> redundant representation, in which each 130-bit integer is encoded >> either as a 5-digit integer in base 2^26 or as a 3-digit integer in >> base 2^52, depending on whether we are using a 64- or 32-bit >> multiply-accumulate. >> >> In AArch64 Advanced SIMD, there is no 64-bit multiply-accumulate >> operation available to us, so we must use 32*32 -> 64-bit operations. >> >> In order to achieve maximum performance we'd like to get close to the >> processor's decode bandwidth, so that every clock cycle does something >> useful. 
In a typical high-end AArch64 implementation, the core integer >> unit has a fast 64-bit multiplier pipeline and the ASIMD unit has a >> fast(ish) two-way 32-bit multiplier, which may be slower than than the >> core integer unit's. It is not at all obvious whether it's best to use >> ASIMD or core instructions. >> >> Fortunately, if we have a wide-bandwidth instruction decode, we can do >> both at the same time, by feeding alternating instructions to the core >> and the ASIMD units. This also allows us to make good use of all of >> the available core and ASIMD registers, in parallel. >> >> To do this we use generators, which here are a kind of iterator that >> emits a group of instructions each time it is called. In this case we >> 4 parallel generators, and by calling them alternately we interleave >> the ASIMD and the core instructions. We also take care to ensure that >> each generator finishes at about the same time, to maximize the >> distance between instructions which generate and consume data. >> >> The results are pretty good, ranging from 2* - 3* speedup. It is >> possible that a pure in-order processor (Raspberry Pi?) migh... > > Andrew Haley has updated the pull request incrementally with two additional commits since the last revision: > > - Whitespace > - Whitespace src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 7502: > 7500: } > 7501: } > 7502: I forgot to add some feedback on this assert and the preceding loop. I think it would be good to explain what is going on here // The column countdowns and updates in the above loop // are designed to schedule execution of the instruction // generation closures in each of the generator queues // so as to interleave the resulting parallel instruction // streams as evenly as possible. It should be self-evident // that the following assertion must hold once all the // queues have been exhausted. 
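As a rough model of that scheduling idea, the following C++ sketch interleaves several queues of closures so that shorter queues are spread evenly across the longest one. It is an assumption-laden illustration: ToyGenerator, make_stream and interleave are hypothetical names, and the credit scheme here is a simple proportional scheduler that may well differ from the column countdowns used in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an instruction generator: a queue of closures, each of
// which would emit one instruction (or one indivisible block) when invoked.
class ToyGenerator {
  std::vector<std::function<void()>> blocks_;
  std::size_t next_ = 0;
 public:
  ToyGenerator& operator<<(std::function<void()> b) {
    blocks_.push_back(std::move(b));
    return *this;
  }
  std::size_t size() const { return blocks_.size(); }
  bool done() const { return next_ == blocks_.size(); }
  void step() { blocks_[next_++](); }
};

// Builds a stream of n blocks, each appending one tagged "instruction".
ToyGenerator make_stream(std::string& out, char tag, int n) {
  ToyGenerator g;
  for (int i = 0; i < n; ++i) g << [&out, tag] { out += tag; };
  return g;
}

// Runs one round per block of the longest queue. Each queue earns credit in
// proportion to its length and steps whenever the credit reaches the length
// of the longest queue, so all queues finish in the final round.
void interleave(const std::vector<ToyGenerator*>& gens) {
  std::size_t longest = 0;
  for (const ToyGenerator* g : gens) longest = std::max(longest, g->size());
  std::vector<std::size_t> credit(gens.size(), 0);
  for (std::size_t round = 0; round < longest; ++round) {
    for (std::size_t i = 0; i < gens.size(); ++i) {
      credit[i] += gens[i]->size();
      if (credit[i] >= longest) {
        credit[i] -= longest;
        gens[i]->step();
      }
    }
  }
  // Counterpart of the assertion discussed above: every queue is exhausted.
  for (const ToyGenerator* g : gens) assert(g->done());
}
```

A queue of length n gains n * longest credit in total and spends longest per step, so it steps exactly n times, which is the property the final assertion checks.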
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1446122911 From adinn at openjdk.org Tue Jan 9 14:22:32 2024 From: adinn at openjdk.org (Andrew Dinn) Date: Tue, 9 Jan 2024 14:22:32 GMT Subject: RFR: 8320709: AArch64: Vectorized Poly1305 intrinsics [v5] In-Reply-To: <4ro4TX0p_iovP5lOIRrutPRy7OjpBPqk1j7SgqP-WZg=.e23b4bd3-a033-45fb-bd5b-507f275255b7@github.com> References: <4ro4TX0p_iovP5lOIRrutPRy7OjpBPqk1j7SgqP-WZg=.e23b4bd3-a033-45fb-bd5b-507f275255b7@github.com> Message-ID: On Mon, 4 Dec 2023 17:33:00 GMT, Andrew Haley wrote: >> Vectorizing Poly1305 is quite tricky. We already have a highly- >> efficient scalar Poly1305 implementation that runs on the core integer >> unit, but it's highly serialized, so it does not make make good use of >> the parallelism available. >> >> The scalar implementation takes advantage of some particular features >> of the Poly1305 keys. In particular, certain bits of r, the secret >> key, are required to be 0. These make it possible to use a full >> 64-bit-wide multiply-accumulate operation without needing to process >> carries between partial products, >> >> While this works well for a serial implementation, a parallel >> implementation cannot do this because rather than multiplying by r, >> each step multiplies by some integer power of r, modulo >> 2^130-5. >> >> In order to avoid processing carries between partial products we use a >> redundant representation, in which each 130-bit integer is encoded >> either as a 5-digit integer in base 2^26 or as a 3-digit integer in >> base 2^52, depending on whether we are using a 64- or 32-bit >> multiply-accumulate. >> >> In AArch64 Advanced SIMD, there is no 64-bit multiply-accumulate >> operation available to us, so we must use 32*32 -> 64-bit operations. >> >> In order to achieve maximum performance we'd like to get close to the >> processor's decode bandwidth, so that every clock cycle does something >> useful. 
In a typical high-end AArch64 implementation, the core integer >> unit has a fast 64-bit multiplier pipeline and the ASIMD unit has a >> fast(ish) two-way 32-bit multiplier, which may be slower than than the >> core integer unit's. It is not at all obvious whether it's best to use >> ASIMD or core instructions. >> >> Fortunately, if we have a wide-bandwidth instruction decode, we can do >> both at the same time, by feeding alternating instructions to the core >> and the ASIMD units. This also allows us to make good use of all of >> the available core and ASIMD registers, in parallel. >> >> To do this we use generators, which here are a kind of iterator that >> emits a group of instructions each time it is called. In this case we >> 4 parallel generators, and by calling them alternately we interleave >> the ASIMD and the core instructions. We also take care to ensure that >> each generator finishes at about the same time, to maximize the >> distance between instructions which generate and consume data. >> >> The results are pretty good, ranging from 2* - 3* speedup. It is >> possible that a pure in-order processor (Raspberry Pi?) migh... > > Andrew Haley has updated the pull request incrementally with two additional commits since the last revision: > > - Whitespace > - Whitespace src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 66: > 64: } > 65: > 66: template This probably also needs a comment. /* * Appends a closure to the generator that can be executed * later to append one or more instructions to the target code * buffer. Normally each closure generates a single instruction, * allowing multiple generators that interleave parallel instruction * streams to obtain the maximum opportunities for pipeline * parallelism. 
In cases where an individual instruction stream * uses a shared scratch register to hold a temporary that will be * consumed by a later instruction the closure must generate the * full instruction sequence between the writer and last reader of * the temporary as a block, ensuring that instructions from * parallel streams which write or read the same scratch register * cannot interfere with each other. */ ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1446147364 From ayang at openjdk.org Tue Jan 9 14:26:31 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 9 Jan 2024 14:26:31 GMT Subject: RFR: 8323284: Remove unused FilteringClosure declaration In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 12:58:24 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17323#issuecomment-1883139898 From ayang at openjdk.org Tue Jan 9 14:26:32 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 9 Jan 2024 14:26:32 GMT Subject: Integrated: 8323284: Remove unused FilteringClosure declaration In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 12:58:24 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. This pull request has now been integrated. Changeset: 438ab7c1 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/438ab7c115249d7501edfbb2d3c62e96ae824181 Stats: 3 lines in 2 files changed: 0 ins; 3 del; 0 mod 8323284: Remove unused FilteringClosure declaration Reviewed-by: stefank, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/17323 From stefank at openjdk.org Tue Jan 9 15:03:42 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 9 Jan 2024 15:03:42 GMT Subject: RFR: 8323297: Fix incorrect placement of precompiled.hpp include lines Message-ID: There are a few files that have include lines before the precompiled.hpp include line. I propose that we fix this.
Testing: I'll let this run through GHA and Oracle's tier1 to see that this still compiles. ------------- Commit messages: - 8323297: Fix incorrect placement of precompiled.hpp include lines Changes: https://git.openjdk.org/jdk/pull/17326/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17326&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323297 Stats: 16 lines in 5 files changed: 7 ins; 9 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17326.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17326/head:pull/17326 PR: https://git.openjdk.org/jdk/pull/17326 From epeter at openjdk.org Tue Jan 9 15:20:33 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 9 Jan 2024 15:20:33 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v6] In-Reply-To: References: Message-ID: On Mon, 8 Jan 2024 20:48:39 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. 
The benchmark numbers:
>>
>>
>> Benchmark                              Score      Latest     Latest/Score
>> StringIndexOf.advancedWithMediumSub    343.573    317.934    0.925375393x
>> StringIndexOf.advancedWithShortSub1    1039.081   1053.96    1.014319384x
>> StringIndexOf.advancedWithShortSub2    55.828     110.541    1.980027943x
>> StringIndexOf.constantPattern          9.361      11.906     1.271872663x
>> StringIndexOf.searchCharLongSuccess    4.216      4.218      1.000474383x
>> StringIndexOf.searchCharMediumSuccess  3.133      3.216      1.02649218x
>> StringIndexOf.searchCharShortSuccess   3.76       3.761      1.000265957x
>> StringIndexOf.success                  9.186      9.713      1.057369911x
>> StringIndexOf.successBig               14.341     46.343     3.231504079x
>> StringIndexOfChar.latin1_AVX2_String   6220.918   12154.52   1.953814533x
>> StringIndexOfChar.latin1_AVX2_char     5503.556   5540.044   1.006629895x
>> StringIndexOfChar.latin1_SSE4_String   6978.854   6818.689   0.977049957x
>> StringIndexOfChar.latin1_SSE4_char     5657.499   5474.624   0.967675646x
>> StringIndexOfChar.latin1_Short_String  7132.541   6863.359   0.962260014x
>> StringIndexOfChar.latin1_Short_char    16013.389  16162.437  1.009307711x
>> StringIndexOfChar.latin1_mixed_String  7386.123   14771.622  1.999915517x
>> StringIndexOfChar.latin1_mixed_char    9901.671   9782.245   0.987938803
>
> Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits:
>
> - Merge branch 'openjdk:master' into indexof
> - Addressing review comments.
> - Fix for JDK-8321599
> - Support UU IndexOf
> - Only use optimization when EnableX86ECoreOpts is true
> - Fix whitespace
> - Merge branch 'openjdk:master' into indexof
> - Comments; added exhaustive-ish test
> - Subtracting 0x10 twice.
> - Stomped on r13 in switch branch calculation
> - ... and 11 more: https://git.openjdk.org/jdk/compare/8a4dc79e...600377b0

@asgibbons I cannot yet promise to review this, I just left a few comments after scrolling through this change. I'm especially scared of reviewing `src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp`.
I launched testing for Commit 21 / v05. Maybe an ignorant question: How would avx512 be affected? src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 83: > 81: > 82: // const __m256i first = _mm256_set1_epi8(needle[0]); > 83: // const __m256i last = _mm256_set1_epi8(needle[k - 1]); I think it would be nicer if you had comment `//` on every line, and no gaps. src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1608: > 1606: // vector compares when size is 2 * VEC_SIZE or less. 38 8. Use 4 > 1607: // vector compares when size is 4 * VEC_SIZE or less. 39 9. Use 8 > 1608: // vector compares when size is 8 * VEC_SIZE or less. */ Is this formatting intended? src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1672: > 1670: > 1671: // 98 VPCMPEQ VEC_SIZE(%rdi), %ymm2, %ymm2 > 1672: // 99 vpmovmskb %ymm2, %eax It seems that here the comments and code is strangely interleaved / shifted. What is this all for? src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 2301: > 2299: // 388 setg %dl > 2300: // 389 leal -1(%rdx, %rdx), %eax > 2301: __ movzbl(rcx, Address(rsi, rax, Address::times_1, -0x20)); Down here it is even worse test/jdk/java/lang/StringBuffer/IndexOf.java line 34: > 32: public class IndexOf { > 33: > 34: static Random generator = new Random(1999); Would it be an alternative to use the this: import jdk.test.lib.Utils; ... Random random = Utils.getRandomInstance(); This has a random seed, but it is always printed in the output and can be set via a test-flag. test/jdk/java/lang/StringBuffer/IndexOf.java line 44: > 42: } > 43: System.out.println(""); > 44: generator.setSeed(1999); Is there a good reason for a fixed seed? ------------- Changes requested by epeter (Reviewer). 
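The seed question raised in the review comments above can be illustrated with a small sketch. This is only an assumption-laden stand-in: it does not use `jdk.test.lib.Utils.getRandomInstance()` (the helper named in the review), and the class name `SeededRandom` and the property name `test.seed` are hypothetical, chosen for this example. It shows the general idea of a seed that is random by default, always printed, and overridable so that a failing run can be replayed.

```java
import java.util.Random;

// Hypothetical helper for illustration; not part of the JDK test library.
final class SeededRandom {
    // "test.seed" is an assumed property name used only in this sketch.
    static final String SEED_PROPERTY = "test.seed";

    // Use the seed given on the command line if present, otherwise pick a fresh one.
    static long chooseSeed() {
        String value = System.getProperty(SEED_PROPERTY);
        return (value != null) ? Long.parseLong(value) : new Random().nextLong();
    }

    // Print the seed before use so a failing run can be reproduced later.
    static Random create(long seed) {
        System.out.println("Random seed: " + seed
                + " (rerun with -D" + SEED_PROPERTY + "=" + seed + ")");
        return new Random(seed);
    }

    public static void main(String[] args) {
        Random generator = create(chooseSeed());
        System.out.println("first value: " + generator.nextInt(100));
    }
}
```

With this shape a test keeps broad coverage across runs, while a fixed value such as 1999 can still be forced by passing `-Dtest.seed=1999`.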
PR Review: https://git.openjdk.org/jdk/pull/16753#pullrequestreview-1811353722 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1446211178 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1446210544 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1446216019 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1446217305 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1446221928 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1446223038 From jbhateja at openjdk.org Tue Jan 9 15:35:29 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 9 Jan 2024 15:35:29 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v5] In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 15:21:08 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. 
The benchmark numbers:
>>
>>
>> Benchmark                              Score      Latest     Latest/Score
>> StringIndexOf.advancedWithMediumSub    343.573    317.934    0.925375393x
>> StringIndexOf.advancedWithShortSub1    1039.081   1053.96    1.014319384x
>> StringIndexOf.advancedWithShortSub2    55.828     110.541    1.980027943x
>> StringIndexOf.constantPattern          9.361      11.906     1.271872663x
>> StringIndexOf.searchCharLongSuccess    4.216      4.218      1.000474383x
>> StringIndexOf.searchCharMediumSuccess  3.133      3.216      1.02649218x
>> StringIndexOf.searchCharShortSuccess   3.76       3.761      1.000265957x
>> StringIndexOf.success                  9.186      9.713      1.057369911x
>> StringIndexOf.successBig               14.341     46.343     3.231504079x
>> StringIndexOfChar.latin1_AVX2_String   6220.918   12154.52   1.953814533x
>> StringIndexOfChar.latin1_AVX2_char     5503.556   5540.044   1.006629895x
>> StringIndexOfChar.latin1_SSE4_String   6978.854   6818.689   0.977049957x
>> StringIndexOfChar.latin1_SSE4_char     5657.499   5474.624   0.967675646x
>> StringIndexOfChar.latin1_Short_String  7132.541   6863.359   0.962260014x
>> StringIndexOfChar.latin1_Short_char    16013.389  16162.437  1.009307711x
>> StringIndexOfChar.latin1_mixed_String  7386.123   14771.622  1.999915517x
>> StringIndexOfChar.latin1_mixed_char    9901.671   9782.245   0.987938803
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
> Addressing review comments.

src/hotspot/share/opto/library_call.cpp line 1202:

> 1200:
> 1201: Node* result = nullptr;
> 1202: bool do_intrinsic =

Name change suggestion: do_intrinsic -> call_opt_stub

src/hotspot/share/opto/library_call.cpp line 1224:

> 1222: result = _gvn.transform(new ProjNode(call, TypeFunc::Parms));
> 1223: } else {
> 1224: result = make_indexOf_node(src_start, src_count, tgt_start, tgt_count,

The existing routines emit IR to handle the following special cases:
1) tgt_cnt > src_cnt: return -1
2) tgt_cnt == 0: return 0
Should we not preserve those checks before calling the stub?
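The two special cases named in the comment above can be sketched in plain Java. This is a minimal illustration, not the C2 IR or the actual stub: the class `IndexOfGuards` and the method `stubIndexOf` are hypothetical names, and the naive scan merely stands in for the optimized routine, which is assumed here to rely on `0 < tgtCount <= srcCount`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of guarding a fast path with the two special cases.
final class IndexOfGuards {
    // Stand-in for the optimized stub. Assumes the guards in indexOf()
    // have already run, i.e. 0 < tgtCount <= srcCount.
    static int stubIndexOf(byte[] src, int srcCount, byte[] tgt, int tgtCount) {
        for (int i = 0; i + tgtCount <= srcCount; i++) {
            int j = 0;
            while (j < tgtCount && src[i + j] == tgt[j]) {
                j++;
            }
            if (j == tgtCount) {
                return i; // full match starting at i
            }
        }
        return -1;
    }

    static int indexOf(byte[] src, int srcCount, byte[] tgt, int tgtCount) {
        if (tgtCount > srcCount) {
            return -1; // case 1: the target cannot fit in the source
        }
        if (tgtCount == 0) {
            return 0;  // case 2: the empty target matches at offset 0
        }
        return stubIndexOf(src, srcCount, tgt, tgtCount);
    }

    public static void main(String[] args) {
        byte[] src = "hello".getBytes(StandardCharsets.ISO_8859_1);
        byte[] tgt = "ll".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(indexOf(src, src.length, tgt, tgt.length));
    }
}
```

Keeping the guards in the caller means the fast path never has to reason about an empty or oversized target, which is the property the review comment asks to preserve.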
src/hotspot/share/opto/library_call.cpp line 1273: > 1271: Node* result = nullptr; > 1272: > 1273: if ((StubRoutines::string_indexof() != nullptr) && (ae == StrIntrinsicNode::LL)) { Why are we not calling stub for StrIntrinsicNode::UU ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1444390460 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1444406814 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1444420392 From duke at openjdk.org Tue Jan 9 15:52:46 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Tue, 9 Jan 2024 15:52:46 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v9] In-Reply-To: References: Message-ID: > 8234502: Merge GenCollectedHeap and SerialHeap Lei Zaakjyu has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: - resolve conflicts - Merge branch 'master' of https://git.openjdk.org/jdk into JDK-8234502 - fix import statement - fix import statement - remove 'GenCollectedHeap' from 'jdk.hotspot.agent' - resolve conflict - Merge branch 'master' of https://git.openjdk.org/jdk into JDK-8234502 - restore comment - line-break for EOF - merge 'CollectedHeap' and 'SerialHeap' ------------- Changes: https://git.openjdk.org/jdk/pull/16927/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=08 Stats: 3170 lines in 21 files changed: 1507 ins; 1638 del; 25 mod Patch: https://git.openjdk.org/jdk/pull/16927.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16927/head:pull/16927 PR: https://git.openjdk.org/jdk/pull/16927 From adinn at openjdk.org Tue Jan 9 16:09:34 2024 From: adinn at openjdk.org (Andrew Dinn) Date: Tue, 9 Jan 2024 16:09:34 GMT Subject: RFR: 8320709: AArch64: Vectorized Poly1305 intrinsics [v5] In-Reply-To: References: Message-ID: On Mon, 27 Nov 2023 12:21:41 GMT, Andrew Haley wrote: >> Andrew Haley has updated the pull request incrementally with two additional 
commits since the last revision: >> >> - Whitespace >> - Whitespace > > src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 7527: > >> 7525: __ bind(DONE); >> 7526: } >> 7527: __ poly1305_fully_reduce(S0, u0); > > This call to `poly1305_fully_reduce` is probably unnecessary, because the caller invokes `IntegerPolynomial1305::finalCarryReduceLast`. However, this part of the contract is undocumented. Well, I guess we leave this last call in place then. It probably won't cost much relative to the rest of the work that gets done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16812#discussion_r1446293115 From ayang at openjdk.org Tue Jan 9 16:30:27 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 9 Jan 2024 16:30:27 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v9] In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 15:52:46 GMT, Lei Zaakjyu wrote: >> 8234502: Merge GenCollectedHeap and SerialHeap > > Lei Zaakjyu has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: > > - resolve conflicts > - Merge branch 'master' of https://git.openjdk.org/jdk into JDK-8234502 > - fix import statement > - fix import statement > - remove 'GenCollectedHeap' from 'jdk.hotspot.agent' > - resolve conflict > - Merge branch 'master' of https://git.openjdk.org/jdk into JDK-8234502 > - restore comment > - line-break for EOF > - merge 'CollectedHeap' and 'SerialHeap' Running testing now. The patch mostly looks good. src/hotspot/share/gc/shared/vmStructs_gc.hpp line 174: > 172: declare_toplevel_type(ContiguousSpace*) \ > 173: declare_toplevel_type(DefNewGeneration*) \ > 174: declare_toplevel_type(Generation*) \ Can these two *Generation be removed also? (They are specific to Serial, while this file is for "shared".) 
-------------

PR Review: https://git.openjdk.org/jdk/pull/16927#pullrequestreview-1811525103
PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1446315896

From clanger at openjdk.org Tue Jan 9 17:51:03 2024
From: clanger at openjdk.org (Christoph Langer)
Date: Tue, 9 Jan 2024 17:51:03 GMT
Subject: [jdk22] RFR: 8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886
Message-ID: <_KN_oPLsWd_rAYthnE5oURoslpkLBfLJQIQwuMZFU-o=.11efbb12-ae16-4705-a9fa-c402c1cfc428@github.com>

Hi all,

This pull request contains a backport of [JDK-8322163](https://bugs.openjdk.org/browse/JDK-8322163), commit [12308533](https://github.com/openjdk/jdk/commit/1230853343c38787c90820d19d0626f0c37540dc) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository.

The commit being backported was authored by Matthias Baesken on 22 Dec 2023 and was reviewed by Martin Doerr and Christoph Langer.

The bug is P3 and hence appropriate for RDP1. It quiesces a test error that we see regularly on Alpine.

Thanks!
-------------

Commit messages:
 - Backport 1230853343c38787c90820d19d0626f0c37540dc

Changes: https://git.openjdk.org/jdk22/pull/43/files
Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=43&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8322163
Stats: 10 lines in 1 file changed: 10 ins; 0 del; 0 mod
Patch: https://git.openjdk.org/jdk22/pull/43.diff
Fetch: git fetch https://git.openjdk.org/jdk22.git pull/43/head:pull/43

PR: https://git.openjdk.org/jdk22/pull/43

From zsong at openjdk.org Tue Jan 9 17:51:03 2024
From: zsong at openjdk.org (Zhao Song)
Date: Tue, 9 Jan 2024 17:51:03 GMT
Subject: [jdk22] RFR: 8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886
In-Reply-To: <_KN_oPLsWd_rAYthnE5oURoslpkLBfLJQIQwuMZFU-o=.11efbb12-ae16-4705-a9fa-c402c1cfc428@github.com>
References: <_KN_oPLsWd_rAYthnE5oURoslpkLBfLJQIQwuMZFU-o=.11efbb12-ae16-4705-a9fa-c402c1cfc428@github.com>
Message-ID:

On Tue, 9 Jan 2024 13:33:49 GMT, Christoph Langer wrote:

> Hi all,
>
> This pull request contains a backport of [JDK-8322163](https://bugs.openjdk.org/browse/JDK-8322163), commit [12308533](https://github.com/openjdk/jdk/commit/1230853343c38787c90820d19d0626f0c37540dc) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository.
>
> The commit being backported was authored by Matthias Baesken on 22 Dec 2023 and was reviewed by Martin Doerr and Christoph Langer.
>
> The bug is P3 and hence appropriate for RDP1. It quiesces a test error that we see regularly on Alpine.
>
> Thanks!

Add a comment to update this pr and make skara bot evaluate this pr again.
-------------

PR Comment: https://git.openjdk.org/jdk22/pull/43#issuecomment-1883511302

From clanger at openjdk.org Tue Jan 9 19:32:36 2024
From: clanger at openjdk.org (Christoph Langer)
Date: Tue, 9 Jan 2024 19:32:36 GMT
Subject: [jdk22] RFR: 8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 [v2]
In-Reply-To: <_KN_oPLsWd_rAYthnE5oURoslpkLBfLJQIQwuMZFU-o=.11efbb12-ae16-4705-a9fa-c402c1cfc428@github.com>
References: <_KN_oPLsWd_rAYthnE5oURoslpkLBfLJQIQwuMZFU-o=.11efbb12-ae16-4705-a9fa-c402c1cfc428@github.com>
Message-ID:

> Hi all,
>
> This pull request contains a backport of [JDK-8322163](https://bugs.openjdk.org/browse/JDK-8322163), commit [12308533](https://github.com/openjdk/jdk/commit/1230853343c38787c90820d19d0626f0c37540dc) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository.
>
> The commit being backported was authored by Matthias Baesken on 22 Dec 2023 and was reviewed by Martin Doerr and Christoph Langer.
>
> The bug is P3 and hence appropriate for RDP1. It quiesces a test error that we see regularly on Alpine.
>
> Thanks!

Christoph Langer has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  Backport 1230853343c38787c90820d19d0626f0c37540dc

-------------

Changes:
 - all: https://git.openjdk.org/jdk22/pull/43/files
 - new: https://git.openjdk.org/jdk22/pull/43/files/d2a6decc..3c67a25d

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk22&pr=43&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk22&pr=43&range=00-01

Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod
Patch: https://git.openjdk.org/jdk22/pull/43.diff
Fetch: git fetch https://git.openjdk.org/jdk22.git pull/43/head:pull/43

PR: https://git.openjdk.org/jdk22/pull/43

From dcubed at openjdk.org Tue Jan 9 21:55:03 2024
From: dcubed at openjdk.org (Daniel D.
Daugherty) Date: Tue, 9 Jan 2024 21:55:03 GMT Subject: RFR: 8319773: Avoid inflating monitors when installing hash codes for LM_LIGHTWEIGHT [v11] In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 12:30:49 GMT, Axel Boldt-Christmas wrote: >> LM_LIGHTWEIGHT only uses the lock bits for its locking. This leaves the hashCode bits free when a monitor is not inflated. So instead of inflating when installing the hashCode on a fast locked object it can simply use the hashCode bits in the markWord. >> >> The mark word transitions Unlocked (0b01) <=> Locked (0b00) are done by retrying the CAS if it fails due to non-lock bit changes. >> The mark word transitions Monitor (0b10) <=> Locked/Unlocked (0b0X) are the same as before, inflation already handles hash codes. This change does not interact with the mark word if it is in a Monitor (0b10) state, so the strong CAS which is used for deflation are still valid, and will not fail to any other reason than the cooperative race to help transition the mark word during deflation. >> >> This is dependent on JDK-8319778 simply because JDK-8319797 is dependent on both this and JDK-8319778. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 17 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/pr/16602' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319778 > - Fix copy paste typo. 
> - Update src/hotspot/share/opto/library_call.cpp > > Co-authored-by: Tobias Hartmann > - Add retry CAS comment > - Use is_neutral over is_unlocked > - Merge remote-tracking branch 'upstream_jdk/pr/16602' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319778 > - ... and 7 more: https://git.openjdk.org/jdk/compare/1afccd81...1b907f90 Marked as reviewed by dcubed (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16603#pullrequestreview-1812074690 From dcubed at openjdk.org Tue Jan 9 21:55:03 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 9 Jan 2024 21:55:03 GMT Subject: RFR: 8319773: Avoid inflating monitors when installing hash codes for LM_LIGHTWEIGHT [v11] In-Reply-To: References: <2MRTHFoYSaSW2NH922LOEvqKx4NLjshWaHJaYV2RdVY=.e234046a-aac8-4d7b-81b9-269506944165@github.com> Message-ID: On Mon, 4 Dec 2023 09:49:32 GMT, Axel Boldt-Christmas wrote: >> I don't think the race with deflation is limited to LM_LIGHTWEIGHT. The inflation >> code below detects when there is a collision with async deflation and retries >> which can lead to a re-inflation when we loop around again. We can reach the >> code below with LM_LEGACY, LM_LIGHTWEIGHT, or LM_MONITOR so I don't >> think you need the LM_LIGHTWEIGHT specific comment. >> >> Yes, we can reach this point in the code when `mark.has_monitor() == true` and >> not just when `LockingMode == LM_LIGHTWEIGHT`, but the `inflate()` function >> already has to handle that race (and it does). When a Java monitor is lightweight >> locked or stack-locked, there can be more than one contending thread and each >> of those threads will attempt to `inflate()` the Java monitor into an ObjectMonitor. >> Only one thread can win the inflation race and all of the racers trust `inflate()` >> to do the right thing. What's the "right thing"? One of the callers to `inflate()` will >> install the ObjectMonitor successfully and return it to that caller. 
All of the other >> callers to `inflate()` will detect that they lost the race and return the winner's >> ObjectMonitor to their callers. >> >> There's no reason for the logic to skip the call to `inflate()` because races are >> already handled by `inflate()`. >> >> We got into this spiraling thread because we were trying to figure out if a >> non-JavaThread could call `inflate()` because `inflate()` can call `is_lock_owned()` >> which has a header comment which talks about non-JavaThreads... >> >> I believe that is possible with JVM/TI tagging even when we are in >> LM_LIGHTWEIGHT mode because a lightweight monitor can be inflated >> by a contending thread which can cause the ObjectMonitor to have an >> anonymous owner. In that case, this if-statement in `inflate()` can execute: >> >> if (LockingMode == LM_LIGHTWEIGHT && inf->is_owner_anonymous() && is_lock_owned(current, object)) { >> inf->set_owner_from_anonymous(current); >> JavaThread::cast(current)->lock_stack().remove(object); >> } >> >> Of course, if our caller is the VMThread, `is_lock_owned()` will return >> false so we won't execute the if-statement's code block. > > There might be some confusion about what I am asking for here. > > This enhancement is to avoid inflating monitors when installing hash codes on objects with LM_LIGHTWEIGHT. The current state of the PR does this except for when it is racing with deflation. It is very possible to avoid inflating for the race as well. The question is not whether the race is handled, rather that it could be handled in such a way that installing a hash code would never cause monitor inflation. > > My question in this thread is whether we should handle this case. > > As already stated my opinion is let the race be handled by inflating and accept that we get some occasional `InflateCause::inflate_cause_hash_code` even with LM_LIGHTWEIGHT. But I do believe that there should be a comment about this. 
> > And if the consensus is to instead handle the race by retrying (and thus avoiding inflation completely), then we should split out the lightweight FastHashCode into its own loop. > >> We got into this spiraling thread because we were trying to figure out if a >> non-JavaThread could call `inflate()` because `inflate()` can call `is_lock_owned()` >> which has a header comment which talks about non-JavaThreads... > > I think it ultimately is because this enhancement claims to avoid inflating monitors, so why would `is_lock_owned()` be needed, but it is not the case as the current implementation does not handle the potential race with deflation. > > I wanted to add a comment to make it clear that this is known and intentional. Okay I've re-read this group of comments multiple times and above you wrote: > Regardless if we were to just go with it as it is now there should probably be a comment here along the line: > // With LM_LIGHTWEIGHT FastHashCode may race with deflation here and cause a monitor to be re-inflated. To fully understand whether the comment is correct or not, I need to understand where the "here" is that you want to place this comment. You deleted old line 972 thru old line 978. Is this new comment going to be a replacement for those lines? Or will the new comment be somewhere else? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16603#discussion_r1446653178 From dcubed at openjdk.org Tue Jan 9 21:57:27 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 9 Jan 2024 21:57:27 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v7] In-Reply-To: References: Message-ID: <3JssZHNxECXhf6OX1Os0AVfo27KkkkQA9y8aGdCajhk=.dcd127a1-8ded-4c3e-8381-c6967630d490@github.com> On Mon, 4 Dec 2023 12:40:54 GMT, Axel Boldt-Christmas wrote: >> Implements the x86 port of JDK-8319796. >> >> There are two major parts for the port implementation. 
The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. >> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. >> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The x86 C2 port also has some extra oddities. >> >> The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. >> >> The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. >> >> The contended unlock was also moved to the code stub. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - top load adjustments > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Fix type > - Move inflated check in fast_locked > - Move top load > - 8319799: Recursive lightweight locking: x86 implementation > - Cleanup: C2 fast_lock/fast_unlock x86 The last merge was on 2023.12.04 so I'll review again after this PR is merged with newer jdk/jdk baseline bits. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16607#issuecomment-1883859601 From kbarrett at openjdk.org Tue Jan 9 22:00:24 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Jan 2024 22:00:24 GMT Subject: RFR: 8323297: Fix incorrect placement of precompiled.hpp include lines In-Reply-To: References: Message-ID: <1qZ03utFU_mLIYTfzlQeXUkcWEwv8d99Q2NPhzEvQzE=.e745b646-65a1-4c36-9594-f3a6a24a3022@github.com> On Tue, 9 Jan 2024 14:55:45 GMT, Stefan Karlsson wrote: > There are a few files that have include lines before the precompiled.hpp include line. I propose that we fix this. > > Testing: I'll let this run through GHA and Oracle's tier1 to see that this still compiles. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17326#pullrequestreview-1812083467 From amenkov at openjdk.org Tue Jan 9 22:28:21 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 9 Jan 2024 22:28:21 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field In-Reply-To: References: Message-ID: On Wed, 13 Dec 2023 21:32:50 GMT, Alex Menkov wrote: > FieldStream/FilteredFieldStream classes from reflectionUtils.hpp iterate class fields in the reverse order and use field indexes to access instead of forward iteration. This is performance ineffective (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). > The change introduces new class FilteredJavaFieldStream as a replacement for FilteredFieldStream. > It uses the same FilteredField/FilteredFieldsMap stuff as FilteredJavaFieldStream does. > > FieldStream/FilteredFieldStream are still used by heap walking API, will be cleaned by [JDK-8317636](https://bugs.openjdk.org/browse/JDK-8317636) > > Testing: > - tier1..3 > - all tests which calls GetClassFields: open/test/hotspot/jtreg/serviceability/jvmti,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/IterateThroughHeap,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/IsSynthetic > including > - test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007.java - tests that GetClassFields returns fields in correct order; > - test/hotspot/jtreg/serviceability/jvmti/GetClassFields/FilteredFields/FilteredFieldsTest.java - test that GetClassFields filters out field like reflection. Ping. Could I get review please. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17094#issuecomment-1883898569 From kbarrett at openjdk.org Tue Jan 9 22:30:14 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Jan 2024 22:30:14 GMT Subject: RFR: 8322880: Eliminate -Wparentheses warnings in arm32 code [v2] In-Reply-To: References: Message-ID: > Please review this change to eliminate some -Wparentheses warnings. In most > cases, this involved simply adding a few parentheses to make some implicit > operator precedence explicit. Exceptions are: > > In the clear_array instruct, removed extraneous parens in a declaration: > `Label(loop);` => `Label loop;` > > In NativeMovConstReg::set_data, changed `&` => `&&`. This is conceptually a > bug fix, but the old code "accidentally" worked. > > Testing: Local (linux-x64) cross-build for linux-arm32. Also ran GHA with > -Wparentheses enabled along with this and other changes needed to make that > work. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: - Merge branch 'master' into arm32-wparentheses - Fix -Wparentheses warnings in arm32 code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17300/files - new: https://git.openjdk.org/jdk/pull/17300/files/d939f43b..00c21d10 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17300&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17300&range=00-01 Stats: 12461 lines in 152 files changed: 9585 ins; 1562 del; 1314 mod Patch: https://git.openjdk.org/jdk/pull/17300.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17300/head:pull/17300 PR: https://git.openjdk.org/jdk/pull/17300 From kbarrett at openjdk.org Tue Jan 9 22:30:15 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Jan 2024 22:30:15 GMT Subject: RFR: 8322880: Eliminate -Wparentheses warnings in arm32 code [v2] In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 08:21:03 GMT, David Holmes wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: >> >> - Merge branch 'master' into arm32-wparentheses >> - Fix -Wparentheses warnings in arm32 code > > Looks good. Thanks Thanks for reviews, @dholmes-ora and @shipilev . ------------- PR Comment: https://git.openjdk.org/jdk/pull/17300#issuecomment-1883898609 From kbarrett at openjdk.org Tue Jan 9 22:30:18 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Jan 2024 22:30:18 GMT Subject: Integrated: 8322880: Eliminate -Wparentheses warnings in arm32 code In-Reply-To: References: Message-ID: <3sYYwmWwO5vFhK1MO9D4eBlrgBGQZ6WugatJKnndyW4=.9e87ad97-53a0-4a09-9f48-17897aa9bf8e@github.com> On Mon, 8 Jan 2024 09:29:38 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. 
In most > cases, this involved simply adding a few parentheses to make some implicit > operator precedence explicit. Exceptions are: > > In the clear_array instruct, removed extraneous parens in a declaration: > `Label(loop);` => `Label loop;` > > In NativeMovConstReg::set_data, changed `&` => `&&`. This is conceptually a > bug fix, but the old code "accidentally" worked. > > Testing: Local (linux-x64) cross-build for linux-arm32. Also ran GHA with > -Wparentheses enabled along with this and other changes needed to make that > work. This pull request has now been integrated. Changeset: e9f7db30 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/e9f7db304559cbc8e2b46ea30496d3c570569f4c Stats: 13 lines in 6 files changed: 0 ins; 0 del; 13 mod 8322880: Eliminate -Wparentheses warnings in arm32 code Reviewed-by: shade, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/17300 From dcubed at openjdk.org Tue Jan 9 22:36:02 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 9 Jan 2024 22:36:02 GMT Subject: RFR: 8319773: Avoid inflating monitors when installing hash codes for LM_LIGHTWEIGHT [v11] In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 12:30:49 GMT, Axel Boldt-Christmas wrote: >> LM_LIGHTWEIGHT only uses the lock bits for its locking. This leaves the hashCode bits free when a monitor is not inflated. So instead of inflating when installing the hashCode on a fast locked object it can simply use the hashCode bits in the markWord. >> >> The mark word transitions Unlocked (0b01) <=> Locked (0b00) are done by retrying the CAS if it fails due to non-lock bit changes. >> The mark word transitions Monitor (0b10) <=> Locked/Unlocked (0b0X) are the same as before, inflation already handles hash codes. 
This change does not interact with the mark word if it is in a Monitor (0b10) state, so the strong CAS which is used for deflation are still valid, and will not fail to any other reason than the cooperative race to help transition the mark word during deflation. >> >> This is dependent on JDK-8319778 simply because JDK-8319797 is dependent on both this and JDK-8319778. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 17 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/pr/16602' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319778 > - Fix copy paste typo. > - Update src/hotspot/share/opto/library_call.cpp > > Co-authored-by: Tobias Hartmann > - Add retry CAS comment > - Use is_neutral over is_unlocked > - Merge remote-tracking branch 'upstream_jdk/pr/16602' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319778 > - ... and 7 more: https://git.openjdk.org/jdk/compare/9672dd84...1b907f90 The last merge was on 2023.12.04 so I'll review again after this PR is merged with newer jdk/jdk baseline bits. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16603#issuecomment-1883906253 From dcubed at openjdk.org Tue Jan 9 22:38:26 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 9 Jan 2024 22:38:26 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v9] In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 12:35:56 GMT, Axel Boldt-Christmas wrote: >> Implements the runtime part of JDK-8319796. >> The different CPU implementations are/will be created as dependent pull requests. 
>>
>> This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enter. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two topmost entries on the lock stack to determine what to do in a monitor exit.
>>
>> A high level overview:
>> * Locking is still performed on the mark word
>>   * Unlocked (0b01) <=> Locked (0b00)
>> * Monitor enter on Obj with mark word Unlocked (0b01) is the same
>>   * Transition Obj's mark word Unlocked (0b01) => Locked (0b00)
>>   * Push Obj onto the lock stack
>>   * Success
>> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack
>>   * If top entry is Obj
>>     * Push Obj on the lock stack
>>     * Success
>>   * If top entry is not Obj
>>     * Inflate and call ObjectMonitor::enter
>> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack
>>   * If just the top entry is Obj
>>     * Transition Obj's mark word Locked (0b00) => Unlocked (0b01)
>>     * Pop the entry
>>     * Success
>>   * If both entries are Obj
>>     * Pop the top entry
>>     * Success
>>   * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit
>> * If the monitor has been inflated for object Obj which is owned by the current thread
>>   * All corresponding entries for Obj are removed from the lock stack
>>   * The monitor recursions is set to the number of removed entries - 1
>>   * The owner is changed from anonymous to the thread
>>   * The regular ObjectMonitor::action is called.
>
> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase.
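[Editorial sketch] The enter/exit rules in the quoted overview can be modelled in a few lines of C++. This is only a single-threaded toy model under stated assumptions, not the HotSpot implementation: `Obj`, `Mark` and `LockStack` are hypothetical names, the mark word is reduced to three states, and the inflated monitor is reduced to an `owned` flag plus a recursion count.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Mark-word lock states named in the overview (bit patterns in comments).
enum class Mark { Unlocked /* 0b01 */, Locked /* 0b00 */, Monitor /* 0b10 */ };

// Hypothetical stand-in for an object header plus its inflated monitor.
struct Obj {
  Mark mark = Mark::Unlocked;
  int recursions = 0;  // monitor recursion count, meaningful once inflated
  bool owned = false;  // inflated monitor currently held by our one thread
};

// Toy model of one thread's lock stack (assumption: no other threads exist).
struct LockStack {
  std::vector<Obj*> entries;

  // Inflate: remove every entry for o and carry the count into the monitor.
  void inflate(Obj* o) {
    const int removed =
        static_cast<int>(std::count(entries.begin(), entries.end(), o));
    assert(removed >= 1);  // in this model a Locked object is always ours
    entries.erase(std::remove(entries.begin(), entries.end(), o), entries.end());
    o->mark = Mark::Monitor;
    o->owned = true;
    o->recursions = removed - 1;
  }

  void enter(Obj* o) {
    if (o->mark == Mark::Unlocked) {  // Unlocked => Locked, push
      o->mark = Mark::Locked;
      entries.push_back(o);
      return;
    }
    if (o->mark == Mark::Locked) {
      if (!entries.empty() && entries.back() == o) {  // consecutive recursion
        entries.push_back(o);
        return;
      }
      inflate(o);  // top entry is not o
    }
    // Stand-in for ObjectMonitor::enter.
    if (o->owned) { ++o->recursions; } else { o->owned = true; }
  }

  void exit(Obj* o) {
    if (o->mark == Mark::Locked) {
      const std::size_t n = entries.size();
      if (n >= 1 && entries[n - 1] == o) {
        if (n >= 2 && entries[n - 2] == o) {  // both top entries are o
          entries.pop_back();
          return;
        }
        o->mark = Mark::Unlocked;  // just the top entry is o
        entries.pop_back();
        return;
      }
      inflate(o);  // unstructured locking
    }
    // Stand-in for ObjectMonitor::exit.
    assert(o->mark == Mark::Monitor && o->owned);
    if (o->recursions > 0) { --o->recursions; } else { o->owned = false; }
  }
};
```

In this model, entering the same object twice in a row leaves two entries on the lock stack and the mark word in the Locked state, while entering A, then B, then A again forces A to be inflated because A is no longer the top entry.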
The pull request now contains 12 commits: > > - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 > - Avoid copy from and to the same location > - Fix typo > - Update unstructured unlock comment > - Fix bad indent after merge > - Remove whitespace > - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 > - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 > - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 > - Fix nit > - ... and 2 more: https://git.openjdk.org/jdk/compare/1b907f90...56b04f58 I re-reviewed the v08/2023.12.04 version. The last merge was on 2023.12.04 so I'll review again after this PR is merged with newer jdk/jdk baseline bits. ------------- Marked as reviewed by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16606#pullrequestreview-1812132931 PR Comment: https://git.openjdk.org/jdk/pull/16606#issuecomment-1883908327 From kbarrett at openjdk.org Tue Jan 9 22:50:50 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Jan 2024 22:50:50 GMT Subject: RFR: 8322817: RISC-V: Eliminate -Wparentheses warnings in riscv code [v3] In-Reply-To: References: Message-ID: <84o5NTjOVPsOnq0Cy7imG1lS-GFsnwt558iLkeypTFo=.d4ee72d8-1ec7-49e0-a4f3-89bc6ccf20dd@github.com> > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: Local (linux-x64) cross-build for linux-riscv with this change plus > -Wparentheses enabled and other changes to allow that to work. > > Requesting someone from the riscv porters to properly test this. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last revision: - Merge branch 'master' into riscv-wparentheses - Merge branch 'master' into riscv-wparentheses - simplify frame::equal assert - fix -Wparentheses warnings in riscv code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17216/files - new: https://git.openjdk.org/jdk/pull/17216/files/aefd232c..86125124 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17216&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17216&range=01-02 Stats: 13123 lines in 185 files changed: 10164 ins; 1583 del; 1376 mod Patch: https://git.openjdk.org/jdk/pull/17216.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17216/head:pull/17216 PR: https://git.openjdk.org/jdk/pull/17216 From kbarrett at openjdk.org Tue Jan 9 22:50:52 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Jan 2024 22:50:52 GMT Subject: RFR: 8322817: RISC-V: Eliminate -Wparentheses warnings in riscv code [v3] In-Reply-To: <5zFH1FuOfJC2VUtDtmKASW47r2479BvGyODj8c1ntF4=.c0922330-66c9-44aa-b9fc-b0bf60da8657@github.com> References: <5zFH1FuOfJC2VUtDtmKASW47r2479BvGyODj8c1ntF4=.c0922330-66c9-44aa-b9fc-b0bf60da8657@github.com> Message-ID: On Wed, 3 Jan 2024 17:14:25 GMT, Ludovic Henry wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - Merge branch 'master' into riscv-wparentheses >> - Merge branch 'master' into riscv-wparentheses >> - simplify frame::equal assert >> - fix -Wparentheses warnings in riscv code > > Marked as reviewed by luhenry (Committer). Thanks for reviews, @luhenry and @RealFYang . Thanks for testing, @zifeihan . 
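[Editorial sketch] For readers unfamiliar with the warning class discussed in this thread, the following C++ fragment illustrates why making operator precedence explicit matters. It is a generic illustration with hypothetical function names, not code from the riscv change. Both "as parsed" forms are written with explicit parentheses so the example itself compiles without warnings.

```cpp
// What `x & mask == 0` means to the compiler: `==` binds tighter than `&`,
// so the expression is `x & (mask == 0)`.
constexpr bool is_clear_as_parsed(unsigned x, unsigned mask) {
  return (x & static_cast<unsigned>(mask == 0)) != 0;
}

// What an author writing `x & mask == 0` most likely intended.
constexpr bool is_clear_as_intended(unsigned x, unsigned mask) {
  return (x & mask) == 0;
}

// `a || b && c` already means `a || (b && c)`; the parentheses change
// nothing for the compiler and only document the grouping for the reader.
constexpr bool any_explicit(bool a, bool b, bool c) {
  return a || (b && c);
}

// The other possible grouping, which gives a different answer.
constexpr bool any_other_grouping(bool a, bool b, bool c) {
  return (a || b) && c;
}

// Compile-time checks showing that the groupings really differ.
static_assert(!is_clear_as_parsed(2u, 1u), "parsed form tests the wrong thing");
static_assert(is_clear_as_intended(2u, 1u), "bit 0 of 2 is clear");
static_assert(any_explicit(true, false, false), "a alone is enough");
static_assert(!any_other_grouping(true, false, false), "c is required here");
```

The `||`/`&&` case is the harmless kind, where added parentheses keep the meaning and only silence the warning. The `&`/`==` case is the kind where the warning can point at a real mistake.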
------------- PR Comment: https://git.openjdk.org/jdk/pull/17216#issuecomment-1883918429 From kbarrett at openjdk.org Tue Jan 9 22:50:52 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Jan 2024 22:50:52 GMT Subject: Integrated: 8322817: RISC-V: Eliminate -Wparentheses warnings in riscv code In-Reply-To: References: Message-ID: On Tue, 2 Jan 2024 07:55:59 GMT, Kim Barrett wrote: > Please review this change to eliminate some -Wparentheses warnings. This > involved simply adding a few parentheses to make some implicit operator > precedence explicit. > > Testing: Local (linux-x64) cross-build for linux-riscv with this change plus > -Wparentheses enabled and other changes to allow that to work. > > Requesting someone from the riscv porters to properly test this. This pull request has now been integrated. Changeset: a5071e01 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/a5071e010be8c79f1a3cd96f7325d04bac8f7ae0 Stats: 9 lines in 2 files changed: 0 ins; 0 del; 9 mod 8322817: RISC-V: Eliminate -Wparentheses warnings in riscv code Reviewed-by: fyang, luhenry ------------- PR: https://git.openjdk.org/jdk/pull/17216 From sspitsyn at openjdk.org Tue Jan 9 23:20:21 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 9 Jan 2024 23:20:21 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field In-Reply-To: References: Message-ID: On Wed, 13 Dec 2023 21:32:50 GMT, Alex Menkov wrote: > FieldStream/FilteredFieldStream classes from reflectionUtils.hpp iterate class fields in the reverse order and use field indexes to access instead of forward iteration. This is performance ineffective (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). > The change introduces new class FilteredJavaFieldStream as a replacement for FilteredFieldStream. > It uses the same FilteredField/FilteredFieldsMap stuff as FilteredJavaFieldStream does. 
> > FieldStream/FilteredFieldStream are still used by heap walking API, will be cleaned by [JDK-8317636](https://bugs.openjdk.org/browse/JDK-8317636) > > Testing: > - tier1..3 > - all tests which calls GetClassFields: open/test/hotspot/jtreg/serviceability/jvmti,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/IterateThroughHeap,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/IsSynthetic > including > - test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007.java - tests that GetClassFields returns fields in correct order; > - test/hotspot/jtreg/serviceability/jvmti/GetClassFields/FilteredFields/FilteredFieldsTest.java - test that GetClassFields filters out field like reflection. Looks good! The copyright headers need to be updated. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17094#pullrequestreview-1812171240 From sspitsyn at openjdk.org Tue Jan 9 23:29:22 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 9 Jan 2024 23:29:22 GMT Subject: [jdk22] RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 21:28:04 GMT, Serguei Spitsyn wrote: > Hi all, > > This pull request contains a backport of commit [0f8e4e0a](https://github.com/openjdk/jdk/commit/0f8e4e0a81257c678e948c341a241dc0b810494f) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Serguei Spitsyn on 19 Dec 2023 and was reviewed by Leonid Mesnik and Alan Bateman. > > Thanks! Ping! Need, at list, one review for this 22 backport, please. 
------------- PR Comment: https://git.openjdk.org/jdk22/pull/23#issuecomment-1883954123 From sspitsyn at openjdk.org Tue Jan 9 23:53:30 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 9 Jan 2024 23:53:30 GMT Subject: RFR: 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf Message-ID: This fix adds a ResourceMark missing in the `SetFramePopClosure::do_thread` and `SetFramePopClosure::do_vthread`. Testing: - TBD to submit tiers: 1-5 ------------- Commit messages: - 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf Changes: https://git.openjdk.org/jdk/pull/17332/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17332&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321685 Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17332.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17332/head:pull/17332 PR: https://git.openjdk.org/jdk/pull/17332 From amenkov at openjdk.org Wed Jan 10 02:06:19 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 10 Jan 2024 02:06:19 GMT Subject: RFR: 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 23:48:35 GMT, Serguei Spitsyn wrote: > This fix adds a ResourceMark missing in the `SetFramePopClosure::do_thread` and `SetFramePopClosure::do_vthread`. > > Testing: > - TBD to submit tiers: 1-5 Marked as reviewed by amenkov (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/17332#pullrequestreview-1812295888 From duke at openjdk.org Wed Jan 10 02:10:33 2024 From: duke at openjdk.org (Liming Liu) Date: Wed, 10 Jan 2024 02:10:33 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v17] In-Reply-To: References: <-yGcrNxBa91rrdyLb4zNbgz_VRuht7MXBpnel_-WWxg=.6eec01fb-03e7-42d4-b07c-d5617f34bdc2@github.com> Message-ID: <-rGEUXeNbcyvvUCSjHzjRdeZ3CN2-1wLJNIr5pPWeCI=.5c943f4d-de1a-4988-8660-2508a1fef843@github.com> On Tue, 9 Jan 2024 10:26:19 GMT, Patrick Zhang wrote: >> [Sorry, I lost track of this and didn't respond to the earlier comment from >> @jdksjolen.] >> >> Yes, that's correct. The reason for adding the safe for concurrent use >> pretouch mechanism was https://bugs.openjdk.org/browse/JDK-8260332. >> >> The idea is that presently, when a thread needs to expand the oldgen, it >> pretouches while holding the expansion lock. Any other threads that also need >> need the oldgen to be expanded have to wait until the holder of that lock >> completes. Most of the work involved in expansion is quick and short, but not >> so much for pretouching. So it was found that we're sometimes blocking a >> bunch of threads for a long-ish time. >> >> The original proposal there was to allow the otherwise waiting threads to >> cooperate in the pretouch. But the protocol involved was complicated and >> messy. A simpler approach was suggested; allow other threads to use the newly >> expanded memory concurrently with the expanding thread doing the pretouch. >> There's obviously some racing there, with the using threads possibly touching >> pages before the pretouching reaches them, but the thinking is that the >> pretouched wave-front will likely surge ahead of the using threads. And if >> not, then the using threads are effectively cooperating in the "pretouch". >> >> That approach needed https://bugs.openjdk.org/browse/JDK-8272807 as a building >> block. 
>> >> But I discovered there were a bunch of places with similar problems, >> suggesting the need for some more general mechanism. I did a bit of >> prototyping in that direction, but got distracted by other work and haven't >> gotten back to it. (The idea is to record needed pretouching, deferring it up >> the call chain, to a point where other threads are not being blocked waiting >> for the expansion operation. A complicating factor is that some of those >> places may have multiple distinct memory ranges being allocated and needing >> pretouch, all within the same expansion operation.) >> >> But that approach may interact poorly with the madvise approach. It might be >> that the madvise _should_ be done down inside the expansion operation where >> the pretouches currently happen, rather than being deferred up the call chain >> and permitting the madvise to be concurrent with using threads that might >> introduce the same "shredding" problem the madvise is attempting to fix. That >> would be yet another complicating factor that my prototyping didn't address at >> all. > > @limingliu-ampere 's original test was with JVM flags like: `-Xmx24g -Xms24g -Xmn22g -XX:+UseParallelGC -XX:+AlwaysPreTouch -XX:+TransparentHugePages -XX:-UseAdaptiveSizePolicy` etc. Having `-XX:-UseAdaptiveSizePolicy` ensures that `heap->resize_old_gen(size_policy->calculated_old_free_size_in_bytes());` inside `PSParallelCompact::invoke_no_policy` will not be called in this test (see [psParallelCompact.cpp#L1855](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/parallel/psParallelCompact.cpp#L1855)), then it would **not** run into the concern of allocate/expand/pretouch cooperating case on the oldgen, mentioned above by @kimbarrett. > > With regards to `that approach may interact poorly with the madvise approach`, all pretouching triggered by `PSOldGen::expand(size_t bytes)` are currently wrapped by `MutexLocker x(PSOldGenExpand_lock)`. 
From this viewpoint, the _madvise_ approach does the pretouching work at the same situation as the original _atomic-add-0_ approach. The proposed patch does not make the potential "shredding" problem on the expansion of oldgen worse. Furthermore, back to the table @limingliu-ampere attached at the initial part of this PR, on Kernel 6.1, with `-XX:+TransparentHugePages`, the _madvise_ approach speeds up the pretouching operation from _atomic-add-0_'s **3.54s** to **0.33s**, which can be an obvious optimization in a manner. > > The initial purpose of this patch was to solve an outstanding performance issue on some commercial benchmarks, especially when running with huge heaps, for example, `-Xms200g`, or `-Xms400g`, together with `-XX:+UseParallelGC -XX:+AlwaysPreTouch -XX:+TransparentHugePages -XX:-UseAdaptiveSizePolicy`. The performance regression got well resolved by the patch, and the improvement vs baseline was up to 30%. > > All in all, I think this 4-month-old PR is a positive change, and solved a practical performance problem effectively. > > @limingliu-ampere will soon have a update to answer @jdksjolen 's question: `2 threads, where one thread is running os::pretouch_memory and another using the memory for something`. The testcase had been changed to use one thread writing some integers to memory and seven threads pretouching memory. And it expects that the contents are preserved after concurrent use and pretouch. The testcase had passed on both linux-x64 and linux-x86. Please check it, thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1446806236 From dholmes at openjdk.org Wed Jan 10 02:13:54 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 10 Jan 2024 02:13:54 GMT Subject: RFR: 8323243: JNI invocation of an abstract instance method corrupts the stack Message-ID: See JBS for details. TL;DR - if an instance method is abstract then JNI front-end should throw AbstractMethodError. 
Testing:
- new regression test
- tiers 1-3 (sanity)

Thanks.

-------------

Commit messages:
 - 8323243: JNI invocation of an abstract instance method corrupts the stack

Changes: https://git.openjdk.org/jdk/pull/17337/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17337&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8323243
  Stats: 151 lines in 4 files changed: 151 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/17337.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17337/head:pull/17337

PR: https://git.openjdk.org/jdk/pull/17337
From cjplummer at openjdk.org Wed Jan 10 02:37:23 2024
From: cjplummer at openjdk.org (Chris Plummer)
Date: Wed, 10 Jan 2024 02:37:23 GMT
Subject: RFR: 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf
In-Reply-To: References: Message-ID:

On Tue, 9 Jan 2024 23:48:35 GMT, Serguei Spitsyn wrote:

> This fix adds a ResourceMark missing in the `SetFramePopClosure::do_thread` and `SetFramePopClosure::do_vthread`.
>
> Testing:
> - TBD to submit tiers: 1-5

Looks good.

Note, it would be good if the jvmti code added comments explaining why a ResourceMark is needed. For these two cases I had to search around until I stumbled across the following in some unrelated code:

    // vframes are resource allocated
    Thread* current_thread = Thread::current();
    ResourceMark rm(current_thread);

This looks very much like the code you are updating, except the updated code is lacking the comment.

-------------

Marked as reviewed by cjplummer (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/17332#pullrequestreview-1812314880 From amenkov at openjdk.org Wed Jan 10 02:39:33 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 10 Jan 2024 02:39:33 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field [v2] In-Reply-To: References: Message-ID: > FieldStream/FilteredFieldStream classes from reflectionUtils.hpp iterate class fields in the reverse order and use field indexes to access instead of forward iteration. This is performance ineffective (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). > The change introduces new class FilteredJavaFieldStream as a replacement for FilteredFieldStream. > It uses the same FilteredField/FilteredFieldsMap stuff as FilteredJavaFieldStream does. > > FieldStream/FilteredFieldStream are still used by heap walking API, will be cleaned by [JDK-8317636](https://bugs.openjdk.org/browse/JDK-8317636) > > Testing: > - tier1..3 > - all tests which calls GetClassFields: open/test/hotspot/jtreg/serviceability/jvmti,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/IterateThroughHeap,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/IsSynthetic > including > - test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007.java - tests that GetClassFields returns fields in correct order; > - test/hotspot/jtreg/serviceability/jvmti/GetClassFields/FilteredFields/FilteredFieldsTest.java - test that GetClassFields filters out field like reflection. 
Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: copyright headers ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17094/files - new: https://git.openjdk.org/jdk/pull/17094/files/7a790887..76df939f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17094&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17094&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17094.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17094/head:pull/17094 PR: https://git.openjdk.org/jdk/pull/17094 From fyang at openjdk.org Wed Jan 10 03:05:27 2024 From: fyang at openjdk.org (Fei Yang) Date: Wed, 10 Jan 2024 03:05:27 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v3] In-Reply-To: <2LzKv6TzZ3ZJDuLOm1GpNcgoCCfZgOqEOtWDNRQs7O0=.2ced11c4-bba6-4e15-bfa1-f0ca06d53610@github.com> References: <2LzKv6TzZ3ZJDuLOm1GpNcgoCCfZgOqEOtWDNRQs7O0=.2ced11c4-bba6-4e15-bfa1-f0ca06d53610@github.com> Message-ID: On Tue, 19 Dec 2023 17:54:51 GMT, Hamlin Li wrote: >> Hi, >> Can you review this patch to implement SHA-1 intrinsic for riscv? >> Thanks! >> >> >> ## Test >> >> ### Functionality >> >> tests under `test/hotspot/jtreg/compiler/intrinsics/sha` >> tests found via `find test/jdk -iname "*SHA1*.java"` >> >> ### Performance >> >> tested on `T-HEAD Light Lichee Pi 4A` >> >> benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`. 
>>
>> **perf data summary**
>>
>> tests | intrinsic (ns/op) | base (ns/op) | speed up (times)
>> o.o.b.java.security.MessageDigests.digest (64) | 3454.207 | 12026.787 | 3.48
>> o.o.b.java.security.MessageDigests.digest (16384) | 184063.834 | 1307913.534 | 7.11
>> o.o.b.java.security.MessageDigests.getAndDigest (64) | 8260.011 | 17707.156 | 2.14
>> o.o.b.java.security.MessageDigests.getAndDigest (16384) | 191325.246 | 1379660.864 | 7.21
>> o.o.b.javax.crypto.full.MacBench.mac (128) | 8220.886 | 34101.577 | 4.15
>> o.o.b.javax.crypto.full.MacBench.mac (1024) | 18006.955 | 107906.128 | 5.99
>> o.o.b.javax.crypto.small.MessageDigestBench.digest | 11688843.558 | 82834313.280 | 7.09
>>
>> **raw perf data - when intrinsic is enabled**
>>
>> o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 489.860 ± 6.277 ns/op
>> o.o.b.java.security.GetMessageDigest.getInstance N/A N/A SHA-1 N/A N/A avgt 10 3477.197 ± 204.203 ns/op
>> o.o.b.java.security.GetMessageDigest.getInstanceWithProvider N/A N/A SHA-1 N/A N/A avgt 10 4111.164 ± 108.861 ns/op
>> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 10 3454.207 ± 53.924 ns/op
>> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 10 184063.834 ± 677.635 ns/op
>> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 10 8260.011 ± ...
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>
> round 1 review

Simply ran `micro:java.security.MessageDigests` JMH on my Lichee-pi-4a board, seems there is a small regression for the `MessageDigests.getAndDigest` (length = 64) case:

Before:
MessageDigests.digest SHA-1 64 DEFAULT thrpt 15 417.311 ± 2.686 ops/ms
MessageDigests.digest SHA-1 16384 DEFAULT thrpt 15 5.206 ± 0.008 ops/ms
MessageDigests.getAndDigest SHA-1 64 DEFAULT thrpt 15 404.769 ± 14.810 ops/ms
MessageDigests.getAndDigest SHA-1 16384 DEFAULT thrpt 15 5.106 ± 0.046 ops/ms

After:
MessageDigests.digest SHA-1 64 DEFAULT thrpt 15 518.057 ± 5.935 ops/ms
MessageDigests.digest SHA-1 16384 DEFAULT thrpt 15 5.569 ± 0.009 ops/ms
MessageDigests.getAndDigest SHA-1 64 DEFAULT thrpt 15 378.184 ± 37.425 ops/ms
MessageDigests.getAndDigest SHA-1 16384 DEFAULT thrpt 15 5.515 ± 0.017 ops/ms

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17130#issuecomment-1884116242
From cjplummer at openjdk.org Wed Jan 10 05:07:22 2024
From: cjplummer at openjdk.org (Chris Plummer)
Date: Wed, 10 Jan 2024 05:07:22 GMT
Subject: RFR: JDK-8318563: GetClassFields should not use random access to field [v2]
In-Reply-To: References: Message-ID:

On Wed, 10 Jan 2024 02:39:33 GMT, Alex Menkov wrote:

>> FieldStream/FilteredFieldStream classes from reflectionUtils.hpp iterate class fields in the reverse order and use field indexes to access instead of forward iteration. This is performance ineffective (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details).
>> The change introduces new class FilteredJavaFieldStream as a replacement for FilteredFieldStream.
>> It uses the same FilteredField/FilteredFieldsMap stuff as FilteredJavaFieldStream does.
>>
>> FieldStream/FilteredFieldStream are still used by heap walking API, will be cleaned by [JDK-8317636](https://bugs.openjdk.org/browse/JDK-8317636)
>>
>> Testing:
>> - tier1..3
>> - all tests which calls GetClassFields: open/test/hotspot/jtreg/serviceability/jvmti,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/IterateThroughHeap,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/IsSynthetic
>> including
>> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007.java - tests that GetClassFields returns fields in correct order;
>> - test/hotspot/jtreg/serviceability/jvmti/GetClassFields/FilteredFields/FilteredFieldsTest.java - test that GetClassFields filters out field like reflection.
> > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > copyright headers The debug agent uses `GetClassFields`. You should make sure all jdi, jdwp, and jdb tests are run. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17094#issuecomment-1884197250 From duke at openjdk.org Wed Jan 10 05:15:50 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Wed, 10 Jan 2024 05:15:50 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v10] In-Reply-To: References: Message-ID: <9LDr8BFvEh6MrXxBwB6-X3Zy9NTGvGrt5IMxkXCwi-Q=.7e5f042c-6046-4bb6-a3a3-64dbc6ecdbbb@github.com> > 8234502: Merge GenCollectedHeap and SerialHeap Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: remove two '*Generation' ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16927/files - new: https://git.openjdk.org/jdk/pull/16927/files/92a96cc9..abd3e3cb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=08-09 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16927.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16927/head:pull/16927 PR: https://git.openjdk.org/jdk/pull/16927 From cjplummer at openjdk.org Wed Jan 10 05:31:25 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 10 Jan 2024 05:31:25 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field [v2] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 05:20:27 GMT, Chris Plummer wrote: >> Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: >> >> copyright headers > > src/hotspot/share/prims/jvmtiEnv.cpp line 2910: > >> 2908: result_list[i] = jfieldIDWorkaround::to_jfieldID( >> 2909: ik, flds.offset(), >> 2910: flds.access_flags().is_static()); > > I think the indent here 
should be 4, not 6. You said in the description that the order was reversed, but I don't see where that is getting fixed. It seems it was partially fixed by [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692), but it is only preserving the class hierarchy order, but not the order of fields within each class. If that's all you are attempting to do, then please make it clear in the description. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17094#discussion_r1446898843 From cjplummer at openjdk.org Wed Jan 10 05:31:24 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 10 Jan 2024 05:31:24 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field [v2] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 02:39:33 GMT, Alex Menkov wrote: >> FieldStream/FilteredFieldStream classes from reflectionUtils.hpp iterate class fields in the reverse order and use field indexes to access instead of forward iteration. This is performance ineffective (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). >> The change introduces new class FilteredJavaFieldStream as a replacement for FilteredFieldStream. >> It uses the same FilteredField/FilteredFieldsMap stuff as FilteredJavaFieldStream does. 
>> >> FieldStream/FilteredFieldStream are still used by heap walking API, will be cleaned by [JDK-8317636](https://bugs.openjdk.org/browse/JDK-8317636) >> >> Testing: >> - tier1..3 >> - all tests which calls GetClassFields: open/test/hotspot/jtreg/serviceability/jvmti,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/IterateThroughHeap,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/IsSynthetic >> including >> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007.java - tests that GetClassFields returns fields in correct order; >> - test/hotspot/jtreg/serviceability/jvmti/GetClassFields/FilteredFields/FilteredFieldsTest.java - test that GetClassFields filters out field like reflection. > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > copyright headers src/hotspot/share/prims/jvmtiEnv.cpp line 2910: > 2908: result_list[i] = jfieldIDWorkaround::to_jfieldID( > 2909: ik, flds.offset(), > 2910: flds.access_flags().is_static()); I think the indent here should be 4, not 6. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17094#discussion_r1446894480 From dholmes at openjdk.org Wed Jan 10 06:36:21 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 10 Jan 2024 06:36:21 GMT Subject: RFR: 8323297: Fix incorrect placement of precompiled.hpp include lines In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 14:55:45 GMT, Stefan Karlsson wrote: > There are a few files that have include lines before the precompiled.hpp include line. I propose that we fix this. > > Testing: I'll let this run through GHA and Oracle's tier1 to see that this still compiles. Looks good. It is interesting the shared files did not cause an issue on Windows. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17326#pullrequestreview-1812503870 From stuefe at openjdk.org Wed Jan 10 07:33:33 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 10 Jan 2024 07:33:33 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21] In-Reply-To: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> References: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> Message-ID: On Thu, 28 Dec 2023 09:26:19 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). >> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. The time spent on the test, in *seconds* and measured with `VERBOSE=time`, is recorded in the table below. The patch improves the time when the system call is supported and does not hurt when it is not supported: >>
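>> | Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
>> |--------|------|------|------|------|
>> | 4.18 | 11.30 | 11.30 | 0.25 | 0.25 |
>> | 5.13 | 0.22 | 0.22 | 3.42 | 3.42 |
>> | 6.1 | 0.27 | 0.33 | 3.54 | 0.33 |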
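The pretouch strategy described in the quoted text (try `madvise` with `MADV_POPULATE_WRITE`, otherwise touch each page by hand) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the pull request: `pretouch_sketch` is a hypothetical name, `start` is assumed to be page-aligned, and the fallback uses the GCC/Clang `__atomic_fetch_add` builtin to stand in for the "atomic-add-0" touch mentioned in the subject line.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// MADV_POPULATE_WRITE exists since Linux 5.14; older headers may lack the macro.
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

// Hypothetical helper (not HotSpot code). Assumes 'start' is page-aligned.
// Returns true if the kernel populated the range via madvise, false if the
// per-page fallback loop was used instead.
bool pretouch_sketch(void* start, size_t size) {
  if (size == 0) {
    return true;
  }
  if (madvise(start, size, MADV_POPULATE_WRITE) == 0) {
    return true;  // kernel prefaulted the whole range as writable
  }
  // Fallback (for example EINVAL on kernels older than 5.14): touch one byte
  // per page. Adding 0 atomically leaves the contents unchanged even if
  // another thread is already using the memory.
  const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  char* p = static_cast<char*>(start);
  for (size_t off = 0; off < size; off += page) {
    __atomic_fetch_add(p + off, 0, __ATOMIC_RELAXED);
  }
  return false;
}
```

Either path leaves the contents of the range unchanged; the return value only tells the caller which mechanism was used.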
> > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Use pthread instead Maybe a stupid question, but if we are still worried about concurrent use of memory that is in the process of being madvised, could we not just limit this technique to initialization time? I would expect most uses of pretouch to go together with -Xmx = -Xms, and to happen before mutators start. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15781#issuecomment-1884323834 From mbaesken at openjdk.org Wed Jan 10 07:35:25 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 10 Jan 2024 07:35:25 GMT Subject: [jdk22] RFR: 8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 [v2] In-Reply-To: References: <_KN_oPLsWd_rAYthnE5oURoslpkLBfLJQIQwuMZFU-o=.11efbb12-ae16-4705-a9fa-c402c1cfc428@github.com> Message-ID: <2pNpqU7Cwhn1d1irEqBjszy77HbiFTquNH8D8GyIJQY=.0fa42a94-259c-4fed-90d8-18fd61f3a606@github.com> On Tue, 9 Jan 2024 19:32:36 GMT, Christoph Langer wrote: >> Hi all, >> >> This pull request contains a backport of [JDK-8322163](https://bugs.openjdk.org/browse/JDK-8322163), commit [12308533](https://github.com/openjdk/jdk/commit/1230853343c38787c90820d19d0626f0c37540dc) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. >> >> The commit being backported was authored by Matthias Baesken on 22 Dec 2023 and was reviewed by Martin Doerr and Christoph Langer. >> >> The bug is P3 and hence appropriate for RDP1. It quiets a test error that we see regularly on Alpine. >> >> Thanks! > > Christoph Langer has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > Backport 1230853343c38787c90820d19d0626f0c37540dc Marked as reviewed by mbaesken (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk22/pull/43#pullrequestreview-1812573367 From alanb at openjdk.org Wed Jan 10 08:03:28 2024 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 10 Jan 2024 08:03:28 GMT Subject: [jdk22] RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 21:28:04 GMT, Serguei Spitsyn wrote: > Hi all, > > This pull request contains a backport of commit [0f8e4e0a](https://github.com/openjdk/jdk/commit/0f8e4e0a81257c678e948c341a241dc0b810494f) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Serguei Spitsyn on 19 Dec 2023 and was reviewed by Leonid Mesnik and Alan Bateman. > > Thanks! Marked as reviewed by alanb (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk22/pull/23#pullrequestreview-1812613022 From clanger at openjdk.org Wed Jan 10 08:31:33 2024 From: clanger at openjdk.org (Christoph Langer) Date: Wed, 10 Jan 2024 08:31:33 GMT Subject: [jdk22] Integrated: 8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 In-Reply-To: <_KN_oPLsWd_rAYthnE5oURoslpkLBfLJQIQwuMZFU-o=.11efbb12-ae16-4705-a9fa-c402c1cfc428@github.com> References: <_KN_oPLsWd_rAYthnE5oURoslpkLBfLJQIQwuMZFU-o=.11efbb12-ae16-4705-a9fa-c402c1cfc428@github.com> Message-ID: <2d8tpla5UL9IcjdWjgAaGvHiFQi0C9u5GfyyxlrOFxY=.e6e87d04-6c12-4cfa-a2d9-638ff80caf45@github.com> On Tue, 9 Jan 2024 13:33:49 GMT, Christoph Langer wrote: > Hi all, > > This pull request contains a backport of [JDK-8322163](https://bugs.openjdk.org/browse/JDK-8322163), commit [12308533](https://github.com/openjdk/jdk/commit/1230853343c38787c90820d19d0626f0c37540dc) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Matthias Baesken on 22 Dec 2023 and was reviewed by Martin Doerr and Christoph Langer. 
> > The bug is P3 and hence appropriate for RDP1. It quiets a test error that we see regularly on Alpine. > > Thanks! This pull request has now been integrated. Changeset: 28db238d Author: Christoph Langer URL: https://git.openjdk.org/jdk22/commit/28db238d52b0713a4ecfd15b7e7e4806c2935b3f Stats: 10 lines in 1 file changed: 10 ins; 0 del; 0 mod 8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 Reviewed-by: mbaesken Backport-of: 1230853343c38787c90820d19d0626f0c37540dc ------------- PR: https://git.openjdk.org/jdk22/pull/43 From mli at openjdk.org Wed Jan 10 09:14:05 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 10 Jan 2024 09:14:05 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v4] In-Reply-To: References: Message-ID: > Hi, > Can you review this patch to implement SHA-1 intrinsic for riscv? > Thanks! > > > ## Test > > ### Functionality > > tests under `test/hotspot/jtreg/compiler/intrinsics/sha` > tests found via `find test/jdk -iname "*SHA1*.java"` > > ### Performance > > tested on `T-HEAD Light Lichee Pi 4A` > > benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`. 
>
> **perf data summary**
>
> tests                                                      intrinsic (ns/op)   base (ns/op)   speed up (times)
> o.o.b.java.security.MessageDigests.digest (64)                      3454.207      12026.787   3.48
> o.o.b.java.security.MessageDigests.digest (16384)                 184063.834    1307913.534   7.11
> o.o.b.java.security.MessageDigests.getAndDigest (64)                8260.011      17707.156   2.14
> o.o.b.java.security.MessageDigests.getAndDigest (16384)           191325.246    1379660.864   7.21
> o.o.b.javax.crypto.full.MacBench.mac (128)                          8220.886      34101.577   4.15
> o.o.b.javax.crypto.full.MacBench.mac (1024)                        18006.955     107906.128   5.99
> o.o.b.javax.crypto.small.MessageDigestBench.digest              11688843.558   82834313.280   7.09
>
> **raw perf data - when intrinsic is enabled**
>
> o.o.b.java.security.GetMessageDigest.cloneInstance            N/A N/A SHA-1 N/A   N/A     avgt 10    489.860 ± 6.277 ns/op
> o.o.b.java.security.GetMessageDigest.getInstance              N/A N/A SHA-1 N/A   N/A     avgt 10   3477.197 ± 204.203 ns/op
> o.o.b.java.security.GetMessageDigest.getInstanceWithProvider  N/A N/A SHA-1 N/A   N/A     avgt 10   4111.164 ± 108.861 ns/op
> o.o.b.java.security.MessageDigests.digest                     N/A N/A SHA-1 64    DEFAULT avgt 10   3454.207 ± 53.924 ns/op
> o.o.b.java.security.MessageDigests.digest                     N/A N/A SHA-1 16384 DEFAULT avgt 10 184063.834 ± 677.635 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest               N/A N/A SHA-1 64    DEFAULT avgt 10   8260.011 ± 150.045 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest ...
Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: - remove tp/gp - refine code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17130/files - new: https://git.openjdk.org/jdk/pull/17130/files/42f838a9..eb020a85 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17130&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17130&range=02-03 Stats: 94 lines in 1 file changed: 34 ins; 12 del; 48 mod Patch: https://git.openjdk.org/jdk/pull/17130.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17130/head:pull/17130 PR: https://git.openjdk.org/jdk/pull/17130 From ayang at openjdk.org Wed Jan 10 09:31:26 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 10 Jan 2024 09:31:26 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v10] In-Reply-To: <9LDr8BFvEh6MrXxBwB6-X3Zy9NTGvGrt5IMxkXCwi-Q=.7e5f042c-6046-4bb6-a3a3-64dbc6ecdbbb@github.com> References: <9LDr8BFvEh6MrXxBwB6-X3Zy9NTGvGrt5IMxkXCwi-Q=.7e5f042c-6046-4bb6-a3a3-64dbc6ecdbbb@github.com> Message-ID: On Wed, 10 Jan 2024 05:15:50 GMT, Lei Zaakjyu wrote: >> 8234502: Merge GenCollectedHeap and SerialHeap > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > remove two '*Generation' `make test CONF=debug TEST=serviceability/sa/ClhsdbThreadContext.java JTREG_JAVA_OPTIONS=-XX:+UseG1GC` causes "AssertionFailure: Should have unknown location" in my testing. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16927#issuecomment-1884482023 From shade at openjdk.org Wed Jan 10 09:44:24 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 10 Jan 2024 09:44:24 GMT Subject: RFR: 8323297: Fix incorrect placement of precompiled.hpp include lines In-Reply-To: References: Message-ID: <8DpHcalXh4Gr6WuBqsXYnCyb8Vq6OlyVaCcbR5aBK0s=.5937c22d-bdad-45a1-bcff-b45e62bd89dc@github.com> On Tue, 9 Jan 2024 14:55:45 GMT, Stefan Karlsson wrote: > There are a few files that have include lines before the precompiled.hpp include line. I propose that we fix this. > > Testing: I'll let this run through GHA and Oracle's tier1 to see that this still compiles. Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17326#pullrequestreview-1812799906 From sspitsyn at openjdk.org Wed Jan 10 10:11:21 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 10 Jan 2024 10:11:21 GMT Subject: RFR: 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf In-Reply-To: References: Message-ID: <6t9yEN1Us1HI7IaQGdtU7RSOXSc9KSQuLTbu_JY-Az8=.24b61e44-60b9-4f14-bd54-b0dd765be646@github.com> On Tue, 9 Jan 2024 23:48:35 GMT, Serguei Spitsyn wrote: > This fix adds a ResourceMark missing in the `SetFramePopClosure::do_thread` and `SetFramePopClosure::do_vthread`. > > Testing: > - TBD to submit tiers: 1-5 Alex and Chris, thank you for review! Chris, I'll add similar comments as you noted. I agree, it will be helpful. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17332#issuecomment-1884547925 From hgreule at openjdk.org Wed Jan 10 10:18:22 2024 From: hgreule at openjdk.org (Hannes Greule) Date: Wed, 10 Jan 2024 10:18:22 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field [v2] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 05:28:39 GMT, Chris Plummer wrote: >> src/hotspot/share/prims/jvmtiEnv.cpp line 2910: >> >>> 2908: result_list[i] = jfieldIDWorkaround::to_jfieldID( >>> 2909: ik, flds.offset(), >>> 2910: flds.access_flags().is_static()); >> >> I think the indent here should be 4, not 6. > > You said in the description that the order was reversed, but I don't see where that is getting fixed. It seems it was partially fixed by [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692), but that only preserves the class hierarchy order, not the order of fields within each class. If that's all you are attempting to do, then please make it clear in the description. FieldStream from reflectionUtils iterates fields in reverse order, so reversing again was previously needed here. JavaFieldStream from fieldStreams (and the new FilteredJavaFieldStream) iterate in the order the fields actually occur, so this double-reversing isn't needed anymore. It's a bit confusing to have FilteredJavaFieldStream in reflectionUtils; eventually it would probably make sense to move the FilteredFieldsMap and FilteredJavaFieldStream into fieldStreams instead? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17094#discussion_r1447163822 From sspitsyn at openjdk.org Wed Jan 10 10:38:27 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 10 Jan 2024 10:38:27 GMT Subject: [jdk22] RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 21:28:04 GMT, Serguei Spitsyn wrote: > Hi all, > > This pull request contains a backport of commit [0f8e4e0a](https://github.com/openjdk/jdk/commit/0f8e4e0a81257c678e948c341a241dc0b810494f) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Serguei Spitsyn on 19 Dec 2023 and was reviewed by Leonid Mesnik and Alan Bateman. > > Thanks! Alan, thank you for review! ------------- PR Comment: https://git.openjdk.org/jdk22/pull/23#issuecomment-1884591260 From stefank at openjdk.org Wed Jan 10 10:55:22 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 10 Jan 2024 10:55:22 GMT Subject: RFR: 8323297: Fix incorrect placement of precompiled.hpp include lines In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 14:55:45 GMT, Stefan Karlsson wrote: > There are a few files that have include lines before the precompiled.hpp include line. I propose that we fix this. > > Testing: I'll let this run through GHA and Oracle's tier1 to see that this still compiles. Thanks for reviewing! > It is interesting the shared files did not cause an issue on Windows. I was surprised by this as well. I dug some more into this and found this in the documentation: The compiler treats all code occurring before the .h file as precompiled. It skips to just beyond the #include directive associated with the .h file, uses the code contained in the .pch file, and then compiles all code after filename. 
I tested this by changing zCollectedHeap.cpp to have:

    #error "COMPILATION_ERROR_1"
    #include "gc/z/zAddress.hpp"
    #include "precompiled.hpp"
    #error "COMPILATION_ERROR_2"

The compilation error I then got was:

    fatal error C1189: #error: "COMPILATION_ERROR_2"

So, it seems like the MS compiler previously just skipped the extra includes above the precompiled.hpp include line.

------------- PR Comment: https://git.openjdk.org/jdk/pull/17326#issuecomment-1884616955 From mli at openjdk.org Wed Jan 10 11:00:26 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 10 Jan 2024 11:00:26 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v3] In-Reply-To: References: <2LzKv6TzZ3ZJDuLOm1GpNcgoCCfZgOqEOtWDNRQs7O0=.2ced11c4-bba6-4e15-bfa1-f0ca06d53610@github.com> Message-ID:

On Wed, 10 Jan 2024 03:02:09 GMT, Fei Yang wrote:

> Simply ran `micro:java.security.MessageDigests` JMH on my Lichee-pi-4a board, seems there is a small regression for the `MessageDigests.getAndDigest` (length = 64) case:
>
> ```
> Before:
> MessageDigests.digest       SHA-1    64 DEFAULT thrpt 15 417.311 ± 2.686 ops/ms
> MessageDigests.digest       SHA-1 16384 DEFAULT thrpt 15   5.206 ± 0.008 ops/ms
> MessageDigests.getAndDigest SHA-1    64 DEFAULT thrpt 15 404.769 ± 14.810 ops/ms
> MessageDigests.getAndDigest SHA-1 16384 DEFAULT thrpt 15   5.106 ± 0.046 ops/ms
>
> After:
> MessageDigests.digest       SHA-1    64 DEFAULT thrpt 15 518.057 ± 5.935 ops/ms
> MessageDigests.digest       SHA-1 16384 DEFAULT thrpt 15   5.569 ± 0.009 ops/ms
> MessageDigests.getAndDigest SHA-1    64 DEFAULT thrpt 15 378.184 ± 37.425 ops/ms
> MessageDigests.getAndDigest SHA-1 16384 DEFAULT thrpt 15   5.515 ± 0.017 ops/ms
> ```

Hey, thanks for testing. The data is interesting: there seems to be only a small improvement when the data length is `16384`, compared to the more than 7X speedup in my test results. I will try a different test environment. In the meantime, can you double-check in your environment whether the SHA-1 intrinsic is enabled/disabled as expected?
------------- PR Comment: https://git.openjdk.org/jdk/pull/17130#issuecomment-1884624161 From sspitsyn at openjdk.org Wed Jan 10 11:43:51 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 10 Jan 2024 11:43:51 GMT Subject: RFR: 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf [v2] In-Reply-To: References: Message-ID: > This fix adds a ResourceMark missing in the `SetFramePopClosure::do_thread` and `SetFramePopClosure::do_vthread`. > > Testing: > - TBD to submit tiers: 1-5 Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge - review: add comments for ResourceMark - 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17332/files - new: https://git.openjdk.org/jdk/pull/17332/files/dd838135..73963e8f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17332&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17332&range=00-01 Stats: 13793 lines in 239 files changed: 10265 ins; 2043 del; 1485 mod Patch: https://git.openjdk.org/jdk/pull/17332.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17332/head:pull/17332 PR: https://git.openjdk.org/jdk/pull/17332 From jkern at openjdk.org Wed Jan 10 11:49:05 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 10 Jan 2024 11:49:05 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v12] In-Reply-To: References: Message-ID: > On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. > > This causes problems in the JVM with code that assumes equality of handles. 
One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604).
>
> We propose a different, cleaner way of handling this:
>
> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload.
> - Cache dl handles; repeated opening of a library should return the cached handle.
> - Increase the handle-local ref counter on open, decrease it on close.
> - Make sure calls to os::dll_load are matched to os::dll_unload (see [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)).
>
> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load.
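The caching and reference-counting scheme proposed in the quoted description can be illustrated with a small sketch. This is not the os_aix.cpp implementation: `HandleCache`, its `load`/`unload` methods, and the injected opener/closer callbacks are hypothetical names chosen for the example, and keying the cache by the path string is a simplifying assumption (a real implementation would more plausibly key on file identity).

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical sketch: wraps an "open" primitive that may return a fresh
// handle on every call (as described for AIX dlopen) and makes repeated loads
// of the same library return one shared, reference-counted handle.
class HandleCache {
 public:
  using Opener = std::function<void*(const std::string&)>;
  using Closer = std::function<void(void*)>;

  HandleCache(Opener open, Closer close)
      : _open(std::move(open)), _close(std::move(close)) {}

  // Returns the cached handle if the library is already open, bumping its
  // reference count; otherwise opens it and caches the new handle.
  void* load(const std::string& path) {
    std::lock_guard<std::mutex> guard(_lock);
    auto it = _entries.find(path);
    if (it != _entries.end()) {
      ++it->second.refcount;
      return it->second.handle;
    }
    void* handle = _open(path);
    if (handle != nullptr) {
      _entries[path] = Entry{handle, 1};
    }
    return handle;
  }

  // Drops one reference; really closes the library only when the last
  // reference is gone. Returns false for handles the cache does not know.
  bool unload(void* handle) {
    std::lock_guard<std::mutex> guard(_lock);
    for (auto it = _entries.begin(); it != _entries.end(); ++it) {
      if (it->second.handle == handle) {
        if (--it->second.refcount == 0) {
          _close(handle);
          _entries.erase(it);
        }
        return true;
      }
    }
    return false;
  }

 private:
  struct Entry {
    void* handle;
    int refcount;
  };

  std::mutex _lock;
  std::map<std::string, Entry> _entries;
  Opener _open;
  Closer _close;
};
```

Injecting the opener and closer keeps the sketch testable without a real shared library; in practice they would wrap `dlopen` and `dlclose`. The scheme only works if every load is paired with an unload, which is why the description insists on matched calls.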
Joachim Kern has updated the pull request incrementally with two additional commits since the last revision: - cosmetic changes - cosmetic changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16920/files - new: https://git.openjdk.org/jdk/pull/16920/files/acf306d4..d908a969 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=10-11 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16920.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16920/head:pull/16920 PR: https://git.openjdk.org/jdk/pull/16920 From stefank at openjdk.org Wed Jan 10 12:26:32 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 10 Jan 2024 12:26:32 GMT Subject: RFR: 8323508: Remove TestGCLockerWithShenandoah.java line from TEST.groups Message-ID: TestGCLockerWithShenandoah.java was recently removed, but the TEST.groups file still has a reference to it. This causes problems in our CI pipeline. ------------- Commit messages: - Revert unrelated changes - 8323508: Remove TestGCLockerWithShenandoah.java line from TEST.groups Changes: https://git.openjdk.org/jdk/pull/17344/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17344&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323508 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17344.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17344/head:pull/17344 PR: https://git.openjdk.org/jdk/pull/17344 From dholmes at openjdk.org Wed Jan 10 12:26:33 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 10 Jan 2024 12:26:33 GMT Subject: RFR: 8323508: Remove TestGCLockerWithShenandoah.java line from TEST.groups In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 12:09:07 GMT, Stefan Karlsson wrote: > TestGCLockerWithShenandoah.java was recently removed, but the TEST.groups file still has a reference to it. 
This causes problems in our CI pipeline. TEST.groups change is good but the other file shouldn't be there. I will hit approve anyway and assume you will fix, as I have to go. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17344#pullrequestreview-1813099354 From stefank at openjdk.org Wed Jan 10 12:26:34 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 10 Jan 2024 12:26:34 GMT Subject: RFR: 8323508: Remove TestGCLockerWithShenandoah.java line from TEST.groups In-Reply-To: References: Message-ID: <1SQLwEN60XAFXfCQTbyXRSM__4lEPhu-Fhhth_LCkX4=.247b7eed-35a5-427c-b9d9-09271b3a1664@github.com> On Wed, 10 Jan 2024 12:09:07 GMT, Stefan Karlsson wrote: > TestGCLockerWithShenandoah.java was recently removed, but the TEST.groups file still has a reference to it. This causes problems in our CI pipeline. Thanks. The unrelated change has been reverted. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17344#issuecomment-1884749086 From shade at openjdk.org Wed Jan 10 12:33:20 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 10 Jan 2024 12:33:20 GMT Subject: RFR: 8323508: Remove TestGCLockerWithShenandoah.java line from TEST.groups In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 12:09:07 GMT, Stefan Karlsson wrote: > TestGCLockerWithShenandoah.java was recently removed, but the TEST.groups file still has a reference to it. This causes problems in our CI pipeline. Marked as reviewed by shade (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/17344#pullrequestreview-1813120038 From tschatzl at openjdk.org Wed Jan 10 12:52:25 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Jan 2024 12:52:25 GMT Subject: RFR: 8323508: Remove TestGCLockerWithShenandoah.java line from TEST.groups In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 12:09:07 GMT, Stefan Karlsson wrote: > TestGCLockerWithShenandoah.java was recently removed, but the TEST.groups file still has a reference to it. This causes problems in our CI pipeline. lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17344#pullrequestreview-1813156136 From jwaters at openjdk.org Wed Jan 10 13:06:53 2024 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 10 Jan 2024 13:06:53 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v4] In-Reply-To: References: Message-ID: > Compile the JDK as C++17, enabling the use of all C++17 language features Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: - Merge branch 'openjdk:master' into patch-7 - Compiler versions in toolchain.m4 - Merge branch 'openjdk:master' into patch-7 - Merge branch 'openjdk:master' into patch-7 - Revert vm_version_linux_riscv.cpp - vm_version_linux_riscv.cpp - allocation.cpp - 8310260 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14988/files - new: https://git.openjdk.org/jdk/pull/14988/files/477f6b94..f1a644e3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14988&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14988&range=02-03 Stats: 20857 lines in 728 files changed: 13941 ins; 3659 del; 3257 mod Patch: https://git.openjdk.org/jdk/pull/14988.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14988/head:pull/14988 PR: https://git.openjdk.org/jdk/pull/14988 From jwaters at openjdk.org Wed Jan 10 13:11:38 2024 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 10 Jan 2024 13:11:38 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v5] In-Reply-To: References: Message-ID: > Compile the JDK as C++17, enabling the use of all C++17 language features Julian Waters has updated the pull request incrementally with one additional commit since the last revision: Remove unnecessary -std=c++17 option in Lib.gmk ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14988/files - new: https://git.openjdk.org/jdk/pull/14988/files/f1a644e3..4f196292 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14988&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14988&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14988.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14988/head:pull/14988 PR: https://git.openjdk.org/jdk/pull/14988 From stefank at openjdk.org Wed Jan 10 13:28:31 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 10 Jan 2024 13:28:31 GMT Subject: RFR: 8323508: Remove 
TestGCLockerWithShenandoah.java line from TEST.groups In-Reply-To: References: Message-ID: <6bOyJlxxLDykmaE34rv3xdlE3K3Tx9cPWd_L3HC66qI=.1faecd1e-24df-4002-a734-d511c92283c0@github.com> On Wed, 10 Jan 2024 12:09:07 GMT, Stefan Karlsson wrote: > TestGCLockerWithShenandoah.java was recently removed, but the TEST.groups file still has a reference to it. This causes problems in our CI pipeline. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17344#issuecomment-1884845496 From stefank at openjdk.org Wed Jan 10 13:28:32 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 10 Jan 2024 13:28:32 GMT Subject: Integrated: 8323508: Remove TestGCLockerWithShenandoah.java line from TEST.groups In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 12:09:07 GMT, Stefan Karlsson wrote: > TestGCLockerWithShenandoah.java was recently removed, but the TEST.groups file still has a reference to it. This causes problems in our CI pipeline. This pull request has now been integrated. Changeset: ec385057 Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/ec38505720251ceefc8e838bd68b740d166c83c1 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8323508: Remove TestGCLockerWithShenandoah.java line from TEST.groups Reviewed-by: dholmes, shade, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/17344 From mdoerr at openjdk.org Wed Jan 10 13:49:28 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 10 Jan 2024 13:49:28 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v5] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 13:11:38 GMT, Julian Waters wrote: >> Compile the JDK as C++17, enabling the use of all C++17 language features > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Remove unnecessary -std=c++17 option in Lib.gmk Looks basically still good. 
The only issue I see is requiring clang 14.0 on MacOS is not in sync with "Other JDK 22 build platforms" (https://wiki.openjdk.org/display/Build/Supported+Build+Platforms). @MBaesken: Do you know if we can use a newer clang? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1884882004 From shade at openjdk.org Wed Jan 10 13:49:32 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 10 Jan 2024 13:49:32 GMT Subject: RFR: 8323519: Add applications/ctw/modules to Hotspot tiered testing Message-ID: <3jbVl4E9L0TXjD3aRnQJ3Y81wqf5nnOUXOl2cI9UVns=.32290c86-9dda-4f32-927e-6f673a20ebec@github.com> Noticed that `applications/ctw/modules` is missing from current `tier{1,2,3,4}` hotspot definitions, since tier4 specifically excludes all applications. That exclusion was due to potentially unprepared dependencies that are needed for testing: for example jcstress tests would fail with the default configuration. But CTW for JDK modules works well out of the box, so we can add it somewhere in high tier, for example tier3. These tests take quite a bit of time, so I opted to add them to relevant "slow" group that is run in tier3. 
Additional testing: - [x] Checked that `tier3_compiler` runs CTW tests ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/17348/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17348&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323519 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17348.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17348/head:pull/17348 PR: https://git.openjdk.org/jdk/pull/17348 From mbaesken at openjdk.org Wed Jan 10 13:56:27 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 10 Jan 2024 13:56:27 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v5] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 13:11:38 GMT, Julian Waters wrote: >> Compile the JDK as C++17, enabling the use of all C++17 language features > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Remove unnecessary -std=c++17 option in Lib.gmk Hi Martin, probably we can update our devkit if really needed. But https://clang.llvm.org/cxx_status.html states that c++17 is supported for a very long time, so probably clang 13.1 is sufficient too (or is there a real showstopper known with this release) . ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1884892708 From sspitsyn at openjdk.org Wed Jan 10 14:10:32 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 10 Jan 2024 14:10:32 GMT Subject: Integrated: 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 23:48:35 GMT, Serguei Spitsyn wrote: > This fix adds a ResourceMark missing in the `SetFramePopClosure::do_thread` and `SetFramePopClosure::do_vthread`. > > Testing: > - TBD to submit tiers: 1-5 This pull request has now been integrated. 
Changeset: 2806adee Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/2806adee2d8cca6bc215f285888631799bd02eac Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf Reviewed-by: amenkov, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/17332 From tschatzl at openjdk.org Wed Jan 10 15:04:46 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Jan 2024 15:04:46 GMT Subject: [jdk22] RFR: 8322987: Remove gc/stress/gclocker/TestGCLocker* since they always fail with OOME Message-ID: Hi all, please review this backport of [JDK-8322987](https://bugs.openjdk.org/browse/JDK-8322987) and [JDK-8323508](https://bugs.openjdk.org/browse/JDK-8323508) to jdk22. The second CR is a bugfix for the first one, and I did not want to risk CI failures by pushing them separately. Thanks, Thomas ------------- Commit messages: - This commit adds the change added in JDK-8323508 because otherwise there are CI issues - Backport 40861761c2b0bb5ae548afc4752dc7cee3bf506a Changes: https://git.openjdk.org/jdk22/pull/54/files Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=54&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322987 Stats: 449 lines in 8 files changed: 0 ins; 449 del; 0 mod Patch: https://git.openjdk.org/jdk22/pull/54.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/54/head:pull/54 PR: https://git.openjdk.org/jdk22/pull/54 From aboldtch at openjdk.org Wed Jan 10 15:34:34 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 10 Jan 2024 15:34:34 GMT Subject: RFR: 8319773: Avoid inflating monitors when installing hash codes for LM_LIGHTWEIGHT [v12] In-Reply-To: References: Message-ID: > LM_LIGHTWEIGHT only uses the lock bits for its locking. This leaves the hashCode bits free when a monitor is not inflated.
So instead of inflating when installing the hashCode on a fast locked object it can simply use the hashCode bits in the markWord. > > The mark word transitions Unlocked (0b01) <=> Locked (0b00) are done by retrying the CAS if it fails due to non-lock bit changes. > The mark word transitions Monitor (0b10) <=> Locked/Unlocked (0b0X) are the same as before, inflation already handles hash codes. This change does not interact with the mark word if it is in a Monitor (0b10) state, so the strong CAS used for deflation is still valid, and will not fail for any reason other than the cooperative race to help transition the mark word during deflation. > > This is dependent on JDK-8319778 simply because JDK-8319797 is dependent on both this and JDK-8319778. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 18 additional commits since the last revision: - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 - Merge remote-tracking branch 'upstream_jdk/pr/16602' into JDK-8319773 - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319778 - Fix copy paste typo. - Update src/hotspot/share/opto/library_call.cpp Co-authored-by: Tobias Hartmann - Add retry CAS comment - Use is_neutral over is_unlocked - Merge remote-tracking branch 'upstream_jdk/pr/16602' into JDK-8319773 - ...
and 8 more: https://git.openjdk.org/jdk/compare/753d9670...a83ad377 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16603/files - new: https://git.openjdk.org/jdk/pull/16603/files/1b907f90..a83ad377 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16603&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16603&range=10-11 Stats: 83817 lines in 1644 files changed: 47436 ins; 29636 del; 6745 mod Patch: https://git.openjdk.org/jdk/pull/16603.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16603/head:pull/16603 PR: https://git.openjdk.org/jdk/pull/16603 From aboldtch at openjdk.org Wed Jan 10 15:35:42 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 10 Jan 2024 15:35:42 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v10] In-Reply-To: References: Message-ID: > Implements the runtime part of JDK-8319796. > The different CPU implementations are/will be created as dependent pull requests. > > This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enters. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code, which only needs to look at the two topmost entries on the lock stack to determine what to do in a monitor exit.
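As a rough illustration of the consecutive-recursion rule described in this PR text, here is a minimal single-threaded sketch. It is not HotSpot code: `Mark`, `Obj`, `Outcome` and `LockStackSketch` are hypothetical names invented for this example, the mark word is reduced to a three-state enum, and the CAS on the mark word is replaced by a plain assignment. It only shows how enter and exit can be decided from the mark state plus the top one or two lock-stack entries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; none of these names are HotSpot's.
enum class Mark { Unlocked, Locked, Monitor };  // 0b01, 0b00, 0b10 in the PR text
struct Obj { Mark mark = Mark::Unlocked; };
enum class Outcome { Fast, NeedsInflation };

class LockStackSketch {
  std::vector<Obj*> entries_;  // per-thread stack of fast-locked objects

public:
  Outcome enter(Obj* o) {
    if (o->mark == Mark::Unlocked) {
      o->mark = Mark::Locked;  // a real VM would CAS the mark word here
      entries_.push_back(o);
      return Outcome::Fast;
    }
    // Consecutive recursion: already fast-locked and on top of the stack.
    if (o->mark == Mark::Locked && !entries_.empty() && entries_.back() == o) {
      entries_.push_back(o);
      return Outcome::Fast;
    }
    return Outcome::NeedsInflation;  // inflate and take the monitor slow path
  }

  Outcome exit(Obj* o) {
    const std::size_t n = entries_.size();
    if (o->mark != Mark::Locked || n == 0 || entries_[n - 1] != o) {
      return Outcome::NeedsInflation;  // unstructured or inflated case
    }
    // Only the two topmost entries are inspected.
    const bool recursive = n >= 2 && entries_[n - 2] == o;
    entries_.pop_back();
    if (!recursive) {
      o->mark = Mark::Unlocked;  // last fast-lock released
    }
    return Outcome::Fast;
  }

  std::size_t depth() const { return entries_.size(); }
};
```

In this sketch, entering the same object twice in a row pushes it twice, and the first exit merely pops; the mark only returns to unlocked when the entry below the top is a different object or the stack becomes empty.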
> > A high-level overview: > * Locking is still performed on the mark word > * Unlocked (0b01) <=> Locked (0b00) > * Monitor enter on Obj with mark word Unlocked (0b01) is the same > * Transition Obj's mark word Unlocked (0b01) => Locked (0b00) > * Push Obj onto the lock stack > * Success > * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack > * If top entry is Obj > * Push Obj on the lock stack > * Success > * If top entry is not Obj > * Inflate and call ObjectMonitor::enter > * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack > * If just the top entry is Obj > * Transition Obj's mark word Locked (0b00) => Unlocked (0b01) > * Pop the entry > * Success > * If both entries are Obj > * Pop the top entry > * Success > * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit > * If the monitor has been inflated for object Obj which is owned by the current thread > * All corresponding entries for Obj are removed from the lock stack > * The monitor's recursion count is set to the number of removed entries - 1 > * The owner is changed from anonymous to the thread > * The regular ObjectMonitor::action is called. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 - Avoid copy from and to the same location - Fix typo - Update unstructured unlock comment - Fix bad indent after merge - Remove whitespace - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 - ...
and 3 more: https://git.openjdk.org/jdk/compare/a83ad377...530ea72b ------------- Changes: https://git.openjdk.org/jdk/pull/16606/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16606&range=09 Stats: 676 lines in 10 files changed: 634 ins; 7 del; 35 mod Patch: https://git.openjdk.org/jdk/pull/16606.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16606/head:pull/16606 PR: https://git.openjdk.org/jdk/pull/16606 From aboldtch at openjdk.org Wed Jan 10 15:39:40 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 10 Jan 2024 15:39:40 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v8] In-Reply-To: References: Message-ID: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> > Implements the x86 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. 
> > The x86 C2 port also has some extra oddities. > > The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. > > The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. > > The contended unlock was also moved to the code stub. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision: - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - top load adjustments - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Fix type - Move inflated check in fast_locked - Move top load - 8319799: Recursive lightweight locking: x86 implementation - ... 
and 1 more: https://git.openjdk.org/jdk/compare/374f434b...71c48af6 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16607/files - new: https://git.openjdk.org/jdk/pull/16607/files/13f32a39..71c48af6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=06-07 Stats: 83817 lines in 1644 files changed: 47436 ins; 29636 del; 6745 mod Patch: https://git.openjdk.org/jdk/pull/16607.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16607/head:pull/16607 PR: https://git.openjdk.org/jdk/pull/16607 From aboldtch at openjdk.org Wed Jan 10 15:41:01 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 10 Jan 2024 15:41:01 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v5] In-Reply-To: References: Message-ID: <069DFRA7mVa5pM7-D4BZGEQAJxdztB4O6aF2g_bgl18=.d316c137-16a4-461a-ad69-c58651cd7edd@github.com> > Implements the aarch64 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). 
Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 - 8319801: Recursive lightweight locking: aarch64 implementation - Cleanup: C2 fast_lock/fast_unlock aarch64 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16608/files - new: https://git.openjdk.org/jdk/pull/16608/files/263b3061..8882cddc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=03-04 Stats: 172150 lines in 3471 files changed: 98039 ins; 61299 del; 12812 mod Patch: https://git.openjdk.org/jdk/pull/16608.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16608/head:pull/16608 PR: https://git.openjdk.org/jdk/pull/16608 From duke at openjdk.org Wed Jan 10 15:45:42 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Wed, 10 Jan 2024 15:45:42 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v11] In-Reply-To: References: Message-ID: > 8234502: Merge GenCollectedHeap and SerialHeap Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last 
revision: fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16927/files - new: https://git.openjdk.org/jdk/pull/16927/files/abd3e3cb..628de1a9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=09-10 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16927.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16927/head:pull/16927 PR: https://git.openjdk.org/jdk/pull/16927 From duke at openjdk.org Wed Jan 10 15:53:24 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Wed, 10 Jan 2024 15:53:24 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v10] In-Reply-To: References: <9LDr8BFvEh6MrXxBwB6-X3Zy9NTGvGrt5IMxkXCwi-Q=.7e5f042c-6046-4bb6-a3a3-64dbc6ecdbbb@github.com> Message-ID: On Wed, 10 Jan 2024 09:28:42 GMT, Albert Mingkun Yang wrote: > `make test CONF=debug TEST=serviceability/sa/ClhsdbThreadContext.java JTREG_JAVA_OPTIONS=-XX:+UseG1GC` causes "AssertionFailure: Should have unknown location" in my testing. Have a try again? I may fix this problem in the latest version. I wrongly moved 'isInTLAB()' to the upper if-statement before. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16927#issuecomment-1885118367 From cjplummer at openjdk.org Wed Jan 10 17:26:28 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 10 Jan 2024 17:26:28 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v11] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 15:45:42 GMT, Lei Zaakjyu wrote: >> 8234502: Merge GenCollectedHeap and SerialHeap > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > fix src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/PointerLocation.java line 286: > 284: getGeneration().printOn(tty); // does not include "\n" > 285: } > 286: tty.println(); If the pointer is in the heap, but not in the tlab, you don't print anything. But then due to the changes in PointerFinder, isInTLab() will always return false, so you won't print anything even if it is in a tlab. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1447694637 From cjplummer at openjdk.org Wed Jan 10 17:31:25 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 10 Jan 2024 17:31:25 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field [v2] In-Reply-To: References: Message-ID: <-CbIylmXFWX0z0oN8Ewxs6PH8E5puD_0JF0Kw6xqUFY=.92381e98-6c70-493e-9deb-44e86bdf6a70@github.com> On Wed, 10 Jan 2024 10:16:01 GMT, Hannes Greule wrote: >> You said in the description that the order was reversed, but I don't see where that is getting fixed. It seems it was partially fixed by [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692), but it is only preserving the class hierarchy order, but not the order of fields within each class. If that's all you are attempting to do, then please make it clear in the description. > > FieldStream from reflectionUtils iterates fields in reverse order, so reversing again was previously needed here. 
JavaFieldStream from fieldStreams (and the new FilteredJavaFieldStream) iterate in the order the fields actually occur, so this double-reversing isn't needed anymore. > > It's a bit confusing to have FilteredJavaFieldStream in reflectionUtils; eventually it would probably make sense to move the FilteredFieldsMap and FilteredJavaFieldStream into fieldStreams instead? Ok. I see now how the old code was actually reversing the order to undo the reversing that was already done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17094#discussion_r1447702578 From sspitsyn at openjdk.org Wed Jan 10 18:19:30 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 10 Jan 2024 18:19:30 GMT Subject: [jdk22] Integrated: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 21:28:04 GMT, Serguei Spitsyn wrote: > Hi all, > > This pull request contains a backport of commit [0f8e4e0a](https://github.com/openjdk/jdk/commit/0f8e4e0a81257c678e948c341a241dc0b810494f) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Serguei Spitsyn on 19 Dec 2023 and was reviewed by Leonid Mesnik and Alan Bateman. > > Thanks! This pull request has now been integrated.
Changeset: 865cf888 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk22/commit/865cf888efbdf5533ded8ca39ef706de9b48dc15 Stats: 229 lines in 15 files changed: 196 ins; 0 del; 33 mod 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable Reviewed-by: alanb Backport-of: 0f8e4e0a81257c678e948c341a241dc0b810494f ------------- PR: https://git.openjdk.org/jdk22/pull/23 From shade at openjdk.org Wed Jan 10 18:57:34 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 10 Jan 2024 18:57:34 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates Message-ID: We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. There are a few interesting conversions along the way: 1. `intptr_t` -> `uint32_t` (this method) 2. `uint32_t` -> `int32_t` (argument conversion for `movl`) 3. `int32_t` -> `uint32_t` (in `emit_int32`) I believe these are safe after `is_uimm32` check, but please check (sic) me on this. Note that x86_64 matcher already does similar thing for immediates: // Long Immediate 32-bit unsigned operand immUL32() %{ predicate(n->get_long() == (unsigned int) (n->get_long())); match(ConL); ... %} instruct loadConUL32(rRegL dst, immUL32 src) %{ ... 
format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} ins_encode %{ __ movl($dst$$Register, $src$$constant); %} %} Additional testing: - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` Code sizes for `Hello World`, `-Xcomp`: # Before tier1 nmethod code size : 426208 bytes tier2 nmethod code size : 462880 bytes tier3 nmethod code size : 889992 bytes tier4 nmethod code size : 1244448 bytes # After tier1 nmethod code size : 425768 bytes (-0.1%) tier2 nmethod code size : 462400 bytes (-0.1%) tier3 nmethod code size : 882072 bytes (-0.8%) tier4 nmethod code size : 1236448 bytes (-0.6%) ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/17343/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17343&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323503 Stats: 4 lines in 2 files changed: 3 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17343.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17343/head:pull/17343 PR: https://git.openjdk.org/jdk/pull/17343 From stuefe at openjdk.org Wed Jan 10 18:57:35 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 10 Jan 2024 18:57:35 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 11:05:03 GMT, Aleksey Shipilev wrote: > We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. > > There are a few interesting conversions along the way: > 1. `intptr_t` -> `uint32_t` (this method) > 2. 
`uint32_t` -> `int32_t` (argument conversion for `movl`) > 3. `int32_t` -> `uint32_t` (in `emit_int32`) > > I believe these are safe after `is_uimm32` check, but please check (sic) me on this. > > Note that x86_64 matcher already does similar thing for immediates: > > > // Long Immediate 32-bit unsigned > operand immUL32() > %{ > predicate(n->get_long() == (unsigned int) (n->get_long())); > match(ConL); > ... > %} > > instruct loadConUL32(rRegL dst, immUL32 src) > %{ > ... > format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} > ins_encode %{ > __ movl($dst$$Register, $src$$constant); > %} > %} > > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` > > Code sizes for `Hello World`, `-Xcomp`: > > > # Before > tier1 nmethod code size : 426208 bytes > tier2 nmethod code size : 462880 bytes > tier3 nmethod code size : 889992 bytes > tier4 nmethod code size : 1244448 bytes > > # After > tier1 nmethod code size : 425768 bytes (-0.1%) > tier2 nmethod code size : 462400 bytes (-0.1%) > tier3 nmethod code size : 882072 bytes (-0.8%) > tier4 nmethod code size : 1236448 bytes (-0.6%) Looks good. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17343#pullrequestreview-1813412675 From xliu at openjdk.org Wed Jan 10 19:42:22 2024 From: xliu at openjdk.org (Xin Liu) Date: Wed, 10 Jan 2024 19:42:22 GMT Subject: RFR: 8323519: Add applications/ctw/modules to Hotspot tiered testing In-Reply-To: <3jbVl4E9L0TXjD3aRnQJ3Y81wqf5nnOUXOl2cI9UVns=.32290c86-9dda-4f32-927e-6f673a20ebec@github.com> References: <3jbVl4E9L0TXjD3aRnQJ3Y81wqf5nnOUXOl2cI9UVns=.32290c86-9dda-4f32-927e-6f673a20ebec@github.com> Message-ID: On Wed, 10 Jan 2024 13:43:23 GMT, Aleksey Shipilev wrote: > Noticed that `applications/ctw/modules` is missing from current `tier{1,2,3,4}` hotspot definitions, since tier4 specifically excludes all applications. 
That exclusion was due to potentially unprepared dependencies that are needed for testing: for example jcstress tests would fail with the default configuration. But CTW for JDK modules works well out of the box, so we can add it somewhere in high tier, for example tier3. It should be useful to catch compiler bugs early, before running `hotspot:all`. > > These tests take quite a bit of time (~15 mins on my M1), so I opted to add them to relevant "slow" group that is run in tier3. > > Additional testing: > - [x] Checked that `tier3_compiler` runs CTW tests LGTM. I am not a reviewer. ------------- Marked as reviewed by xliu (Committer). PR Review: https://git.openjdk.org/jdk/pull/17348#pullrequestreview-1814007294 From ayang at openjdk.org Wed Jan 10 19:50:27 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 10 Jan 2024 19:50:27 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v11] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 15:45:42 GMT, Lei Zaakjyu wrote: >> 8234502: Merge GenCollectedHeap and SerialHeap > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > fix src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/PointerFinder.java line 87: > 85: // Check if address is in the java heap. > 86: CollectedHeap heap = VM.getVM().getUniverse().heap(); > 87: if (heap instanceof GenCollectedHeap) { Maybe just rename this to `SerialHeap` so that this PR doesn't touch much SA code unnecessarily. (The inconsistency among different collectors can be dealt with in other PRs.) 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1447880653 From kvn at openjdk.org Wed Jan 10 20:32:20 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 10 Jan 2024 20:32:20 GMT Subject: RFR: 8323519: Add applications/ctw/modules to Hotspot tiered testing In-Reply-To: <3jbVl4E9L0TXjD3aRnQJ3Y81wqf5nnOUXOl2cI9UVns=.32290c86-9dda-4f32-927e-6f673a20ebec@github.com> References: <3jbVl4E9L0TXjD3aRnQJ3Y81wqf5nnOUXOl2cI9UVns=.32290c86-9dda-4f32-927e-6f673a20ebec@github.com> Message-ID: <6gyoiBYGr7WJudLVQOxMmbUT7CN5N0WoYHxlKi8L48Y=.b2ce9cca-5923-45b5-86c0-30b0f41361ee@github.com> On Wed, 10 Jan 2024 13:43:23 GMT, Aleksey Shipilev wrote: > Noticed that `applications/ctw/modules` is missing from current `tier{1,2,3,4}` hotspot definitions, since tier4 specifically excludes all applications. That exclusion was due to potentially unprepared dependencies that are needed for testing: for example jcstress tests would fail with the default configuration. But CTW for JDK modules works well out of the box, so we can add it somewhere in high tier, for example tier3. It should be useful to catch compiler bugs early, before running `hotspot:all`. > > These tests take quite a bit of time (~15 mins on my M1), so I opted to add them to relevant "slow" group that is run in tier3. > > Additional testing: > - [x] Checked that `tier3_compiler` runs CTW tests This will duplicate testing for us in Oracle. Please, add it directly to `tier3_compiler` group instead. ------------- Changes requested by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17348#pullrequestreview-1814114298 From kvn at openjdk.org Wed Jan 10 20:57:22 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 10 Jan 2024 20:57:22 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 11:05:03 GMT, Aleksey Shipilev wrote: > We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. > > There are a few interesting conversions along the way: > 1. `intptr_t` -> `uint32_t` (this method) > 2. `uint32_t` -> `int32_t` (argument conversion for `movl`) > 3. `int32_t` -> `uint32_t` (in `emit_int32`) > > I believe these are safe after `is_uimm32` check, but please check (sic) me on this. > > Note that x86_64 matcher already does similar thing for immediates: > > > // Long Immediate 32-bit unsigned > operand immUL32() > %{ > predicate(n->get_long() == (unsigned int) (n->get_long())); > match(ConL); > ... > %} > > instruct loadConUL32(rRegL dst, immUL32 src) > %{ > ... 
> format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} > ins_encode %{ > __ movl($dst$$Register, $src$$constant); > %} > %} > > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` > > Code sizes for `Hello World`, `-Xcomp`: > > > # Before > tier1 nmethod code size : 426208 bytes > tier2 nmethod code size : 462880 bytes > tier3 nmethod code size : 889992 bytes > tier4 nmethod code size : 1244448 bytes > > # After > tier1 nmethod code size : 425768 bytes (-0.1%) > tier2 nmethod code size : 462400 bytes (-0.1%) > tier3 nmethod code size : 882072 bytes (-0.8%) > tier4 nmethod code size : 1236448 bytes (-0.6%) What about next?: // src should NEVER be a real pointer. Use AddressLiteral for true pointers void MacroAssembler::movptr(Address dst, intptr_t src, Register rscratch) { if (is_simm32(src)) { movptr(dst, checked_cast(src)); ------------- PR Review: https://git.openjdk.org/jdk/pull/17343#pullrequestreview-1814156793 From mdoerr at openjdk.org Wed Jan 10 21:26:27 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 10 Jan 2024 21:26:27 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v12] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 11:49:05 GMT, Joachim Kern wrote: >> On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. >> >> This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. 
See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). >> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached handle. >> - Increase handle-local ref counter on open, decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with two additional commits since the last revision: > > - cosmetic changes > - cosmetic changes Looks correct to me. I'm not familiar with all AIX details, but they have been reviewed before. Test results look good. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16920#pullrequestreview-1814204628 From amenkov at openjdk.org Wed Jan 10 22:34:57 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 10 Jan 2024 22:34:57 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field [v3] In-Reply-To: References: Message-ID: > FieldStream/FilteredFieldStream classes from reflectionUtils.hpp iterate class fields in reverse order and access them by field index instead of iterating forward. This is inefficient (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). > The change introduces new class FilteredJavaFieldStream as a replacement for FilteredFieldStream. > It uses the same FilteredField/FilteredFieldsMap stuff as FilteredJavaFieldStream does.
> > FieldStream/FilteredFieldStream are still used by the heap walking API and will be cleaned up by [JDK-8317636](https://bugs.openjdk.org/browse/JDK-8317636) > > Testing: > - tier1..3 > - all tests which call GetClassFields: open/test/hotspot/jtreg/serviceability/jvmti,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/IterateThroughHeap,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/IsSynthetic > including > - test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007.java - tests that GetClassFields returns fields in correct order; > - test/hotspot/jtreg/serviceability/jvmti/GetClassFields/FilteredFields/FilteredFieldsTest.java - tests that GetClassFields filters out fields the way reflection does. Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: indent ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17094/files - new: https://git.openjdk.org/jdk/pull/17094/files/76df939f..259365d0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17094&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17094&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17094.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17094/head:pull/17094 PR: https://git.openjdk.org/jdk/pull/17094 From amenkov at openjdk.org Wed Jan 10 23:15:22 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 10 Jan 2024 23:15:22 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field [v2] In-Reply-To: <-CbIylmXFWX0z0oN8Ewxs6PH8E5puD_0JF0Kw6xqUFY=.92381e98-6c70-493e-9deb-44e86bdf6a70@github.com> References: <-CbIylmXFWX0z0oN8Ewxs6PH8E5puD_0JF0Kw6xqUFY=.92381e98-6c70-493e-9deb-44e86bdf6a70@github.com> Message-ID: On Wed, 10 Jan 2024 17:28:33 GMT, Chris Plummer wrote: >> FieldStream from reflectionUtils iterates fields in reverse order, so reversing again was previously needed here.
JavaFieldStream from fieldStreams (and the new FilteredJavaFieldStream) iterate in the order the fields actually occur, so this double-reversing isn't needed anymore. >> >> It's a bit confusing to have FilteredJavaFieldStream in reflectionUtils; eventually it would probably make sense to move the FilteredFieldsMap and FilteredjavaFieldStream into fieldStreams instead? > > Ok. I see now how the old code was actually reversing the order to undo the reversing that was already done. > I think the indent here should be 4, not 6. I updated indentation to be consistent with other multi-line statements in the file ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17094#discussion_r1448092338 From cjplummer at openjdk.org Wed Jan 10 23:46:22 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 10 Jan 2024 23:46:22 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field [v3] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 22:34:57 GMT, Alex Menkov wrote: >> FieldStream/FilteredFieldStream classes from reflectionUtils.hpp iterate class fields in the reverse order and use field indexes to access instead of forward iteration. This is performance ineffective (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). >> The change introduces new class FilteredJavaFieldStream as a replacement for FilteredFieldStream. >> It uses the same FilteredField/FilteredFieldsMap stuff as FilteredJavaFieldStream does. 
>> >> FieldStream/FilteredFieldStream are still used by heap walking API, will be cleaned by [JDK-8317636](https://bugs.openjdk.org/browse/JDK-8317636) >> >> Testing: >> - tier1..3 >> - all tests which calls GetClassFields: open/test/hotspot/jtreg/serviceability/jvmti,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/IterateThroughHeap,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/IsSynthetic >> including >> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007.java - tests that GetClassFields returns fields in correct order; >> - test/hotspot/jtreg/serviceability/jvmti/GetClassFields/FilteredFields/FilteredFieldsTest.java - test that GetClassFields filters out field like reflection. > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > indent Marked as reviewed by cjplummer (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17094#pullrequestreview-1814392197 From amenkov at openjdk.org Thu Jan 11 00:33:22 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 11 Jan 2024 00:33:22 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field [v2] In-Reply-To: References: <-CbIylmXFWX0z0oN8Ewxs6PH8E5puD_0JF0Kw6xqUFY=.92381e98-6c70-493e-9deb-44e86bdf6a70@github.com> Message-ID: On Wed, 10 Jan 2024 23:12:40 GMT, Alex Menkov wrote: >> Ok. I see now how the old code was actually reversing the order to undo the reversing that was already done. > >> I think the indent here should be 4, not 6. > > I updated indentation to be consistent with other multi-line statements in the file > It's a bit confusing to have FilteredJavaFieldStream in reflectionUtils; eventually it would probably make sense to move the FilteredFieldsMap and FilteredjavaFieldStream into fieldStreams instead? We have only 2 users of FilteredJavaFieldStream - GetClassFields and heap walking API implementation. 
fieldStreams is light-weight (it has only header files) and used in many places, and FilteredFieldsMap would add dependency on several additional headers. So prefer to keep all this stuff in reflectionUtils (at least for now). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17094#discussion_r1448139416 From amenkov at openjdk.org Thu Jan 11 00:36:23 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 11 Jan 2024 00:36:23 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field [v3] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 22:34:57 GMT, Alex Menkov wrote: >> FieldStream/FilteredFieldStream classes from reflectionUtils.hpp iterate class fields in the reverse order and use field indexes to access instead of forward iteration. This is performance ineffective (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). >> The change introduces new class FilteredJavaFieldStream as a replacement for FilteredFieldStream. >> It uses the same FilteredField/FilteredFieldsMap stuff as FilteredJavaFieldStream does. >> >> FieldStream/FilteredFieldStream are still used by heap walking API, will be cleaned by [JDK-8317636](https://bugs.openjdk.org/browse/JDK-8317636) >> >> Testing: >> - tier1..3 >> - all tests which calls GetClassFields: open/test/hotspot/jtreg/serviceability/jvmti,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/IterateThroughHeap,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/IsSynthetic >> including >> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007.java - tests that GetClassFields returns fields in correct order; >> - test/hotspot/jtreg/serviceability/jvmti/GetClassFields/FilteredFields/FilteredFieldsTest.java - test that GetClassFields filters out field like reflection. 
> > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > indent jdi/jdwp/jdb tests passed, tier4..6 are in progress ------------- PR Comment: https://git.openjdk.org/jdk/pull/17094#issuecomment-1885986523 From duke at openjdk.org Thu Jan 11 03:09:35 2024 From: duke at openjdk.org (duke) Date: Thu, 11 Jan 2024 03:09:35 GMT Subject: Withdrawn: JDK-8313764: Offer JVM HS functionality to shared lib load operations done by the JDK codebase In-Reply-To: References: Message-ID: On Mon, 14 Aug 2023 07:48:00 GMT, Matthias Baesken wrote: > Currently there is a number of functionality that would be interesting to have for shared lib load operations in the JDK C code. > Some examples : > Events::log_dll_message for hs-err files reporting > JFR event NativeLibraryLoad > There is the need to update the shared lib Cache on AIX ( see LoadedLibraries::reload() , see also https://bugs.openjdk.org/browse/JDK-8314152 ), > this is currently not fully in sync with libs loaded form jdk c-libs and sometimes reports outdated information > > Offer an interface (e.g. jvm.cpp) to support this. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/jdk/pull/15264 From duke at openjdk.org Thu Jan 11 05:18:28 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Thu, 11 Jan 2024 05:18:28 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v11] In-Reply-To: References: Message-ID: <01UyMgz6G1PdWOXQmWzp_L6-hosYtfJCj-4cu16TIRc=.3ab9b186-70e5-4045-adac-bf5cd2e65787@github.com> On Wed, 10 Jan 2024 15:45:42 GMT, Lei Zaakjyu wrote: >> 8234502: Merge GenCollectedHeap and SerialHeap > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > fix src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/PointerFinder.java line 115: > 113: } > 114: } > 115: } I don't know if I should restore these codes for TLAB pointer identification. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1448301029 From duke at openjdk.org Thu Jan 11 05:26:29 2024 From: duke at openjdk.org (Liming Liu) Date: Thu, 11 Jan 2024 05:26:29 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21] In-Reply-To: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> References: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> Message-ID: <5EWldiaGyRCcTrcjBgRD__x6z6b8Tx-jc3Gin0DGWbo=.17234cfb-1ad6-479e-af3e-45738476871a@github.com> On Thu, 28 Dec 2023 09:26:19 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). >> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. 
Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: >> >> >> >> >> >> >> >> >> >> >> >>
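>> | Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
>> |---|---|---|---|---|
>> | 4.18 | 11.30 | 11.30 | 0.25 | 0.25 |
>> | 5.13 | 0.22 | 0.22 | 3.42 | 3.42 |
>> | 6.1 | 0.27 | 0.33 | 3.54 | 0.33 |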
> > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Use pthread instead Let me restrict the discussion to the case that uses THP on Linux 5.14 or above, as the patch actually does not make changes in other cases. Personally, I don't see the correctness issue about concurrently using and madvising memory, and added a gtest case to cover it. The uses of pretouch do not have to be together with -Xmx = -Xms. Just mutators currently wait pretouch to be finished, and it takes time. This patch actually makes pretouch faster. The shredding issue can also be mitigated for the expanded memory after making using and pretouching concurrent, as 2MB huge pages have a good chance of being integral if them are not accessed by mutators yet, while pretouching via atomic-add makes all of the expanded memory shredded. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15781#issuecomment-1886263866 From cjplummer at openjdk.org Thu Jan 11 05:47:33 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 11 Jan 2024 05:47:33 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v11] In-Reply-To: <01UyMgz6G1PdWOXQmWzp_L6-hosYtfJCj-4cu16TIRc=.3ab9b186-70e5-4045-adac-bf5cd2e65787@github.com> References: <01UyMgz6G1PdWOXQmWzp_L6-hosYtfJCj-4cu16TIRc=.3ab9b186-70e5-4045-adac-bf5cd2e65787@github.com> Message-ID: On Thu, 11 Jan 2024 05:14:50 GMT, Lei Zaakjyu wrote: >> Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: >> >> fix > > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/PointerFinder.java line 115: > >> 113: } >> 114: } >> 115: } > > I don't know if I should restore these codes for TLAB pointer identification. Why not? From my understanding of the changes, all that should be needed for SA is to rename GenCollectedHeap to SerialHeap and move (most of) the contents of GenCollectedHeap.java to SerialHeap.java. 
I would use that as a starting point and see if everything works ok. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1448317595 From dholmes at openjdk.org Thu Jan 11 07:14:22 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 11 Jan 2024 07:14:22 GMT Subject: RFR: 8323297: Fix incorrect placement of precompiled.hpp include lines In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 10:52:33 GMT, Stefan Karlsson wrote: > So, it seems like the MS compiler previously just skipped the extra includes above the precompiled.hpp include line. From which I think we can conclude those includes are only needed when PCH is disabled. :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17326#issuecomment-1886428074 From jwaters at openjdk.org Thu Jan 11 08:07:25 2024 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 11 Jan 2024 08:07:25 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v5] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 13:53:47 GMT, Matthias Baesken wrote: > Hi Martin, probably we can update our devkit if really needed. But https://clang.llvm.org/cxx_status.html states that c++17 is supported for a very long time, so probably clang 13.1 is sufficient too (or is there a real showstopper known with this release of clang) . I was hoping to avoid 13.x since there seems to be a noexcept bug in that release series, though some other testing seems to suggest this is transient (and also I wanted to align with what Oracle uses, which is 14.x).
I guess I can roll back to 13.x if that is really needed > P0283R2: Ignoring unsupported non-standard attributes It's probably important to note that MSVC takes this to mean that unknown attributes don't have an effect, and still warns for them when warning C5030 is enabled (which is by default in our make system): https://developercommunity.visualstudio.com/t/c-warning-c5030-generated-for-attribute-within-a-n/138429 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1886575743 From stuefe at openjdk.org Thu Jan 11 08:23:40 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 11 Jan 2024 08:23:40 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21] In-Reply-To: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> References: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> Message-ID: On Thu, 28 Dec 2023 09:26:19 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). >> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: >> >> >> >> >> >> >> >> >> >> >> >>
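>> | Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
>> |---|---|---|---|---|
>> | 4.18 | 11.30 | 11.30 | 0.25 | 0.25 |
>> | 5.13 | 0.22 | 0.22 | 3.42 | 3.42 |
>> | 6.1 | 0.27 | 0.33 | 3.54 | 0.33 |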
> > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Use pthread instead I think this patch makes sense and should be done. Sorry for dropping the ball on this one. Some suggestions inline. I also would prefer to have a boolean product-diagnostic linux-only switch to disable the madvise pretouching, just to have something that can be disabled when problems appear at a customer and we want to quickly rule out that pretouching is the culprit. We can remove that switch after some time if no issues appeared. src/hotspot/share/runtime/os.cpp line 2118: > 2116: } > 2117: } > 2118: I suggest a slightly different flow, similar to how we do things in other areas: os.hpp private: bool pd_pretouch_memory(..); public: void pretouch_memory(..); os.cpp void os::pretouch_memory(..) { // Ask platform first if (pd_pretouch_memory(..)) { return; } ... do pretouching by touching } then provide a pd_pretouch for every platform; let other platforms be just a noop returning false, on Linux - if THPs are enabled and so forth, do the madvise and return true. One function less in the os namespace, and we don't call back from a pd_... function into a generic function which is unusual. test/hotspot/jtreg/runtime/os/TestTransparentHugePageUsage.java line 46: > 44: import java.util.regex.Matcher; > 45: import java.util.regex.Pattern; > 46: import jdk.test.lib.process.ProcessTools; Please add a comment describing what the test does. E.g. "Tests checks that a pretouched java heap appears to use THPs by checking AnonHugePages in smaps". Feel free to find a better formulation. Does the test fail without madvise, at least sporadically? So I wonder whether it would be better to run with SerialGC, to limit the pretouching to one thread. That would increase the time window needed for pretouching and give us a higher chance to observe those small pages that appear before khugepaged gets around merging them into THPs. 
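A minimal sketch of the "ask the platform first" flow suggested above, under stated assumptions: the namespace `pretouch_sketch`, the function-pointer hook, and the two example hooks are hypothetical stand-ins, not the HotSpot `os` declarations. The generic fallback here uses a plain volatile read and write-back where the real code is described as using an atomic add of zero.

```cpp
#include <cassert>
#include <cstddef>

namespace pretouch_sketch {

// Hypothetical platform hook: returns true if the platform layer pretouched
// [start, end) itself, false to request the generic fallback.
using PdPretouchFn = bool (*)(char* start, char* end, size_t page_size);

// What a platform without a special mechanism might provide: a no-op.
inline bool pd_pretouch_noop(char*, char*, size_t) { return false; }

// Stand-in for a platform that handles the range itself (for example via
// madvise on Linux); this sketch only reports success.
inline bool pd_pretouch_handled(char*, char*, size_t) { return true; }

// Generic entry point. Returns the number of pages touched by the fallback
// loop, or 0 when the platform hook handled the range.
inline size_t pretouch_memory(char* start, char* end, size_t page_size,
                              PdPretouchFn pd_pretouch) {
  assert(page_size > 0 && start <= end);
  // Ask the platform first.
  if (pd_pretouch(start, end, page_size)) {
    return 0;
  }
  // Fallback: touch one byte per page without changing its value.
  const size_t len = static_cast<size_t>(end - start);
  size_t touched = 0;
  for (size_t off = 0; off < len; off += page_size) {
    volatile char* vp = start + off;
    *vp = *vp;  // assumption: stands in for the atomic add of zero
    ++touched;
  }
  return touched;
}

} // namespace pretouch_sketch
```

With this shape the generic function never gets called back from the platform function; the platform hook only reports whether it did the work.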
------------- Changes requested by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15781#pullrequestreview-1814890863 PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1448437292 PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1448462880 From stuefe at openjdk.org Thu Jan 11 08:35:32 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 11 Jan 2024 08:35:32 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21] In-Reply-To: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> References: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> Message-ID: <_BwiPEQGivAMBltfX2w0QT51j62KH3uFc2mJMswHTJQ=.3b07eabe-c933-4fb3-9409-4b0fcb6e0e14@github.com> On Thu, 28 Dec 2023 09:26:19 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). >> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: >> >> >> >> >> >> >> >> >> >> >> >>
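>> | Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
>> |---|---|---|---|---|
>> | 4.18 | 11.30 | 11.30 | 0.25 | 0.25 |
>> | 5.13 | 0.22 | 0.22 | 3.42 | 3.42 |
>> | 6.1 | 0.27 | 0.33 | 3.54 | 0.33 |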
> > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Use pthread instead test/hotspot/jtreg/runtime/os/TestTransparentHugePageUsage.java line 96: > 94: .map(e -> Long.valueOf(e.getKey().substring(e.getValue().start(1), e.getValue().end(1)))); > 95: if (!usage.isPresent()) throw new RuntimeException("The usage of THP was not found."); > 96: if (usage.get() == 0) throw new RuntimeException("The usage of THP should not be zero."); The effect we would see without your patch would be small pages that are then converted to huge pages by khugepaged in its own time, right? So, maybe test that AnonHugePages == Size for the heap VMA? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1448478423 From aboldtch at openjdk.org Thu Jan 11 08:54:08 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 11 Jan 2024 08:54:08 GMT Subject: RFR: 8319773: Avoid inflating monitors when installing hash codes for LM_LIGHTWEIGHT [v12] In-Reply-To: References: <2MRTHFoYSaSW2NH922LOEvqKx4NLjshWaHJaYV2RdVY=.e234046a-aac8-4d7b-81b9-269506944165@github.com> Message-ID: On Tue, 9 Jan 2024 21:50:54 GMT, Daniel D. Daugherty wrote: >> There might be some confusion about what I am asking for here. >> >> This enhancement is to avoid inflating monitors when installing hash codes on objects with LM_LIGHTWEIGHT. The current state of the PR does this except for when it is racing with deflation. It is very possible to avoid inflating for the race as well. The question is not whether the race is handled, rather that it could be handled in such a way that installing a hash code would never cause monitor inflation. >> >> My question in this thread is whether we should handle this case. >> >> As already stated my opinion is let the race be handled by inflating and accept that we get some occasional `InflateCause::inflate_cause_hash_code` even with LM_LIGHTWEIGHT. 
But I do believe that there should be a comment about this. >> >> And if the consensus is to instead handle the race by retrying (and thus avoiding inflation completely), then we should split out the lightweight FastHashCode into its own loop. >> >>> We got into this spiraling thread because we were trying to figure out if a >>> non-JavaThread could call `inflate()` because `inflate()` can call `is_lock_owned()` >>> which has a header comment which talks about non-JavaThreads... >> >> I think it ultimately is because this enhancement claims to avoid inflating monitors, so why would `is_lock_owned()` be needed, but it is not the case as the current implementation does not handle the potential race with deflation. >> >> I wanted to add a comment to make it clear that this is known and intentional. > > Okay I've re-read this group of comments multiple times and above you wrote: > >> Regardless if we were to just go with it as it is now there should probably be a comment here along the line: > > >> // With LM_LIGHTWEIGHT FastHashCode may race with deflation here and cause a monitor to be re-inflated. > > > To fully understand whether the comment is correct or not, I need to understand > where the "here" is that you want to place this comment. You deleted old line 972 > thru old line 978. Is this new comment going to be a replacement for those lines? > Or will the new comment be somewhere else? Before the call to inflate which is the only control flow path which can observe the race. But it could also be a more general comment overarching the whole FastHashCode. The race occurs between reading the mark in FastHashCode and reading the mark in inflate. The ObjectMonitor found in the first mark read may differ from the one returned from inflate, and it may be that the current thread which called FastHashCode is the one that inflates the monitor. 
I believe we can leave this enhancement as is, it solves the common case by avoiding inflation when an object is lightweight locked. The race where a re-inflation may occur can be plugged as a separate enhancement. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16603#discussion_r1448504292 From shade at openjdk.org Thu Jan 11 08:56:47 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 11 Jan 2024 08:56:47 GMT Subject: RFR: 8323519: Add applications/ctw/modules to Hotspot tiered testing [v2] In-Reply-To: <3jbVl4E9L0TXjD3aRnQJ3Y81wqf5nnOUXOl2cI9UVns=.32290c86-9dda-4f32-927e-6f673a20ebec@github.com> References: <3jbVl4E9L0TXjD3aRnQJ3Y81wqf5nnOUXOl2cI9UVns=.32290c86-9dda-4f32-927e-6f673a20ebec@github.com> Message-ID: > Noticed that `applications/ctw/modules` is missing from current `tier{1,2,3,4}` hotspot definitions, since tier4 specifically excludes all applications. That exclusion was due to potentially unprepared dependencies that are needed for testing: for example jcstress tests would fail with the default configuration. But CTW for JDK modules works well out of the box, so we can add it somewhere in high tier, for example tier3. It should be useful to catch compiler bugs early, before running `hotspot:all`. > > These tests take quite a bit of time (~15 mins on my M1), so I opted to add them to relevant "slow" group that is run in tier3. 
> > Additional testing: > - [x] Checked that `tier3_compiler` runs CTW tests Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Adding directly to tier3_compiler ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17348/files - new: https://git.openjdk.org/jdk/pull/17348/files/b2694c10..630adcd4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17348&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17348&range=00-01 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17348.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17348/head:pull/17348 PR: https://git.openjdk.org/jdk/pull/17348 From shade at openjdk.org Thu Jan 11 08:56:48 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 11 Jan 2024 08:56:48 GMT Subject: RFR: 8323519: Add applications/ctw/modules to Hotspot tiered testing [v2] In-Reply-To: <6gyoiBYGr7WJudLVQOxMmbUT7CN5N0WoYHxlKi8L48Y=.b2ce9cca-5923-45b5-86c0-30b0f41361ee@github.com> References: <3jbVl4E9L0TXjD3aRnQJ3Y81wqf5nnOUXOl2cI9UVns=.32290c86-9dda-4f32-927e-6f673a20ebec@github.com> <6gyoiBYGr7WJudLVQOxMmbUT7CN5N0WoYHxlKi8L48Y=.b2ce9cca-5923-45b5-86c0-30b0f41361ee@github.com> Message-ID: On Wed, 10 Jan 2024 20:29:27 GMT, Vladimir Kozlov wrote: > This will duplicate testing for us in Oracle. Please, add it directly to `tier3_compiler` group instead. Now directly in `tier3_compiler`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17348#issuecomment-1886651412 From stefank at openjdk.org Thu Jan 11 08:57:36 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 11 Jan 2024 08:57:36 GMT Subject: RFR: 8322957: Generational ZGC: Relocation selection must join the STS Message-ID: The concurrent ZGC threads don't automatically participate in the safepoint protocol, which means that they can run concurrently with safepoint VM Operations. 
Instead they use other means to hook into the safepoint protocol whenever they need to make changes that could be racing with the various VM Operations. The most common way is to join the "suspendible thread set". For details around this see `SafepointSynchronize::begin` and the call to `Universe::heap()->safepoint_synchronize_begin()`. It turns out that the relocation selection phase was updated to call oop_iterate to modify oops of some of the objects. This was done without having the GC threads join the suspendible thread set. This means that various VM Operations could run concurrently with the oop_iterate. This caused the failure described in JDK-8322957: The JFR Leak Profiler modified the object header bits, while the GC's oop_iterate function used the same bits to determine if the oop iteration over an object should be skipped. This led to objects not being modified as they were supposed to, which led to broken oops and asserts. The fix is quite small and could be limited to the lines added to [src/hotspot/share/gc/z/zRelocationSet.cpp](https://github.com/openjdk/jdk/compare/master...stefank:jdk:8322957_sts_with_relocation_selection?expand=1#diff-883b7a72f757c1c5331769ad4a5c763335d0267ee33a0bc06896fa16d89ea58f). However, to lower the risk of reintroducing a bug like this again, we've added extra verification code. Some of the infrastructure to get the correct verification is placed outside of the GC code, and that's why this PR is sent to the hotspot-dev list. This has been tested with the reproducer of the original bug + tier1-7 on linux-x64-debug.
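The idea of joining a suspendible thread set around racy work, plus a verification hook, might be modelled roughly as below. This is a heavily simplified sketch and not HotSpot's implementation: `SuspendibleSetSketch`, `JoinerSketch`, and `current_thread_is_joined` are hypothetical names, and the model has no yield protocol, so a "safepoint" simply waits until every joined thread has left.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical model of a suspendible thread set.
class SuspendibleSetSketch {
  std::mutex _lock;
  std::condition_variable _cv;
  int _joined = 0;            // threads currently inside a joined scope
  bool _suspend_all = false;  // true while a "safepoint" is in progress

public:
  // A concurrent worker joins before touching data that VM operations
  // may also touch; it blocks while a safepoint is in progress.
  void join() {
    std::unique_lock<std::mutex> ml(_lock);
    _cv.wait(ml, [&] { return !_suspend_all; });
    ++_joined;
  }

  void leave() {
    std::lock_guard<std::mutex> ml(_lock);
    --_joined;
    _cv.notify_all();
  }

  // Called before a VM operation: stop new joins, wait for joined threads.
  void synchronize() {
    std::unique_lock<std::mutex> ml(_lock);
    _suspend_all = true;
    _cv.wait(ml, [&] { return _joined == 0; });
  }

  // Called after the VM operation: let waiting threads join again.
  void desynchronize() {
    std::lock_guard<std::mutex> ml(_lock);
    _suspend_all = false;
    _cv.notify_all();
  }

  int joined() {
    std::lock_guard<std::mutex> ml(_lock);
    return _joined;
  }
};

// Per-thread flag used only for verification in this sketch.
inline thread_local bool t_is_joined = false;

inline bool current_thread_is_joined() { return t_is_joined; }

// RAII scope: join on entry, leave on exit.
class JoinerSketch {
  SuspendibleSetSketch& _set;

public:
  explicit JoinerSketch(SuspendibleSetSketch& set) : _set(set) {
    _set.join();
    t_is_joined = true;
  }
  ~JoinerSketch() {
    t_is_joined = false;
    _set.leave();
  }
  JoinerSketch(const JoinerSketch&) = delete;
  JoinerSketch& operator=(const JoinerSketch&) = delete;
};
```

Code that must not race with VM operations could then assert `current_thread_is_joined()` on entry, which is the kind of check that makes a missing join fail fast in debug builds.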
------------- Commit messages: - 8322957: Generational ZGC: Relocation selection must join the STS Changes: https://git.openjdk.org/jdk/pull/17368/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17368&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322957 Stats: 164 lines in 14 files changed: 123 ins; 20 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/17368.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17368/head:pull/17368 PR: https://git.openjdk.org/jdk/pull/17368 From shade at openjdk.org Thu Jan 11 09:10:34 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 11 Jan 2024 09:10:34 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates [v2] In-Reply-To: References: Message-ID: > We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. > > There are a few interesting conversions along the way: > 1. `intptr_t` -> `uint32_t` (this method) > 2. `uint32_t` -> `int32_t` (argument conversion for `movl`) > 3. `int32_t` -> `uint32_t` (in `emit_int32`) > > I believe these are safe after `is_uimm32` check, but please check (sic) me on this. > > Note that x86_64 matcher already does similar thing for immediates: > > > // Long Immediate 32-bit unsigned > operand immUL32() > %{ > predicate(n->get_long() == (unsigned int) (n->get_long())); > match(ConL); > ... > %} > > instruct loadConUL32(rRegL dst, immUL32 src) > %{ > ... 
> format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} > ins_encode %{ > __ movl($dst$$Register, $src$$constant); > %} > %} > > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` > > Code sizes for `Hello World`, `-Xcomp`: > > > # Before > tier1 nmethod code size : 426208 bytes > tier2 nmethod code size : 462880 bytes > tier3 nmethod code size : 889992 bytes > tier4 nmethod code size : 1244448 bytes > > # After > tier1 nmethod code size : 425768 bytes (-0.1%) > tier2 nmethod code size : 462400 bytes (-0.1%) > tier3 nmethod code size : 882072 bytes (-0.8%) > tier4 nmethod code size : 1236448 bytes (-0.6%) Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Just do checked_cast ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17343/files - new: https://git.openjdk.org/jdk/pull/17343/files/261802ea..3f94218b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17343&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17343&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17343.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17343/head:pull/17343 PR: https://git.openjdk.org/jdk/pull/17343 From shade at openjdk.org Thu Jan 11 09:17:35 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 11 Jan 2024 09:17:35 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates [v3] In-Reply-To: References: Message-ID: > We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. 
Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. > > There are a few interesting conversions along the way: > 1. `intptr_t` -> `uint32_t` (this method) > 2. `uint32_t` -> `int32_t` (argument conversion for `movl`) > 3. `int32_t` -> `uint32_t` (in `emit_int32`) > > I believe these are safe after `is_uimm32` check, but please check (sic) me on this. > > Note that x86_64 matcher already does similar thing for immediates: > > > // Long Immediate 32-bit unsigned > operand immUL32() > %{ > predicate(n->get_long() == (unsigned int) (n->get_long())); > match(ConL); > ... > %} > > instruct loadConUL32(rRegL dst, immUL32 src) > %{ > ... > format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} > ins_encode %{ > __ movl($dst$$Register, $src$$constant); > %} > %} > > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` > > Code sizes for `Hello World`, `-Xcomp`: > > > # Before > tier1 nmethod code size : 426208 bytes > tier2 nmethod code size : 462880 bytes > tier3 nmethod code size : 889992 bytes > tier4 nmethod code size : 1244448 bytes > > # After > tier1 nmethod code size : 425768 bytes (-0.1%) > tier2 nmethod code size : 462400 bytes (-0.1%) > tier3 nmethod code size : 882072 bytes (-0.8%) > tier4 nmethod code size : 1236448 bytes (-0.6%) Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Revert "Just do checked_cast" This reverts commit 3f94218b46b6b0492ffcc24404b7bb5546b3318a. 
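The immediate-selection idea discussed in this PR can be sketched as follows. The names `MovKind`, `select_mov`, and `zero_extended_after_movl` are hypothetical, and the ordering of the checks (unsigned 32-bit first) is an assumption made for illustration, not a statement about the actual `MacroAssembler::movptr` code. The second function walks the three conversions listed in the description and then models the hardware zero-extension, so the round trip can be checked for values that pass the unsigned 32-bit test.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Which encoding a register load of a 64-bit immediate could use.
enum class MovKind {
  movl_zero_extend,  // 32-bit move; upper 32 bits are zeroed by the CPU
  movq_sign_extend,  // 64-bit move of a sign-extended 32-bit immediate
  mov64_full         // 64-bit move with a full 64-bit immediate
};

inline bool fits_simm32(int64_t x) {
  return x >= INT32_MIN && x <= INT32_MAX;
}

inline bool fits_uimm32(int64_t x) {
  return x >= 0 && x <= static_cast<int64_t>(UINT32_MAX);
}

// Assumed ordering: prefer the zero-extending 32-bit form when possible.
inline MovKind select_mov(int64_t imm) {
  if (fits_uimm32(imm)) return MovKind::movl_zero_extend;
  if (fits_simm32(imm)) return MovKind::movq_sign_extend;
  return MovKind::mov64_full;
}

// Models the conversion chain for the 32-bit path and the resulting
// register value. Only meaningful when fits_uimm32(imm) holds.
inline uint64_t zero_extended_after_movl(int64_t imm) {
  // 1. 64-bit signed -> 32-bit unsigned (modular, well defined).
  uint32_t as_u32 = static_cast<uint32_t>(imm);
  // 2. 32-bit unsigned -> 32-bit signed, bit-preserving. A plain cast
  //    behaves the same on two's-complement targets.
  int32_t as_s32;
  std::memcpy(&as_s32, &as_u32, sizeof(as_s32));
  // 3. 32-bit signed -> 32-bit unsigned when the bytes are emitted.
  uint32_t emitted = static_cast<uint32_t>(as_s32);
  // The CPU zero-extends a 32-bit register write to 64 bits.
  return static_cast<uint64_t>(emitted);
}
```

For a value such as 0x80000000, which fits in unsigned but not signed 32 bits, the round trip preserves the bit pattern, which is the property the description asks reviewers to double-check.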
------------- Changes: - all: https://git.openjdk.org/jdk/pull/17343/files - new: https://git.openjdk.org/jdk/pull/17343/files/3f94218b..f50510ce Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17343&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17343&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17343.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17343/head:pull/17343 PR: https://git.openjdk.org/jdk/pull/17343 From shade at openjdk.org Thu Jan 11 09:32:21 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 11 Jan 2024 09:32:21 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates [v3] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 20:54:29 GMT, Vladimir Kozlov wrote: > What about next?: > > ``` > // src should NEVER be a real pointer. Use AddressLiteral for true pointers > void MacroAssembler::movptr(Address dst, intptr_t src, Register rscratch) { > if (is_simm32(src)) { > movptr(dst, checked_cast(src)); > ``` Ah, hm! That is the version with `Address dst`. I don't think this zero-extending trick with `movl` and 32-bit destination register works here, because there is no register. We need the affect the full 64-bit of memory, which is what `movptr(Address, int32_t)` -> `movslq` would already give. I don't see any other missing shortcuts here. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17343#issuecomment-1886708920 From kbarrett at openjdk.org Thu Jan 11 09:44:27 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 11 Jan 2024 09:44:27 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v5] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 08:04:40 GMT, Julian Waters wrote: > > Hi Martin, probably we can update our devkit if really needed. 
But https://clang.llvm.org/cxx_status.html states that c++17 is supported for a very long time, so probably clang 13.1 is sufficient too (or is there a real showstopper known with this release of clang) . > > I was hoping to avoid 13.x since there seems to be a noexcept bug in that release series, though some other testing seems to suggest this is transient (and also I wanted to align with what Oracle uses, which is 14.x). I guess I can roll back to 13.x if that is really needed See https://bugs.openjdk.org/browse/JDK-8255082, comments in 12/2023. > > P0283R2: Ignoring unsupported non-standard attributes > > It's probably important to note that MSVC takes this to mean that unknown attributes don't have an effect, and still warns for them when warning C5030 is enabled (which is by default in our make system): https://developercommunity.visualstudio.com/t/c-warning-c5030-generated-for-attribute-within-a-n/138429 So VS doesn't have a mechanism for disabling warnings about just scoped attributes it doesn't recognize. (gcc has -Wno-attributes=_vendor_:: for this.) That's unfortunate. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1886727940 From stefank at openjdk.org Thu Jan 11 10:03:44 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 11 Jan 2024 10:03:44 GMT Subject: RFR: 8322957: Generational ZGC: Relocation selection must join the STS [v2] In-Reply-To: References: Message-ID: > The concurrent ZGC threads don't automatically participate in the safepoint protocol, which means that they can run concurrently with safepoint VM Operations. Instead they use other means to hook into the safepoint protocol whenever they need to make changes that could be racing with the various VM Operations. The most common way is to join the "suspendible thread set". For details around this see `SafepointSynchronize::begin` and the call to `Universe::heap()->safepoint_synchronize_begin()`. 
>
> It turns out that the relocation selection phase was updated to call oop_iterate to modify oops of some of the objects. This was done without having the GC threads join the suspendible thread set. This means that various VM Operations could run concurrently with the oop_iterate. This caused the failure described in JDK-8322957: The JFR Leak Profiler modified the object header bits, while the GC's oop_iterate function used the same bits to determine if the oop iteration over an object should be skipped. This led to objects not being modified as they were supposed to, which led to broken oops and asserts.
>
> The fix is quite small and could be limited to the lines added to [src/hotspot/share/gc/z/zRelocationSet.cpp](https://github.com/openjdk/jdk/compare/master...stefank:jdk:8322957_sts_with_relocation_selection?expand=1#diff-883b7a72f757c1c5331769ad4a5c763335d0267ee33a0bc06896fa16d89ea58f). However, to lower the risk of reintroducing a bug like this again, we've added extra verification code. Some of the infrastructure to get the correct verification is placed outside of the GC code, and that's why this PR is sent to the hotspot-dev list.
>
> This has been tested with the reproducer of the original bug + tier1-7 on linux-x64-debug.
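The idea of "joining the suspendible thread set" described above can be illustrated with a much simplified model. This is a sketch under stated assumptions, not the HotSpot implementation: `SuspendibleSetModel` and `ScopedJoiner` are hypothetical names, the model is built on the standard library rather than HotSpot's own locks, and it has no yield points, so the safepoint side simply waits until every joined worker has left.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Simplified, hypothetical model of a suspendible thread set (not HotSpot code).
class SuspendibleSetModel {
 public:
  // Worker side: enter the set, waiting first if a safepoint is in progress.
  void join() {
    std::unique_lock<std::mutex> lock(_mutex);
    _cv.wait(lock, [this] { return !_suspend_all; });
    ++_joined;
  }

  // Worker side: leave the set and wake a waiting safepoint, if any.
  void leave() {
    std::lock_guard<std::mutex> lock(_mutex);
    --_joined;
    _cv.notify_all();
  }

  // Safepoint side: stop new joins and wait until no worker is inside.
  void synchronize() {
    std::unique_lock<std::mutex> lock(_mutex);
    _suspend_all = true;
    _cv.wait(lock, [this] { return _joined == 0; });
  }

  // Safepoint side: allow workers to join again.
  void desynchronize() {
    std::lock_guard<std::mutex> lock(_mutex);
    _suspend_all = false;
    _cv.notify_all();
  }

  int joined() {
    std::lock_guard<std::mutex> lock(_mutex);
    return _joined;
  }

 private:
  std::mutex _mutex;
  std::condition_variable _cv;
  int _joined = 0;
  bool _suspend_all = false;
};

// RAII helper: code that touches oops runs only while an instance is alive.
class ScopedJoiner {
 public:
  explicit ScopedJoiner(SuspendibleSetModel& set) : _set(set) { _set.join(); }
  ~ScopedJoiner() { _set.leave(); }
  ScopedJoiner(const ScopedJoiner&) = delete;
  ScopedJoiner& operator=(const ScopedJoiner&) = delete;

 private:
  SuspendibleSetModel& _set;
};
```

In this model, a worker that wraps its object iteration in a `ScopedJoiner` cannot overlap with code running between `synchronize()` and `desynchronize()`, which is the property the missing join failed to provide in the bug described above.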
Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Fix release builds ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17368/files - new: https://git.openjdk.org/jdk/pull/17368/files/54218a84..6e5664c2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17368&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17368&range=00-01 Stats: 5 lines in 2 files changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17368.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17368/head:pull/17368 PR: https://git.openjdk.org/jdk/pull/17368 From mli at openjdk.org Thu Jan 11 10:15:23 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 11 Jan 2024 10:15:23 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v3] In-Reply-To: References: <2LzKv6TzZ3ZJDuLOm1GpNcgoCCfZgOqEOtWDNRQs7O0=.2ced11c4-bba6-4e15-bfa1-f0ca06d53610@github.com> Message-ID: On Wed, 10 Jan 2024 03:02:09 GMT, Fei Yang wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> round 1 review > > Simply ran `micro:java.security.MessageDigests` JMH on my Lichee-pi-4a board, seems there is a small regression for the `MessageDigests.getAndDigest` (length = 64) case: > > > Before: > MessageDigests.digest SHA-1 64 DEFAULT thrpt 15 417.311 ? 2.686 ops/ms > MessageDigests.digest SHA-1 16384 DEFAULT thrpt 15 5.206 ? 0.008 ops/ms > MessageDigests.getAndDigest SHA-1 64 DEFAULT thrpt 15 404.769 ? 14.810 ops/ms > MessageDigests.getAndDigest SHA-1 16384 DEFAULT thrpt 15 5.106 ? 0.046 ops/ms > > After: > MessageDigests.digest SHA-1 64 DEFAULT thrpt 15 518.057 ? 5.935 ops/ms > MessageDigests.digest SHA-1 16384 DEFAULT thrpt 15 5.569 ? 0.009 ops/ms > MessageDigests.getAndDigest SHA-1 64 DEFAULT thrpt 15 378.184 ? 37.425 ops/ms > MessageDigests.getAndDigest SHA-1 16384 DEFAULT thrpt 15 5.515 ? 
0.017 ops/ms

@RealFYang My bad, I think I accidentally added some options in my test scripts, which made the perf data incorrect.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17130#issuecomment-1886781509

From mli at openjdk.org Thu Jan 11 10:29:24 2024
From: mli at openjdk.org (Hamlin Li)
Date: Thu, 11 Jan 2024 10:29:24 GMT
Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v4]
In-Reply-To: References: Message-ID:

On Wed, 10 Jan 2024 09:14:05 GMT, Hamlin Li wrote:

>> Hi,
>> Can you review this patch to implement SHA-1 intrinsic for riscv?
>> Thanks!
>>
>>
>> ## Test
>>
>> ### Functionality
>>
>> tests under `test/hotspot/jtreg/compiler/intrinsics/sha`
>> tests found via `find test/jdk -iname "*SHA1*.java"`
>>
>> ### Performance
>>
>> tested on `T-HEAD Light Lichee Pi 4A`
>>
>> benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`.
>>
>>
>> **perf data summary**
>>
>> tests   intrinsic (ns/op)   base (ns/op)   speed up (times)
>> o.o.b.java.security.MessageDigests.digest (64)   3454.207   12026.787   3.48
>> o.o.b.java.security.MessageDigests.digest (16384)   184063.834   1307913.534   7.11
>> o.o.b.java.security.MessageDigests.getAndDigest (64)   8260.011   17707.156   2.14
>> o.o.b.java.security.MessageDigests.getAndDigest (16384)   191325.246   1379660.864   7.21
>> o.o.b.javax.crypto.full.MacBench.mac (128)   8220.886   34101.577   4.15
>> o.o.b.javax.crypto.full.MacBench.mac (1024)   18006.955   107906.128   5.99
>> o.o.b.javax.crypto.small.MessageDigestBench.digest   11688843.558   82834313.280   7.09
>>
>>
>>
>> **raw perf data - when intrinsic is enabled**
>>
>> o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 489.860 ± 6.277 ns/op
>> o.o.b.java.security.GetMessageDigest.getInstance N/A N/A SHA-1 N/A N/A avgt 10 3477.197 ± 204.203 ns/op
>> o.o.b.java.security.GetMessageDigest.getInstanceWithProvider N/A N/A SHA-1 N/A N/A avgt 10 4111.164 ±
108.861 ns/op
>> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 10 3454.207 ± 53.924 ns/op
>> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 10 184063.834 ± 677.635 ns/op
>> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 10 8260.011 ± ...
>
> Hamlin Li has updated the pull request incrementally with two additional commits since the last revision:
>
> - remove tp/gp
> - refine code

The apparent regression in getAndDigest when size == 64 can be explained as follows:
1. The test MessageDigests.getAndDigest is actually j.s.MessageDigest.getInstance + j.s.MessageDigest.digest.
2. This patch makes no improvement to j.s.MessageDigest.getInstance, and during the jmh test its performance jitter is fairly large, as the dozens of runs below show.
3. This patch does improve j.s.MessageDigest.digest, but when size == 64 the improvement is not big enough to outweigh the jitter introduced by j.s.MessageDigest.getInstance.
4. Taken together, the performance "regression" in getAndDigest when size == 64 should be jitter introduced by j.s.MessageDigest.getInstance, which the dozens of runs below also show.

performance jitter of j.s.MessageDigest.getInstance

loop ... 1
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 674.589 ± 21.290 ns/op
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 715.351 ± 16.302 ns/op
loop ... 2
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 744.618 ± 14.846 ns/op
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 663.602 ± 15.462 ns/op
loop ... 3
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 695.022 ± 17.110 ns/op
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 684.499 ± 17.883 ns/op
loop ... 4
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 680.238 ± 15.415 ns/op
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 696.098 ± 13.663 ns/op
loop ... 5
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 676.999 ± 20.736 ns/op
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 678.875 ± 14.123 ns/op
loop ... 6
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 680.661 ± 15.219 ns/op
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 675.950 ± 16.287 ns/op
loop ... 7
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 666.744 ± 17.272 ns/op
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 690.439 ± 17.091 ns/op
loop ... 8
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 684.327 ± 16.777 ns/op
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 663.682 ± 16.646 ns/op
loop ... 9
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 688.047 ± 14.885 ns/op
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 677.252 ± 17.249 ns/op
loop ... 10
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 684.169 ± 15.243 ns/op
GetMessageDigest.getInstance SHA-1 N/A N/A avgt 20 756.130 ± 6.580 ns/op

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17130#issuecomment-1886805614
PR Comment: https://git.openjdk.org/jdk/pull/17130#issuecomment-1886807176

From mli at openjdk.org Thu Jan 11 10:34:25 2024
From: mli at openjdk.org (Hamlin Li)
Date: Thu, 11 Jan 2024 10:34:25 GMT
Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v4]
In-Reply-To: References: Message-ID:

On Wed, 10 Jan 2024 09:14:05 GMT, Hamlin Li wrote:

>> Hi,
>> Can you review this patch to implement SHA-1 intrinsic for riscv?
>> Thanks!
>>
>>
>> ## Test
>>
>> ### Functionality
>>
>> tests under `test/hotspot/jtreg/compiler/intrinsics/sha`
>> tests found via `find test/jdk -iname "*SHA1*.java"`
>>
>> ### Performance
>>
>> tested on `T-HEAD Light Lichee Pi 4A`
>>
>> benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`.
>> >> >> **perf data summary** >> >> tests intrinsic (ns/op) base (ns/op) speed up (times) >> o.o.b.java.security.MessageDigests.digest (64) 3454.207 12026.787 3.48 >> o.o.b.java.security.MessageDigests.digest (16384) 184063.834 1307913.534 7.11 >> o.o.b.java.security.MessageDigests.getAndDigest (64) 8260.011 17707.156 2.14 >> o.o.b.java.security.MessageDigests.getAndDigest (16384) 191325.246 1379660.864 7.21 >> o.o.b.javax.crypto.full.MacBench.mac (128) 8220.886 34101.577 4.15 >> o.o.b.javax.crypto.full.MacBench.mac (1024) 18006.955 107906.128 5.99 >> o.o.b.javax.crypto.small.MessageDigestBench.digest 11688843.558 82834313.280 7.09 >> >> >> >> **raw perf data - when intrinsic is enabled** >> >> o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 489.860 ? 6.277 ns/op >> o.o.b.java.security.GetMessageDigest.getInstance N/A N/A SHA-1 N/A N/A avgt 10 3477.197 ? 204.203 ns/op >> o.o.b.java.security.GetMessageDigest.getInstanceWithProvider N/A N/A SHA-1 N/A N/A avgt 10 4111.164 ? 108.861 ns/op >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 10 3454.207 ? 53.924 ns/op >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 10 184063.834 ? 677.635 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 10 8260.011 ? ... > > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: > > - remove tp/gp > - refine code performance jitter of MessageDigests.getAndDigest (when size == 64) introduced by j.s.MessageDigest.getInstance loop ... 1 MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2521.461 ? 16.450 ns/op MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2428.247 ? 10.967 ns/op loop ... 2 MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2436.408 ? 32.545 ns/op MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2424.811 ? 11.521 ns/op loop ... 
3
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2484.609 ± 9.321 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2462.019 ± 16.622 ns/op
loop ... 4
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2550.315 ± 32.915 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2459.512 ± 95.334 ns/op
loop ... 5
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2852.858 ± 45.638 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2472.366 ± 87.208 ns/op
loop ... 6
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2580.812 ± 41.330 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2452.006 ± 86.735 ns/op
loop ... 7
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2427.592 ± 44.410 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2437.358 ± 92.754 ns/op
loop ... 8
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2438.203 ± 20.321 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2633.727 ± 72.077 ns/op
loop ... 9
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2424.485 ± 24.730 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2464.582 ± 14.058 ns/op
loop ... 10
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2408.087 ± 34.954 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2464.284 ± 65.394 ns/op
loop ... 1
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2452.427 ± 12.069 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2451.976 ± 6.551 ns/op
loop ... 2
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2428.869 ± 8.818 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2520.389 ± 8.184 ns/op
loop ... 3
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2511.975 ± 16.673 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2509.494 ± 21.087 ns/op
loop ... 4
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2483.784 ± 16.029 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2512.870 ± 9.201 ns/op
loop ... 5
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2441.026 ± 9.222 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2431.875 ± 9.295 ns/op
loop ... 6
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2402.302 ± 9.737 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2474.198 ± 6.352 ns/op
loop ... 7
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2484.982 ± 13.996 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2444.898 ± 14.270 ns/op
loop ... 8
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2565.433 ± 10.722 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2469.290 ± 32.165 ns/op
loop ... 9
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2527.289 ± 18.710 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2480.432 ± 9.804 ns/op
loop ... 10
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2474.362 ± 23.856 ns/op
MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 20 2433.547 ± 30.935 ns/op

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17130#issuecomment-1886811336

From duke at openjdk.org Thu Jan 11 11:00:33 2024
From: duke at openjdk.org (duke)
Date: Thu, 11 Jan 2024 11:00:33 GMT
Subject: Withdrawn: 8319200: Don't use test thread factory in ProcessTools.createLimitedTestJavaProcessBuilder()
In-Reply-To: <3damdMQpRBrkUN2S32tBD0Tmrl2tmSqA31NniV8FzHU=.d3a36aa7-d5f4-4a63-b2ff-8b9b616a9637@github.com>
References: <3damdMQpRBrkUN2S32tBD0Tmrl2tmSqA31NniV8FzHU=.d3a36aa7-d5f4-4a63-b2ff-8b9b616a9637@github.com>
Message-ID:

On Wed, 1 Nov 2023 00:06:35 GMT, Leonid Mesnik wrote:

> Test thread factory is a mode similar to VM flags and should not be used in ProcessTools.createLimitedTestJavaProcessBuilder(). Only createTestJavaProcessBuilder() should use it like jtreg VM options.
>
> Adding the test thread factory requires the injection of arguments in the middle of the list.
I don't think it makes sense to modify arguments in several places so I replaced it with the flag isLimited and moved all modifications in createJavaProcessBuilder(). > > Testing tier1-5. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/16442 From mdoerr at openjdk.org Thu Jan 11 11:35:25 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 11 Jan 2024 11:35:25 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v5] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 13:11:38 GMT, Julian Waters wrote: >> Compile the JDK as C++17, enabling the use of all C++17 language features > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Remove unnecessary -std=c++17 option in Lib.gmk Regarding https://github.com/TheShermanTanker/jdk/actions/runs/7070564987/job/19247370401, could it be that the adlc build didn't get the correct C++ version flags? It doesn't look like a clang 13 specific problem. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1886947773 From ihse at openjdk.org Thu Jan 11 12:20:26 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 11 Jan 2024 12:20:26 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v5] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 11:33:07 GMT, Martin Doerr wrote: >> Julian Waters has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove unnecessary -std=c++17 option in Lib.gmk > > Regarding https://github.com/TheShermanTanker/jdk/actions/runs/7070564987/job/19247370401, could it be that the adlc build didn't get the correct C++ version flags? It doesn't look like a clang 13 specific problem. @TheRealMDoerr The adlc build is notoriously problematic, since it does not share the common flags set for JVM or JDK native compilation. :( So your suggestion sounds highly likely to me. 
Running with LOG=cmdlines will confirm this. (This can be done on GHA by manually starting a run, and setting the value of "Additional make arguments" to `LOG=cmdlines` or possibly `LOG=info,cmdlines`) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1887040102 From eosterlund at openjdk.org Thu Jan 11 12:23:26 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 11 Jan 2024 12:23:26 GMT Subject: RFR: 8322957: Generational ZGC: Relocation selection must join the STS [v2] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 10:03:44 GMT, Stefan Karlsson wrote: >> The concurrent ZGC threads don't automatically participate in the safepoint protocol, which means that they can run concurrently with safepoint VM Operations. Instead they use other means to hook into the safepoint protocol whenever they need to make changes that could be racing with the various VM Operations. The most common way is to join the "suspendible thread set". For details around this see `SafepointSynchronize::begin` and the call to `Universe::heap()->safepoint_synchronize_begin()`. >> >> It turns out that the relocation selection phase was updated to use a call oop_iterate, to modify oops of some of the objects. This was done without having the GC threads join the suspendible thread set. This means that various VM Operations could run concurrently with the oop_iterate. This caused the failure described in JDK-8322957: The JFR Leak Profiler modified the object header bits, while the GC's oop_iterate function used the same bits to determine if the oop iteration over an object should be skipped. This lead to objects not being modified as they were supposed to, which lead to broken oops and asserts. 
>> >> The fix is quite small and could be limited to the lines added to [src/hotspot/share/gc/z/zRelocationSet.cpp](https://github.com/openjdk/jdk/compare/master...stefank:jdk:8322957_sts_with_relocation_selection?expand=1#diff-883b7a72f757c1c5331769ad4a5c763335d0267ee33a0bc06896fa16d89ea58f). However, to lower the risk of reintroducing a bug like this again, we've added extra verification code. Some of the infrastructure to get the correct verification is placed outside of the GC code, and that's why this PR is sent to the hotspot-dev list. >> >> This has been tested with the reproducer of the original bug + tier1-7 on linux-x64-debug. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Fix release builds Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17368#pullrequestreview-1815420747 From ihse at openjdk.org Thu Jan 11 12:25:27 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 11 Jan 2024 12:25:27 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v5] In-Reply-To: References: Message-ID: <7ArB9xSHqaTe7CYyyby5nwaZPwOfRtiJe9vWB505iCA=.c56b7cef-58da-4ec3-b3bb-8131798ea087@github.com> On Thu, 11 Jan 2024 11:33:07 GMT, Martin Doerr wrote: >> Julian Waters has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove unnecessary -std=c++17 option in Lib.gmk > > Regarding https://github.com/TheShermanTanker/jdk/actions/runs/7070564987/job/19247370401, could it be that the adlc build didn't get the correct C++ version flags? It doesn't look like a clang 13 specific problem. @TheRealMDoerr > The only issue I see is requiring clang 14.0 on MacOS is not in sync with "Other JDK 22 build platforms" (https://wiki.openjdk.org/display/Build/Supported+Build+Platforms). 
That page is supposed to document what we actually do, not be a binding contract; so if we change stuff, we update the page to reflect it, rather than the other way around.

Or maybe I misunderstood your comment?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1887044785

From ihse at openjdk.org Thu Jan 11 12:25:29 2024
From: ihse at openjdk.org (Magnus Ihse Bursie)
Date: Thu, 11 Jan 2024 12:25:29 GMT
Subject: RFR: 8314488: Compile the JDK as C++17 [v5]
In-Reply-To: References: Message-ID:

On Wed, 10 Jan 2024 13:11:38 GMT, Julian Waters wrote:

>> Compile the JDK as C++17, enabling the use of all C++17 language features
>
> Julian Waters has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove unnecessary -std=c++17 option in Lib.gmk

Also please note that if the minimum versions of the compilers are bumped in the configure script, the documentation in doc/building.md needs to be updated to match this as well. (And the building.html file needs to be regenerated.)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1887049046

From mdoerr at openjdk.org Thu Jan 11 12:39:25 2024
From: mdoerr at openjdk.org (Martin Doerr)
Date: Thu, 11 Jan 2024 12:39:25 GMT
Subject: RFR: 8314488: Compile the JDK as C++17 [v5]
In-Reply-To: References: Message-ID: <6-D8O4HRLQMAB7ScBXQ1nxzQhqePbELmj1MQJ_Id928=.65d89b70-2d16-407e-97f4-666290f7f083@github.com>

On Thu, 11 Jan 2024 11:33:07 GMT, Martin Doerr wrote:

>> Julian Waters has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Remove unnecessary -std=c++17 option in Lib.gmk
>
> Regarding https://github.com/TheShermanTanker/jdk/actions/runs/7070564987/job/19247370401, could it be that the adlc build didn't get the correct C++ version flags? It doesn't look like a clang 13 specific problem.
> @TheRealMDoerr > > > The only issue I see is requiring clang 14.0 on MacOS is not in sync with "Other JDK 22 build platforms" (https://wiki.openjdk.org/display/Build/Supported+Build+Platforms). > > That page is suppose to document what we actually do, not be a binding contract; so if we change stuff, we update the page to reflect it, rather than the other way around. > > Or maybe I misunderstood your comment? Correct, but raising requirements requires extra effort to change the build environments, updating docs, etc. (It may even cause incompatibilities. Probably not in this case.) While it may be better to use a newer Xcode on Mac, I can't see sufficient reason for forcing the whole world to build with clang 14. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1887072454 From mdoerr at openjdk.org Thu Jan 11 12:50:24 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 11 Jan 2024 12:50:24 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v5] In-Reply-To: <6-D8O4HRLQMAB7ScBXQ1nxzQhqePbELmj1MQJ_Id928=.65d89b70-2d16-407e-97f4-666290f7f083@github.com> References: <6-D8O4HRLQMAB7ScBXQ1nxzQhqePbELmj1MQJ_Id928=.65d89b70-2d16-407e-97f4-666290f7f083@github.com> Message-ID: On Thu, 11 Jan 2024 12:36:31 GMT, Martin Doerr wrote: >> Regarding https://github.com/TheShermanTanker/jdk/actions/runs/7070564987/job/19247370401, could it be that the adlc build didn't get the correct C++ version flags? It doesn't look like a clang 13 specific problem. > >> @TheRealMDoerr >> >> > The only issue I see is requiring clang 14.0 on MacOS is not in sync with "Other JDK 22 build platforms" (https://wiki.openjdk.org/display/Build/Supported+Build+Platforms). >> >> That page is suppose to document what we actually do, not be a binding contract; so if we change stuff, we update the page to reflect it, rather than the other way around. >> >> Or maybe I misunderstood your comment? 
> > Correct, but raising requirements requires extra effort to change the build environments, updating docs, etc. (It may even cause incompatibilities. Probably not in this case.) While it may be better to use a newer Xcode on Mac, I can't see sufficient reason for forcing the whole world to build with clang 14. > @TheRealMDoerr The adlc build is notoriously problematic, since it does not share the common flags set for JVM or JDK native compilation. :( So your suggestion sounds highly likely to me. Running with LOG=cmdlines will confirm this. > > (This can be done on GHA by manually starting a run, and setting the value of "Additional make arguments" to `LOG=cmdlines` or possibly `LOG=info,cmdlines`) Thanks for the hint! The command line is also shown here: make-support/failure-logs/hotspot_variant-server_tools_adlc_objs_adlparse.o.cmdline The -std option is not passed. That seems to be the issue. So, this is not a clang 13 vs 14 thing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1887095112 From jwaters at openjdk.org Thu Jan 11 12:56:29 2024 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 11 Jan 2024 12:56:29 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v5] In-Reply-To: References: <6-D8O4HRLQMAB7ScBXQ1nxzQhqePbELmj1MQJ_Id928=.65d89b70-2d16-407e-97f4-666290f7f083@github.com> Message-ID: On Thu, 11 Jan 2024 12:47:25 GMT, Martin Doerr wrote: > @TheRealMDoerr The adlc build is notoriously problematic, since it does not share the common flags set for JVM or JDK native compilation. :( So your suggestion sounds highly likely to me. Running with LOG=cmdlines will confirm this. > > (This can be done on GHA by manually starting a run, and setting the value of "Additional make arguments" to `LOG=cmdlines` or possibly `LOG=info,cmdlines`) Doesn't ADLC share the same compilation standard options as the rest of the codebase though? 
https://github.com/openjdk/jdk/blob/e5aed6be7a184a86a32fa671d48e0781fab54183/make/autoconf/flags-cflags.m4#L587

> > @TheRealMDoerr The adlc build is notoriously problematic, since it does not share the common flags set for JVM or JDK native compilation. :( So your suggestion sounds highly likely to me. Running with LOG=cmdlines will confirm this.
> > (This can be done on GHA by manually starting a run, and setting the value of "Additional make arguments" to `LOG=cmdlines` or possibly `LOG=info,cmdlines`)
> > Thanks for the hint! The command line is also shown here: make-support/failure-logs/hotspot_variant-server_tools_adlc_objs_adlparse.o.cmdline The -std option is not passed. That seems to be the issue. So, this is not a clang 13 vs 14 thing.

Something is very wrong in that case; they're supposed to be set here: https://github.com/openjdk/jdk/blob/e5aed6be7a184a86a32fa671d48e0781fab54183/make/hotspot/gensrc/GensrcAdlc.gmk#L54

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1887106666

From ihse at openjdk.org Thu Jan 11 13:01:26 2024
From: ihse at openjdk.org (Magnus Ihse Bursie)
Date: Thu, 11 Jan 2024 13:01:26 GMT
Subject: RFR: 8314488: Compile the JDK as C++17 [v5]
In-Reply-To: References: Message-ID:

On Wed, 10 Jan 2024 13:11:38 GMT, Julian Waters wrote:

>> Compile the JDK as C++17, enabling the use of all C++17 language features
>
> Julian Waters has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove unnecessary -std=c++17 option in Lib.gmk

There is a typo in adlc:

diff --git a/make/hotspot/gensrc/GensrcAdlc.gmk b/make/hotspot/gensrc/GensrcAdlc.gmk
index 0898d91e1c2..bb356476847 100644
--- a/make/hotspot/gensrc/GensrcAdlc.gmk
+++ b/make/hotspot/gensrc/GensrcAdlc.gmk
@@ -51,7 +51,7 @@ ifeq ($(call check-jvm-feature, compiler2), true)
 endif

 # Set the C++ standard
- ADLC_CFLAGS += $(ADLC_LANGSTD_CXXFLAG)
+ ADLC_CFLAGS += $(ADLC_LANGSTD_CXXFLAGS)

 # NOTE: The old build didn't set -DASSERT for windows but it doesn't seem to
 # hurt.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1887114509

From jwaters at openjdk.org Thu Jan 11 13:07:30 2024
From: jwaters at openjdk.org (Julian Waters)
Date: Thu, 11 Jan 2024 13:07:30 GMT
Subject: RFR: 8314488: Compile the JDK as C++17 [v5]
In-Reply-To: References: Message-ID:

On Thu, 11 Jan 2024 12:58:43 GMT, Magnus Ihse Bursie wrote:

> There is a typo in adlc:
>
> ```
> diff --git a/make/hotspot/gensrc/GensrcAdlc.gmk b/make/hotspot/gensrc/GensrcAdlc.gmk
> index 0898d91e1c2..bb356476847 100644
> --- a/make/hotspot/gensrc/GensrcAdlc.gmk
> +++ b/make/hotspot/gensrc/GensrcAdlc.gmk
> @@ -51,7 +51,7 @@ ifeq ($(call check-jvm-feature, compiler2), true)
> endif
>
> # Set the C++ standard
> - ADLC_CFLAGS += $(ADLC_LANGSTD_CXXFLAG)
> + ADLC_CFLAGS += $(ADLC_LANGSTD_CXXFLAGS)
>
> # NOTE: The old build didn't set -DASSERT for windows but it doesn't seem to
> # hurt.
> ```

*Facepalm... I can't believe I missed something so obvious

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1887124280

From sspitsyn at openjdk.org Thu Jan 11 13:09:39 2024
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Thu, 11 Jan 2024 13:09:39 GMT
Subject: RFR: 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static [v2]
In-Reply-To: References: Message-ID:

> The notification method `VirtualThread.notifyJvmtiDisableSuspend` should be static.
> The method disables/enables suspend of the current virtual thread, a no-op if the current thread is a platform thread. It is confusing for this to be an instance method; it should be static to make it clearer that it doesn't change the target thread.
> The notification method `VirtualThread.notifyJvmtiHideFrames` also has to be static as it does not use/need the virtual thread `this` argument.
> One detail to underline is that the intrinsic implementation needs to use the argument #0 instead of #1.
> > Testing: > - The mach5 tiers 1-6 show no regressions Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge - 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17298/files - new: https://git.openjdk.org/jdk/pull/17298/files/2b684607..66bcce95 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17298&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17298&range=00-01 Stats: 17642 lines in 345 files changed: 12363 ins; 3111 del; 2168 mod Patch: https://git.openjdk.org/jdk/pull/17298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17298/head:pull/17298 PR: https://git.openjdk.org/jdk/pull/17298 From kbarrett at openjdk.org Thu Jan 11 13:12:55 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 11 Jan 2024 13:12:55 GMT Subject: RFR: 8316930: HotSpot should use noexcept instead of throw() [v2] In-Reply-To: References: <5ctCpwKXcy9ywwvThRNzl6s_Bn7rHWMFtXdmqWbjq50=.eedf46de-165a-4e7e-b2d2-dcf5ce5d153a@github.com> <9ySX6s0Gi1jjMBCXmNaaZvkW-2bEnGkC6lpM0jL2sCM=.09a59d30-fb58-4a14-bac7-2640b7bfc5bb@github.com> Message-ID: On Tue, 26 Sep 2023 07:29:18 GMT, Julian Waters wrote: > > The change from `throw()` to `noexcept` seems reasonable though I assume we have to first approve this via the Hostpot style guide? > > There are places you are adding `noexcept` when there is no `throw()` and that seems inappropriate for this PR - unless they are required because of some transitive rule application? > > Thanks. > > Oops, those places where noexcept slipped through are from an experimental branch where I enabled exceptions for HotSpot, they got in by mistake, my apologies. 
I've moved this to draft for now, since Apple Clang seems to be having a hard time with the noexcept specifiers. Not sure about the Style Guide, maybe I should ask @kimbarrett? I've added https://bugs.openjdk.org/browse/JDK-8255082 as a blocker for this bug. As noted there, addressing that is not as simple as just saying "yes, we can use noexcept". And no, we're not going to enable exceptions in HotSpot. I'd be interested in hearing what problems you've run into with Apple clang with respect to noexcept. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15910#issuecomment-1740158526 From dholmes at openjdk.org Thu Jan 11 13:12:54 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 11 Jan 2024 13:12:54 GMT Subject: RFR: 8316930: HotSpot should use noexcept instead of throw() [v2] In-Reply-To: <5ctCpwKXcy9ywwvThRNzl6s_Bn7rHWMFtXdmqWbjq50=.eedf46de-165a-4e7e-b2d2-dcf5ce5d153a@github.com> References: <5ctCpwKXcy9ywwvThRNzl6s_Bn7rHWMFtXdmqWbjq50=.eedf46de-165a-4e7e-b2d2-dcf5ce5d153a@github.com> Message-ID: <9ySX6s0Gi1jjMBCXmNaaZvkW-2bEnGkC6lpM0jL2sCM=.09a59d30-fb58-4a14-bac7-2640b7bfc5bb@github.com> On Tue, 26 Sep 2023 03:50:14 GMT, Julian Waters wrote: >> throw() has been deprecated since C++11 alongside dynamic exception specifications, we should replace all instances of it with noexcept to prepare HotSpot for later versions of C++ > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Nevermind, this looks better The change from `throw()` to `noexcept` seems reasonable though I assume we have to first approve this via the HotSpot style guide? There are places you are adding `noexcept` when there is no `throw()` and that seems inappropriate for this PR - unless they are required because of some transitive rule application? Thanks. I will ping @kimbarrett internally.
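For readers following the `throw()` to `noexcept` discussion, here is a minimal sketch of the kind of declaration change involved. It assumes a hypothetical class `DemoArenaObj` that is purely illustrative and is not taken from HotSpot or adlc. On a class-specific `operator new`, a non-throwing specification also means the new-expression checks the returned pointer for null before running the constructor. Note that `noexcept` is a keyword only from C++11 on, so a compiler invoked without a suitable `-std` option may fail to parse it, which could be related to the Apple Clang errors mentioned in this thread (that link is an assumption, not something the messages here confirm).

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical class, for illustration only; not HotSpot code.
class DemoArenaObj {
 public:
  // Before (deprecated since C++11, removed in C++20):
  //   void* operator new(size_t size) throw();
  // After:
  void* operator new(std::size_t size) noexcept {
    return std::malloc(size);  // may return nullptr; never throws
  }
  void operator delete(void* p) noexcept { std::free(p); }

  int value = 42;
};

// The noexcept operator reports the exception specification at compile time.
static_assert(noexcept(DemoArenaObj::operator new(sizeof(DemoArenaObj))),
              "allocation function must be non-throwing");
```

Because the allocation function is non-throwing, callers of `new DemoArenaObj()` in this sketch would have to handle a null result themselves.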
------------- PR Review: https://git.openjdk.org/jdk/pull/15910#pullrequestreview-1643532848 PR Comment: https://git.openjdk.org/jdk/pull/15910#issuecomment-1738360697 From jwaters at openjdk.org Thu Jan 11 13:12:55 2024 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 11 Jan 2024 13:12:55 GMT Subject: RFR: 8316930: HotSpot should use noexcept instead of throw() [v2] In-Reply-To: <9ySX6s0Gi1jjMBCXmNaaZvkW-2bEnGkC6lpM0jL2sCM=.09a59d30-fb58-4a14-bac7-2640b7bfc5bb@github.com> References: <5ctCpwKXcy9ywwvThRNzl6s_Bn7rHWMFtXdmqWbjq50=.eedf46de-165a-4e7e-b2d2-dcf5ce5d153a@github.com> <9ySX6s0Gi1jjMBCXmNaaZvkW-2bEnGkC6lpM0jL2sCM=.09a59d30-fb58-4a14-bac7-2640b7bfc5bb@github.com> Message-ID: On Tue, 26 Sep 2023 07:23:05 GMT, David Holmes wrote: > The change from `throw()` to `noexcept` seems reasonable though I assume we have to first approve this via the Hostpot style guide? > > There are places you are adding `noexcept` when there is no `throw()` and that seems inappropriate for this PR - unless they are required because of some transitive rule application? > > Thanks. Oops, those places where noexcept slipped through are from an experimental branch where I enabled exceptions for HotSpot, they got in by mistake, my apologies. I've moved this to draft for now, since Apple Clang seems to be having a hard time with the noexcept specifiers. Not sure about the Style Guide, maybe I should ask @kimbarrett? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15910#issuecomment-1734982536 From jwaters at openjdk.org Thu Jan 11 13:12:55 2024 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 11 Jan 2024 13:12:55 GMT Subject: RFR: 8316930: HotSpot should use noexcept instead of throw() [v2] In-Reply-To: References: <5ctCpwKXcy9ywwvThRNzl6s_Bn7rHWMFtXdmqWbjq50=.eedf46de-165a-4e7e-b2d2-dcf5ce5d153a@github.com> <9ySX6s0Gi1jjMBCXmNaaZvkW-2bEnGkC6lpM0jL2sCM=.09a59d30-fb58-4a14-bac7-2640b7bfc5bb@github.com> Message-ID: On Fri, 29 Sep 2023 00:32:11 GMT, Kim Barrett wrote: > > > The change from `throw()` to `noexcept` seems reasonable though I assume we have to first approve this via the HotSpot style guide? > > > There are places you are adding `noexcept` when there is no `throw()` and that seems inappropriate for this PR - unless they are required because of some transitive rule application? > > > Thanks. > > > > > > Oops, those places where noexcept slipped through are from an experimental branch where I enabled exceptions for HotSpot, they got in by mistake, my apologies. I've moved this to draft for now, since Apple Clang seems to be having a hard time with the noexcept specifiers. Not sure about the Style Guide, maybe I should ask @kimbarrett? > > I've added https://bugs.openjdk.org/browse/JDK-8255082 as a blocker for this bug. As noted there, addressing that is not as simple as just saying "yes, we can use noexcept". > > And no, we're not going to enable exceptions in HotSpot. > > I'd be interested in hearing what problems you've run into with Apple clang with respect to noexcept. No worries, the branch with the exceptions-enabled HotSpot is for my own personal use.
Apple Clang seems to have issues parsing noexcept in operator new, as seen in the tests for this PR: make[1]: *** [/Users/runner/work/jdk/jdk/make/Init.gmk:323: main] Error 2 === Output from failing command(s) repeated here === make: *** [/Users/runner/work/jdk/jdk/make/Init.gmk:[189](https://github.com/TheShermanTanker/jdk/actions/runs/6307800420/job/17125236356#step:8:191): product-bundles] Error 2 * For target hotspot_variant-server_tools_adlc_objs_adlArena.o: In file included from /Users/runner/work/jdk/jdk/src/hotspot/share/adlc/adlArena.cpp:25: In file included from /Users/runner/work/jdk/jdk/src/hotspot/share/adlc/adlc.hpp:88: /Users/runner/work/jdk/jdk/src/hotspot/share/adlc/adlArena.hpp:43:34: error: expected ';' at end of declaration list void* operator new(size_t size) noexcept; ^ ; /Users/runner/work/jdk/jdk/src/hotspot/share/adlc/adlArena.hpp:52:34: error: expected ';' at end of declaration list void* operator new(size_t size) noexcept; ^ ; /Users/runner/work/jdk/jdk/src/hotspot/share/adlc/adlArena.hpp:66:49: error: expected ';' at end of declaration list void* operator new(size_t size, size_t length) noexcept; ^ ; /Users/runner/work/jdk/jdk/src/hotspot/share/adlc/adlArena.cpp:47:68: error: expected function body after function declarator ... 
(rest of output omitted) * For target hotspot_variant-server_tools_adlc_objs_adlparse.o: In file included from /Users/runner/work/jdk/jdk/src/hotspot/share/adlc/adlparse.cpp:27: In file included from /Users/runner/work/jdk/jdk/src/hotspot/share/adlc/adlc.hpp:88: /Users/runner/work/jdk/jdk/src/hotspot/share/adlc/adlArena.hpp:43:34: error: expected ';' at end of declaration list void* operator new(size_t size) noexcept; ^ ; /Users/runner/work/jdk/jdk/src/hotspot/share/adlc/adlArena.hpp:52:34: error: expected ';' at end of declaration list void* operator new(size_t size) noexcept; ^ ; /Users/runner/work/jdk/jdk/src/hotspot/share/adlc/adlArena.hpp:66:49: error: expected ';' at end of declaration list void* operator new(size_t size, size_t length) noexcept; ^ ; 3 errors generated. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15910#issuecomment-1740210539 From jwaters at openjdk.org Thu Jan 11 13:12:53 2024 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 11 Jan 2024 13:12:53 GMT Subject: RFR: 8316930: HotSpot should use noexcept instead of throw() [v3] In-Reply-To: References: Message-ID: > throw() has been deprecated since C++11 alongside dynamic exception specifications, we should replace all instances of it with noexcept to prepare HotSpot for later versions of C++ Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - Typo in GensrcAdlc.gmk - Merge branch 'openjdk:master' into noexcept - Merge branch 'master' into noexcept - ic in compiledIC.hpp - Revert compiledIC.cpp - Revert compiledIC.hpp - Partially Revert parse.hpp - Merge branch 'master' into noexcept - Merge branch 'master' into noexcept - Nevermind, this looks better - ... 
and 2 more: https://git.openjdk.org/jdk/compare/e5aed6be...1fa42dca ------------- Changes: https://git.openjdk.org/jdk/pull/15910/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15910&range=02 Stats: 88 lines in 38 files changed: 0 ins; 0 del; 88 mod Patch: https://git.openjdk.org/jdk/pull/15910.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15910/head:pull/15910 PR: https://git.openjdk.org/jdk/pull/15910 From jwaters at openjdk.org Thu Jan 11 13:12:56 2024 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 11 Jan 2024 13:12:56 GMT Subject: RFR: 8316930: HotSpot should use noexcept instead of throw() [v2] In-Reply-To: <5ctCpwKXcy9ywwvThRNzl6s_Bn7rHWMFtXdmqWbjq50=.eedf46de-165a-4e7e-b2d2-dcf5ce5d153a@github.com> References: <5ctCpwKXcy9ywwvThRNzl6s_Bn7rHWMFtXdmqWbjq50=.eedf46de-165a-4e7e-b2d2-dcf5ce5d153a@github.com> Message-ID: On Tue, 26 Sep 2023 03:50:14 GMT, Julian Waters wrote: >> throw() has been deprecated since C++11 alongside dynamic exception specifications, we should replace all instances of it with noexcept to prepare HotSpot for later versions of C++ > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Nevermind, this looks better Something seems to be wrong with Apple Clang here, hmm... ------------- PR Comment: https://git.openjdk.org/jdk/pull/15910#issuecomment-1837438470 From jkern at openjdk.org Thu Jan 11 13:15:36 2024 From: jkern at openjdk.org (Joachim Kern) Date: Thu, 11 Jan 2024 13:15:36 GMT Subject: Integrated: JDK-8320890: [AIX] Find a better way to mimic dl handle equality In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 11:33:46 GMT, Joachim Kern wrote: > On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. > > This causes problems in the JVM with code that assumes equality of handles. 
One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). > > We propose a different, cleaner way of handling this: > > - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. > - Cache dl handles; repeated opening of a library should return the cached handle. > - Increase the handle-local ref counter on open, decrease it on close > - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). > > This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. This pull request has now been integrated.
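The caching scheme listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the integrated os_aix.cpp code: the class name `DlHandleCache`, its methods, and the choice to key the cache by path string are all hypothetical, and the real open/close calls are injected as callbacks so the sketch does not depend on any platform loader.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical sketch; names are illustrative, not the os_aix.cpp code.
class DlHandleCache {
 public:
  using Opener = std::function<void*(const std::string&)>;
  using Closer = std::function<void(void*)>;

  DlHandleCache(Opener open, Closer close)
      : _open(std::move(open)), _close(std::move(close)) {}

  // Returns the cached handle if the library is already open,
  // otherwise opens it. Each successful call adds one reference.
  void* load(const std::string& path) {
    std::lock_guard<std::mutex> guard(_lock);
    auto it = _entries.find(path);
    if (it != _entries.end()) {
      it->second.refcount++;
      return it->second.handle;
    }
    void* handle = _open(path);
    if (handle != nullptr) {
      _entries[path] = Entry{handle, 1};
    }
    return handle;
  }

  // Drops one reference; really closes the library on the last one.
  // Returns false for handles the cache does not know.
  bool unload(void* handle) {
    std::lock_guard<std::mutex> guard(_lock);
    for (auto it = _entries.begin(); it != _entries.end(); ++it) {
      if (it->second.handle == handle) {
        if (--it->second.refcount == 0) {
          _close(handle);
          _entries.erase(it);
        }
        return true;
      }
    }
    return false;
  }

 private:
  struct Entry {
    void* handle;
    int refcount;
  };
  Opener _open;
  Closer _close;
  std::mutex _lock;
  std::map<std::string, Entry> _entries;
};
```

With this shape, two loads of the same path yield the same handle even when the underlying loader would hand out a fresh one each time, which is the equality property callers rely on. It only works if every load is paired with an unload, as the last bullet above requires.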
Changeset: b8ae4a8c Author: Joachim Kern Committer: Martin Doerr URL: https://git.openjdk.org/jdk/commit/b8ae4a8c0985d1763ac48ba78943d8b992d7be77 Stats: 446 lines in 12 files changed: 332 ins; 108 del; 6 mod 8320890: [AIX] Find a better way to mimic dl handle equality Reviewed-by: stuefe, mdoerr ------------- PR: https://git.openjdk.org/jdk/pull/16920 From jwaters at openjdk.org Thu Jan 11 13:23:45 2024 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 11 Jan 2024 13:23:45 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: > Compile the JDK as C++17, enabling the use of all C++17 language features Julian Waters has updated the pull request incrementally with one additional commit since the last revision: Require clang 13 in toolchain.m4 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14988/files - new: https://git.openjdk.org/jdk/pull/14988/files/4f196292..50ffeeea Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14988&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14988&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14988.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14988/head:pull/14988 PR: https://git.openjdk.org/jdk/pull/14988 From roland at openjdk.org Thu Jan 11 13:46:32 2024 From: roland at openjdk.org (Roland Westrelin) Date: Thu, 11 Jan 2024 13:46:32 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v16] In-Reply-To: References: Message-ID: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> On Thu, 4 Jan 2024 12:52:52 GMT, Emanuel Peter wrote: >> As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that 
lock, so that the lock is not broken. >> >> I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a Java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. >> >> I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. >> >> I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. >> >> **Complications with ttyl** >> There were a few places in printing code, where we did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high rank. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. >> >> If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. >> >> **Testing** >> Testing: tier1-3 and stress. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > fixed typo src/hotspot/share/ci/ciMethodData.cpp line 96: > 94: // a safepoint. We temporarily release the lock and allow > 95: // safepoints, and revert that at the end of the scope: > 96: MutexUnlocker mu(_mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); Why is this safe?
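The buffering approach described in the quoted PR text ("Complications with ttyl") can be illustrated with a small sketch in standard C++. It is only an analogy under stated assumptions: `std::ostringstream` stands in for HotSpot's `stringStream`, the two `std::mutex` objects are hypothetical stand-ins for the tty lock and `extra_data_lock`, and taking an output lock for the final write is a choice made for this sketch rather than something the PR text specifies.

```cpp
#include <iostream>
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical stand-ins for the tty lock and the extra-data lock.
std::mutex demo_tty_lock;
std::mutex demo_data_lock;

// Format everything under the data lock into a private buffer...
std::string format_report(const int* values, int count) {
  std::ostringstream buffer;
  std::lock_guard<std::mutex> data_guard(demo_data_lock);
  for (int i = 0; i < count; i++) {
    buffer << "entry " << i << ": " << values[i] << "\n";
  }
  return buffer.str();
}

// ...then take the output lock only for a single write,
// so the two locks are never held at the same time.
void print_report(std::ostream& out, const int* values, int count) {
  const std::string text = format_report(values, count);
  std::lock_guard<std::mutex> tty_guard(demo_tty_lock);
  out << text;
}
```

Since the data lock is released before the output lock is taken, no lock-ordering constraint between the two arises, and the report still reaches the stream as one contiguous write.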
src/hotspot/share/ci/ciMethodData.cpp line 135: > 133: > 134: // Lock to read ProfileData, and ensure lock is not unintentionally broken by a safepoint > 135: MutexLocker ml(mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); Is there any way to have MutexLocker take care of verifying that there's no safepoint? It would be nice to replace: MutexLocker ml(); NoSafepointVerifier no_safepoint; by: MutexLocker ml(); only. src/hotspot/share/runtime/deoptimization.cpp line 2406: > 2404: reprofile = true; > 2405: } > 2406: Why is it safe to move this here? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1448886184 PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1448883447 PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1448873612 From aboldtch at openjdk.org Thu Jan 11 13:56:23 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 11 Jan 2024 13:56:23 GMT Subject: RFR: 8322957: Generational ZGC: Relocation selection must join the STS [v2] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 10:03:44 GMT, Stefan Karlsson wrote: >> The concurrent ZGC threads don't automatically participate in the safepoint protocol, which means that they can run concurrently with safepoint VM Operations. Instead they use other means to hook into the safepoint protocol whenever they need to make changes that could be racing with the various VM Operations. The most common way is to join the "suspendible thread set". For details around this see `SafepointSynchronize::begin` and the call to `Universe::heap()->safepoint_synchronize_begin()`. >> >> It turns out that the relocation selection phase was updated to use a call to oop_iterate, to modify oops of some of the objects. This was done without having the GC threads join the suspendible thread set. This means that various VM Operations could run concurrently with the oop_iterate.
This caused the failure described in JDK-8322957: The JFR Leak Profiler modified the object header bits, while the GC's oop_iterate function used the same bits to determine if the oop iteration over an object should be skipped. This led to objects not being modified as they were supposed to, which led to broken oops and asserts. >> >> The fix is quite small and could be limited to the lines added to [src/hotspot/share/gc/z/zRelocationSet.cpp](https://github.com/openjdk/jdk/compare/master...stefank:jdk:8322957_sts_with_relocation_selection?expand=1#diff-883b7a72f757c1c5331769ad4a5c763335d0267ee33a0bc06896fa16d89ea58f). However, to lower the risk of reintroducing a bug like this again, we've added extra verification code. Some of the infrastructure to get the correct verification is placed outside of the GC code, and that's why this PR is sent to the hotspot-dev list. >> >> This has been tested with the reproducer of the original bug + tier1-7 on linux-x64-debug. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Fix release builds lgtm. ------------- Marked as reviewed by aboldtch (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17368#pullrequestreview-1815640287 From mdoerr at openjdk.org Thu Jan 11 14:31:28 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 11 Jan 2024 14:31:28 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 13:23:45 GMT, Julian Waters wrote: >> Compile the JDK as C++17, enabling the use of all C++17 language features > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Require clang 13 in toolchain.m4 Thanks! We may switch to clang 14 on macOS at some point in time, but it's better to have that disentangled. Some people build JDK 11 and 23 on the same machine and that is easier if they don't have to switch Xcode.
------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14988#pullrequestreview-1815729317 From fparain at openjdk.org Thu Jan 11 15:45:26 2024 From: fparain at openjdk.org (Frederic Parain) Date: Thu, 11 Jan 2024 15:45:26 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field [v3] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 22:34:57 GMT, Alex Menkov wrote: >> FieldStream/FilteredFieldStream classes from reflectionUtils.hpp iterate class fields in the reverse order and use field indexes to access instead of forward iteration. This is inefficient (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). >> The change introduces a new class FilteredJavaFieldStream as a replacement for FilteredFieldStream. >> It uses the same FilteredField/FilteredFieldsMap stuff as FilteredFieldStream does. >> >> FieldStream/FilteredFieldStream are still used by the heap walking API and will be cleaned up by [JDK-8317636](https://bugs.openjdk.org/browse/JDK-8317636) >> >> Testing: >> - tier1..3 >> - all tests which call GetClassFields: open/test/hotspot/jtreg/serviceability/jvmti,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/IterateThroughHeap,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/IsSynthetic >> including >> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007.java - tests that GetClassFields returns fields in the correct order; >> - test/hotspot/jtreg/serviceability/jvmti/GetClassFields/FilteredFields/FilteredFieldsTest.java - tests that GetClassFields filters out fields like reflection does. > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > indent LGTM ------------- Marked as reviewed by fparain (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/17094#pullrequestreview-1815899711 From duke at openjdk.org Thu Jan 11 15:46:45 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Thu, 11 Jan 2024 15:46:45 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v12] In-Reply-To: References: Message-ID: > 8234502: Merge GenCollectedHeap and SerialHeap Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: restore and rename 'GenCollectedHeap' to 'SerialHeap' ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16927/files - new: https://git.openjdk.org/jdk/pull/16927/files/628de1a9..ae2817de Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=10-11 Stats: 64 lines in 2 files changed: 61 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16927.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16927/head:pull/16927 PR: https://git.openjdk.org/jdk/pull/16927 From dchuyko at openjdk.org Thu Jan 11 15:47:39 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 11 Jan 2024 15:47:39 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v21] In-Reply-To: References: Message-ID: <80-36XIRkWYvc-zxRxsYm-Puip2-Y2Cf6be206LgZpo=.9f7e32ee-f32e-4db5-b964-c400d5470139@github.com> > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. 
If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long-running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it, taking the new directives stack into account. Re-compilation helps to prevent hot methods from executing in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them.
> > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 39 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Deopt osr, cleanups - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - ... and 29 more: https://git.openjdk.org/jdk/compare/c2e77e2f...d1aec993 ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=20 Stats: 381 lines in 15 files changed: 348 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From epeter at openjdk.org Thu Jan 11 15:52:30 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 11 Jan 2024 15:52:30 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v16] In-Reply-To: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> References: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> Message-ID: <-_GYbSumFoJE1AkvI3bfWLSvJ2wy91g7ce30pG2EL6k=.9a15aa73-7186-48be-8f7b-47612029a0ee@github.com> On Thu, 11 Jan 2024 13:33:29 GMT, Roland Westrelin wrote: >> Emanuel Peter has updated the pull request incrementally with 
one additional commit since the last revision: >> >> fixed typo > > src/hotspot/share/runtime/deoptimization.cpp line 2406: > >> 2404: reprofile = true; >> 2405: } >> 2406: > > Why is it safe to move this here? Do you think it is not safe? We have exactly the same conditions: `make_not_entrant` and `pdata != nullptr` (the second can only hold if we take the current path which gets `pdata`, else it would be `nullptr` anyway). The only question is with if (!nm->make_not_entrant()) { return; // the call did not change nmethod's state } Hmm, maybe I need to copy that here too. But that's kinda ugly. Do you think that is necessary? Any other suggestion? The idea was to limit the scope of `pdata`. And I kinda have to do that because only in this path do I have a guarantee that `trap_mdo != nullptr`. Otherwise I need to somehow make the lock conditional in the outer scope, that is even more nasty. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1449058393 From epeter at openjdk.org Thu Jan 11 16:02:28 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 11 Jan 2024 16:02:28 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v16] In-Reply-To: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> References: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> Message-ID: On Thu, 11 Jan 2024 13:43:25 GMT, Roland Westrelin wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed typo > > src/hotspot/share/ci/ciMethodData.cpp line 96: > >> 94: // a safepoint. We temporarily release the lock and allow >> 95: // safepoints, and revert that at the end of the scope: >> 96: MutexUnlocker mu(_mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); > > Why is this safe?
We only do this in the `finish` method, where we hold no reference to any profiled-data anymore. We only really need to hold the lock during `clean_extra_data` and `is_live`. But after those are done, we can quickly release the lock so that we can call `get_method`. Does that make sense to you? I'm not super happy with the general pattern here... I basically kept the old pattern. I wonder, maybe there is a way to move the scope of the lock, such that we only need to lock inside of `clean_extra_data`, and do not hold it before we enter `clean_extra_data`. Do you think that would be preferable? > src/hotspot/share/ci/ciMethodData.cpp line 135: > >> 133: >> 134: // Lock to read ProfileData, and ensure lock is not unintentionally broken by a safepoint >> 135: MutexLocker ml(mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); > > Is there anyway to have MutexLocker take care of verifying that the there's no safepoint? It would be nice to replace: > > > MutexLocker ml(); > NoSafepointVerifier no_safepoint; > > > by: > > > MutexLocker ml(); > > > only. @fisk @tkrodriguez what do you suggest for that? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1449071256 PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1449072373 From kvn at openjdk.org Thu Jan 11 18:31:22 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 11 Jan 2024 18:31:22 GMT Subject: RFR: 8323519: Add applications/ctw/modules to Hotspot tiered testing [v2] In-Reply-To: References: <3jbVl4E9L0TXjD3aRnQJ3Y81wqf5nnOUXOl2cI9UVns=.32290c86-9dda-4f32-927e-6f673a20ebec@github.com> Message-ID: <1MXO04LDPZbW9-1PJmIo4KRhMkA81UX-dmFtiCoQlNA=.a30da13c-2d8b-453a-b0f1-41f328f9e8bf@github.com> On Thu, 11 Jan 2024 08:56:47 GMT, Aleksey Shipilev wrote: >> Noticed that `applications/ctw/modules` is missing from current `tier{1,2,3,4}` hotspot definitions, since tier4 specifically excludes all applications. 
That exclusion was due to potentially unprepared dependencies that are needed for testing: for example jcstress tests would fail with the default configuration. But CTW for JDK modules works well out of the box, so we can add it somewhere in high tier, for example tier3. It should be useful to catch compiler bugs early, before running `hotspot:all`. >> >> These tests take quite a bit of time (~15 mins on my M1), so I opted to add them to relevant "slow" group that is run in tier3. >> >> Additional testing: >> - [x] Checked that `tier3_compiler` runs CTW tests > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Adding directly to tier3_compiler Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17348#pullrequestreview-1816256599 From kvn at openjdk.org Thu Jan 11 18:32:21 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 11 Jan 2024 18:32:21 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates [v3] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 09:17:35 GMT, Aleksey Shipilev wrote: >> We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. >> >> There are a few interesting conversions along the way: >> 1. `intptr_t` -> `uint32_t` (this method) >> 2. `uint32_t` -> `int32_t` (argument conversion for `movl`) >> 3. 
`int32_t` -> `uint32_t` (in `emit_int32`) >> >> I believe these are safe after `is_uimm32` check, but please check (sic) me on this. >> >> Note that x86_64 matcher already does similar thing for immediates: >> >> >> // Long Immediate 32-bit unsigned >> operand immUL32() >> %{ >> predicate(n->get_long() == (unsigned int) (n->get_long())); >> match(ConL); >> ... >> %} >> >> instruct loadConUL32(rRegL dst, immUL32 src) >> %{ >> ... >> format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} >> ins_encode %{ >> __ movl($dst$$Register, $src$$constant); >> %} >> %} >> >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> >> Code sizes for `Hello World`, `-Xcomp`: >> >> >> # Before >> tier1 nmethod code size : 426208 bytes >> tier2 nmethod code size : 462880 bytes >> tier3 nmethod code size : 889992 bytes >> tier4 nmethod code size : 1244448 bytes >> >> # After >> tier1 nmethod code size : 425768 bytes (-0.1%) >> tier2 nmethod code size : 462400 bytes (-0.1%) >> tier3 nmethod code size : 882072 bytes (-0.8%) >> tier4 nmethod code size : 1236448 bytes (-0.6%) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Revert "Just do checked_cast" > > This reverts commit 3f94218b46b6b0492ffcc24404b7bb5546b3318a. Okay. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17343#issuecomment-1887735098 From cjplummer at openjdk.org Thu Jan 11 18:35:35 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 11 Jan 2024 18:35:35 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v12] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 15:46:45 GMT, Lei Zaakjyu wrote: >> 8234502: Merge GenCollectedHeap and SerialHeap > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > restore and rename 'GenCollectedHeap' to 'SerialHeap' Changes requested by cjplummer (Reviewer). 
src/hotspot/share/gc/shared/vmStructs_gc.hpp line 31: > 29: #include "gc/shared/cardTable.hpp" > 30: #include "gc/shared/collectedHeap.hpp" > 31: #include "gc/shared/genCollectedHeap.hpp" vmstructs purpose is to support SA. Thus renaming should be done here instead of deletion. You need to restore this line and rename genCollectedHeap.hpp -> serialHeap.hpp. src/hotspot/share/gc/shared/vmStructs_gc.hpp line 114: > 112: nonstatic_field(GenCollectedHeap, _young_gen, Generation*) \ > 113: nonstatic_field(GenCollectedHeap, _old_gen, Generation*) \ > 114: \ You need to restore these lines and rename GenCollectedHeap -> SerialHeap. src/hotspot/share/gc/shared/vmStructs_gc.hpp line 149: > 147: \ > 148: declare_toplevel_type(CollectedHeap) \ > 149: declare_type(GenCollectedHeap, CollectedHeap) \ You need to restore this line and rename GenCollectedHeap -> SerialHeap. src/hotspot/share/gc/shared/vmStructs_gc.hpp line 180: > 178: declare_toplevel_type(DefNewGeneration*) \ > 179: declare_toplevel_type(GenCollectedHeap*) \ > 180: declare_toplevel_type(Generation*) \ You need to restore these lines and rename GenCollectedHeap -> SerialHeap. I'm not sure why DefNewGeneration and Generation were deleted. I assume they are still present. 
------------- PR Review: https://git.openjdk.org/jdk/pull/16927#pullrequestreview-1816257350 PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1449251031 PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1449248757 PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1449249061 PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1449249834 From kvn at openjdk.org Thu Jan 11 19:34:22 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 11 Jan 2024 19:34:22 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates [v3] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 09:17:35 GMT, Aleksey Shipilev wrote: >> We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. >> >> There are a few interesting conversions along the way: >> 1. `intptr_t` -> `uint32_t` (this method) >> 2. `uint32_t` -> `int32_t` (argument conversion for `movl`) >> 3. `int32_t` -> `uint32_t` (in `emit_int32`) >> >> I believe these are safe after `is_uimm32` check, but please check (sic) me on this. >> >> Note that x86_64 matcher already does similar thing for immediates: >> >> >> // Long Immediate 32-bit unsigned >> operand immUL32() >> %{ >> predicate(n->get_long() == (unsigned int) (n->get_long())); >> match(ConL); >> ... >> %} >> >> instruct loadConUL32(rRegL dst, immUL32 src) >> %{ >> ... 
>> format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} >> ins_encode %{ >> __ movl($dst$$Register, $src$$constant); >> %} >> %} >> >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> >> Code sizes for `Hello World`, `-Xcomp`: >> >> >> # Before >> tier1 nmethod code size : 426208 bytes >> tier2 nmethod code size : 462880 bytes >> tier3 nmethod code size : 889992 bytes >> tier4 nmethod code size : 1244448 bytes >> >> # After >> tier1 nmethod code size : 425768 bytes (-0.1%) >> tier2 nmethod code size : 462400 bytes (-0.1%) >> tier3 nmethod code size : 882072 bytes (-0.8%) >> tier4 nmethod code size : 1236448 bytes (-0.6%) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Revert "Just do checked_cast" > > This reverts commit 3f94218b46b6b0492ffcc24404b7bb5546b3318a. Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17343#pullrequestreview-1816430142 From rkennke at openjdk.org Thu Jan 11 19:46:28 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 11 Jan 2024 19:46:28 GMT Subject: RFR: JDK-8314890: Reduce number of loads for Klass decoding in static code [v12] In-Reply-To: References: Message-ID: <8AeUTRxo3TXnyHL098Hif37yZEuq9uX-MhGufePs5r4=.d888119c-02f0-4558-96a2-780e6d9eeddf@github.com> On Wed, 15 Nov 2023 14:50:48 GMT, Thomas Stuefe wrote: >> Small change that reduces the number of loads generated by the C++ compiler for a narrow Klass decoding operation (`CompressedKlassPointers::decode_xxx()`. >> >> Stock: three loads (with two probably sharing a cache line) - UseCompressedClassPointers, encoding base and shift. 
>> >> >> 8b7b62: 48 8d 05 7f 1b c3 00 lea 0xc31b7f(%rip),%rax # 14e96e8 >> 8b7b69: 0f b6 00 movzbl (%rax),%eax >> 8b7b6c: 84 c0 test %al,%al >> 8b7b6e: 0f 84 9c 00 00 00 je 8b7c10 <_ZN10HeapRegion14object_iterateEP13ObjectClosure+0x260> >> 8b7b74: 48 8d 15 05 62 c6 00 lea 0xc66205(%rip),%rdx # 151dd80 <_ZN23CompressedKlassPointers6_shiftE> >> 8b7b7b: 8b 7b 08 mov 0x8(%rbx),%edi >> 8b7b7e: 8b 0a mov (%rdx),%ecx >> 8b7b80: 48 8d 15 01 62 c6 00 lea 0xc66201(%rip),%rdx # 151dd88 <_ZN23CompressedKlassPointers5_baseE> >> 8b7b87: 48 d3 e7 shl %cl,%rdi >> 8b7b8a: 48 03 3a add (%rdx),%rdi >> >> >> Patched: one load loads all three. Since shift occupies the lowest 8 bits, compiled code uses 8bit register; ditto the UseCompressedOops flag. >> >> >> 8ba302: 48 8d 05 97 9c c2 00 lea 0xc29c97(%rip),%rax # 14e3fa0 <_ZN23CompressedKlassPointers6_comboE> >> 8ba309: 48 8b 08 mov (%rax),%rcx >> 8ba30c: f6 c5 01 test $0x1,%ch # use compressed klass pointers? >> 8ba30f: 0f 84 9b 00 00 00 je 8ba3b0 <_ZN10HeapRegion14object_iterateEP13ObjectClosure+0x260> >> 8ba315: 8b 7b 08 mov 0x8(%rbx),%edi >> 8ba318: 48 d3 e7 shl %cl,%rdi # shift >> 8ba31b: 66 31 c9 xor %cx,%cx # zero out lower 16 bits of base >> 8ba31e: 48 01 cf add %rcx,%rdi # add base >> 8ba321: 8b 4f 08 mov 0x8(%rdi),%ecx >> >> --- >> >> Performance measurements: >> >> G1, doing a full GC over a heap filled with 256 million live j.l.Object instances. >> >> I see a reduction of Full Pause times between 1.2% and 5%. I am unsure how reliable these numbers are since, despite my efforts (running tests on isolated CPUs etc.), the standard deviation was quite high at ±4%. Still, in general, numbers seemed to go down rather than up. >> >> --- >> >> Future extensions: >> >> This patch uses the fact that the encoding base is aligned to metaspace reser...
> > Thomas Stuefe has updated the pull request incrementally with three additional commits since the last revision: > > - Update src/hotspot/share/oops/compressedKlass.cpp > > Co-authored-by: Aleksey Shipilëv > - Update src/hotspot/share/oops/compressedKlass.cpp > > Co-authored-by: Aleksey Shipilëv > - Update src/hotspot/share/oops/compressedKlass.hpp > > Co-authored-by: Aleksey Shipilëv Looks good to me! Thank you! (And regarding the bitfield discussion - I *would* like to handle object header fields as bitfield, eventually. Not sure if it makes sense though or if we ever get there...) ------------- Marked as reviewed by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15389#pullrequestreview-1816460435 From cjplummer at openjdk.org Thu Jan 11 19:58:42 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 11 Jan 2024 19:58:42 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v12] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 18:31:49 GMT, Chris Plummer wrote: >> Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: >> >> restore and rename 'GenCollectedHeap' to 'SerialHeap' > > src/hotspot/share/gc/shared/vmStructs_gc.hpp line 31: > >> 29: #include "gc/shared/cardTable.hpp" >> 30: #include "gc/shared/collectedHeap.hpp" >> 31: #include "gc/shared/genCollectedHeap.hpp" > > vmstructs purpose is to support SA. Thus renaming should be done here instead of deletion. You need to restore this line and rename genCollectedHeap.hpp -> serialHeap.hpp. It looks like the include of serialHeap.hpp is already taken care of by vmStructs_serial.hpp, so deleting this line looks like it is ok. > src/hotspot/share/gc/shared/vmStructs_gc.hpp line 114: > >> 112: nonstatic_field(GenCollectedHeap, _young_gen, Generation*) \ >> 113: nonstatic_field(GenCollectedHeap, _old_gen, Generation*) \ >> 114: \ > > You need to restore these lines and rename GenCollectedHeap -> SerialHeap.
Nevermind. I see now that this moved to vmStructs_serial.hpp > src/hotspot/share/gc/shared/vmStructs_gc.hpp line 149: > >> 147: \ >> 148: declare_toplevel_type(CollectedHeap) \ >> 149: declare_type(GenCollectedHeap, CollectedHeap) \ > > You need to restore this line and rename GenCollectedHeap -> SerialHeap. Nevermind. I see now that this moved to vmStructs_serial.hpp ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1449324632 PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1449318718 PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1449319113 From cjplummer at openjdk.org Thu Jan 11 21:17:35 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 11 Jan 2024 21:17:35 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v12] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 18:30:33 GMT, Chris Plummer wrote: >> Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: >> >> restore and rename 'GenCollectedHeap' to 'SerialHeap' > > src/hotspot/share/gc/shared/vmStructs_gc.hpp line 180: > >> 178: declare_toplevel_type(DefNewGeneration*) \ >> 179: declare_toplevel_type(GenCollectedHeap*) \ >> 180: declare_toplevel_type(Generation*) \ > > You need to restore these lines and rename GenCollectedHeap -> SerialHeap. I'm not sure why DefNewGeneration and Generation were deleted. I assume they are still present. It looks like all 3 of these already exist elsewhere, thus their removal is ok. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1449402502 From cjplummer at openjdk.org Thu Jan 11 21:28:32 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 11 Jan 2024 21:28:32 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v12] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 15:46:45 GMT, Lei Zaakjyu wrote: >> 8234502: Merge GenCollectedHeap and SerialHeap > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > restore and rename 'GenCollectedHeap' to 'SerialHeap' Sorry about the false alarms. Not noticing that vmStructs_serial.hpp was also updated threw me off. The changes look fine. I ran the SA tests locally and they passed. ------------- Marked as reviewed by cjplummer (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16927#pullrequestreview-1816739658 From sgibbons at openjdk.org Thu Jan 11 23:06:32 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 11 Jan 2024 23:06:32 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: > Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. 
The benchmark numbers: > > > Benchmark Score Latest > StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x > StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x > StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x > StringIndexOf.constantPattern 9.361 11.906 1.271872663x > StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x > StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x > StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x > StringIndexOf.success 9.186 9.713 1.057369911x > StringIndexOf.successBig 14.341 46.343 3.231504079x > StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x > StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x > StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x > StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x > StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x > StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x > StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x > StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: - Merge branch 'openjdk:master' into indexof - Merge branch 'openjdk:master' into indexof - Addressing review comments. - Fix for JDK-8321599 - Support UU IndexOf - Only use optimization when EnableX86ECoreOpts is true - Fix whitespace - Merge branch 'openjdk:master' into indexof - Comments; added exhaustive-ish test - Subtracting 0x10 twice. - ... 
and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2 ------------- Changes: https://git.openjdk.org/jdk/pull/16753/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=06 Stats: 3060 lines in 14 files changed: 2918 ins; 7 del; 135 mod Patch: https://git.openjdk.org/jdk/pull/16753.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16753/head:pull/16753 PR: https://git.openjdk.org/jdk/pull/16753 From sspitsyn at openjdk.org Fri Jan 12 00:54:59 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 12 Jan 2024 00:54:59 GMT Subject: [jdk22] RFR: 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf Message-ID: Hi all, This pull request contains a clean backport of commit [2806adee](https://github.com/openjdk/jdk/commit/2806adee2d8cca6bc215f285888631799bd02eac) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Serguei Spitsyn on 10 Jan 2024 and was reviewed by Alex Menkov and Chris Plummer. Thanks! 
------------- Commit messages: - Backport 2806adee2d8cca6bc215f285888631799bd02eac Changes: https://git.openjdk.org/jdk22/pull/64/files Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=64&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321685 Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk22/pull/64.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/64/head:pull/64 PR: https://git.openjdk.org/jdk22/pull/64 From sspitsyn at openjdk.org Fri Jan 12 01:03:19 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 12 Jan 2024 01:03:19 GMT Subject: [jdk22] RFR: 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI Message-ID: <7qecwb5Uue8trLTpsy7TWsvuHwc3bGky6qIYR0vNfws=.2238f1ab-b619-4ea5-abc0-d1b74011ffa5@github.com> Hi all, This pull request contains a clean backport of commit [aff659aa](https://github.com/openjdk/jdk/commit/aff659aaf7c73ff8eb903fd3e426e1b42ea6d95a) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Serguei Spitsyn on 21 Dec 2023 and was reviewed by David Holmes and Alan Bateman. Thanks! ------------- Commit messages: - Backport aff659aaf7c73ff8eb903fd3e426e1b42ea6d95a Changes: https://git.openjdk.org/jdk22/pull/65/files Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=65&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322538 Stats: 12 lines in 1 file changed: 0 ins; 12 del; 0 mod Patch: https://git.openjdk.org/jdk22/pull/65.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/65/head:pull/65 PR: https://git.openjdk.org/jdk22/pull/65 From ysr at openjdk.org Fri Jan 12 01:40:22 2024 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Fri, 12 Jan 2024 01:40:22 GMT Subject: RFR: 8323297: Fix incorrect placement of precompiled.hpp include lines In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 14:55:45 GMT, Stefan Karlsson wrote: > There are a few files that have include lines before the precompiled.hpp include line. I propose that we fix this. > > Testing: I'll let this run through GHA and Oracle's tier1 to see that this still compiles. Marked as reviewed by ysr (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17326#pullrequestreview-1817183765 From amenkov at openjdk.org Fri Jan 12 02:41:18 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 12 Jan 2024 02:41:18 GMT Subject: [jdk22] RFR: 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf In-Reply-To: References: Message-ID: <_bJ4O96kz_0gwSbLUYy83QETN6S7XxZo26A11-BlPIE=.57aa9185-637b-46ea-951a-585a1a3b21fd@github.com> On Fri, 12 Jan 2024 00:47:20 GMT, Serguei Spitsyn wrote: > Hi all, > > This pull request contains a clean backport of commit [2806adee](https://github.com/openjdk/jdk/commit/2806adee2d8cca6bc215f285888631799bd02eac) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Serguei Spitsyn on 10 Jan 2024 and was reviewed by Alex Menkov and Chris Plummer. > > Thanks! Marked as reviewed by amenkov (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk22/pull/64#pullrequestreview-1817273736 From duke at openjdk.org Fri Jan 12 05:17:30 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Fri, 12 Jan 2024 05:17:30 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v12] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 21:25:31 GMT, Chris Plummer wrote: > Sorry about the false alarms. Not noticing that vmStructs_serial.hpp was also updated threw me off. The changes look fine. I ran the SA tests locally and they passed. It's alright. btw, thanks for the review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16927#issuecomment-1888449188 From cjplummer at openjdk.org Fri Jan 12 05:21:32 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 12 Jan 2024 05:21:32 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v12] In-Reply-To: References: Message-ID: On Fri, 12 Jan 2024 05:14:54 GMT, Lei Zaakjyu wrote: > It's alright. btw, thanks for the review. You're welcome. And just to be clear, I'm okaying just the SA changes. You'll still need another reviewer for the hotspot changes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16927#issuecomment-1888451834 From tanksherman27 at gmail.com Fri Jan 12 05:24:09 2024 From: tanksherman27 at gmail.com (Julian Waters) Date: Fri, 12 Jan 2024 13:24:09 +0800 Subject: static_cast<void>(0) vs do while Message-ID: Hi all, In my personal fork of HotSpot I have the following commit https://github.com/TheShermanTanker/jdk/commit/54131b70d40a88ab4176d23821f4c32044c0043d which replaces some occurrences of do while with a discarding static_cast (static_cast<void>(0) is a nop), to avoid inefficiencies in the compiled code for debug mode, when optimizations are turned off (Namely a compare and jump back to the start of the loop on a condition that is always false). Is this minor optimization worth committing upstream to HotSpot? It should also mean that the compiled code is clearer when the methods containing asserts are disassembled.
best regards, Julian From dholmes at openjdk.org Fri Jan 12 05:49:18 2024 From: dholmes at openjdk.org (David Holmes) Date: Fri, 12 Jan 2024 05:49:18 GMT Subject: [jdk22] RFR: 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI In-Reply-To: <7qecwb5Uue8trLTpsy7TWsvuHwc3bGky6qIYR0vNfws=.2238f1ab-b619-4ea5-abc0-d1b74011ffa5@github.com> References: <7qecwb5Uue8trLTpsy7TWsvuHwc3bGky6qIYR0vNfws=.2238f1ab-b619-4ea5-abc0-d1b74011ffa5@github.com> Message-ID: On Fri, 12 Jan 2024 00:53:47 GMT, Serguei Spitsyn wrote: > Hi all, > > This pull request contains a clean backport of commit [aff659aa](https://github.com/openjdk/jdk/commit/aff659aaf7c73ff8eb903fd3e426e1b42ea6d95a) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Serguei Spitsyn on 21 Dec 2023 and was reviewed by David Holmes and Alan Bateman. > > Thanks! Looks good. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk22/pull/65#pullrequestreview-1817439971 From duke at openjdk.org Fri Jan 12 06:32:27 2024 From: duke at openjdk.org (Liming Liu) Date: Fri, 12 Jan 2024 06:32:27 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21] In-Reply-To: References: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> Message-ID: On Thu, 11 Jan 2024 07:53:14 GMT, Thomas Stuefe wrote: >> Liming Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Use pthread instead > > src/hotspot/share/runtime/os.cpp line 2118: > >> 2116: } >> 2117: } >> 2118: > > I suggest a slightly different flow, similar to how we do things in other areas: > > os.hpp > > > private: > bool pd_pretouch_memory(..); > public: > void pretouch_memory(..); > > > os.cpp > > void os::pretouch_memory(..) { > // Ask platform first > if (pd_pretouch_memory(..)) { > return; > } > ... 
do pretouching by touching > } > > > then provide a pd_pretouch for every platform; let other platforms be just a noop returning false, on Linux - if THPs are enabled and so forth, do the madvise and return true. > > One function less in the os namespace, and we don't call back from a pd_... function into a generic function which is unusual. Linux requires a different page size for touching when THP is enabled. Should I change page_size to a reference, or return the page size for touching rather than false? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1449901288 From kbarrett at openjdk.org Fri Jan 12 06:35:25 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 12 Jan 2024 06:35:25 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 14:28:20 GMT, Martin Doerr wrote: > Thanks! We may switch to clang 14 on MacOS at some point of time, but it's better to have that disentangled. Some people build JDK 11 and 23 on the same machine and that is easier if they don't have to switch Xcode. I think the minimum clang version should not be greater than what's provided by the minimum Open XL C/C++ version. If the aix-ppc port only requires Open XL C/C++ 17.1.1 then that's clang 13. If the aix-ppc port were to instead jump further forward, to 17.1.2, then that's clang 15. I've asked the aix-ppc folks if requiring 17.1.2 would be okay, but haven't heard back yet.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1888507915 From kim.barrett at oracle.com Fri Jan 12 06:50:01 2024 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 12 Jan 2024 06:50:01 +0000 Subject: static_cast<void>(0) vs do while In-Reply-To: References: Message-ID: > On Jan 12, 2024, at 12:24 AM, Julian Waters wrote: > > Hi all, > > In my personal fork of HotSpot I have the following commit > https://github.com/TheShermanTanker/jdk/commit/54131b70d40a88ab4176d23821f4c32044c0043d > which replaces some occurrences of do while with a discarding > static_cast (static_cast<void>(0) is a nop), to avoid inefficiencies > in the compiled code for debug mode, when optimizations are turned off > (Namely a compare and jump back to the start of the loop on a > condition that is always false). Is this minor optimization worth > committing upstream to HotSpot? It should also mean that the compiled > code is clearer when the methods containing asserts are disassembled. No, I don't think we should do this. `do { ... } while (false)` is a well-known idiom. Uglifying things this way does not seem helpful. The proposed change would also make assert and friends expand to multiple statements instead of one. I've no idea what the impact of that might be, but suspect there are cases where it could change the meaning of code (possibly into a compile-time error, or worse, perhaps even silently). At the very least it would violate expectations.
From duke at openjdk.org Fri Jan 12 07:24:28 2024 From: duke at openjdk.org (Liming Liu) Date: Fri, 12 Jan 2024 07:24:28 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21] In-Reply-To: <_BwiPEQGivAMBltfX2w0QT51j62KH3uFc2mJMswHTJQ=.3b07eabe-c933-4fb3-9409-4b0fcb6e0e14@github.com> References: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> <_BwiPEQGivAMBltfX2w0QT51j62KH3uFc2mJMswHTJQ=.3b07eabe-c933-4fb3-9409-4b0fcb6e0e14@github.com> Message-ID: On Thu, 11 Jan 2024 08:32:28 GMT, Thomas Stuefe wrote: >> Liming Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Use pthread instead > > test/hotspot/jtreg/runtime/os/TestTransparentHugePageUsage.java line 96: > >> 94: .map(e -> Long.valueOf(e.getKey().substring(e.getValue().start(1), e.getValue().end(1)))); >> 95: if (!usage.isPresent()) throw new RuntimeException("The usage of THP was not found."); >> 96: if (usage.get() == 0) throw new RuntimeException("The usage of THP should not be zero."); > > The effect we would see without your patch would be small pages that are then converted to huge pages by khugepaged in its own time, right? So, maybe test that AnonHugePages == Size for the heap VMA? Unluckily, the size of AnonHugePages is still smaller than the heap size. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1449961373 From mbaesken at openjdk.org Fri Jan 12 07:52:25 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 12 Jan 2024 07:52:25 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: On Fri, 12 Jan 2024 06:32:34 GMT, Kim Barrett wrote: > > Thanks! We may switch to clang 14 on MacOS at some point of time, but it's better to have that disentangled.
Some people build JDK 11 and 23 on the same machine and that is easier if they don't have to switch Xcode. > > I think the minimum clang version should not be greater than what?s provided by the minimum Open XL C/C++ version. > > If the aix-ppc port only requires Open XL C/C++ 17.1.1 then that?s clang 13. If the aix-ppc port were to instead jump further forward, to 17.1.2, then that?s clang 15. > > I've asked the aix-ppc folks if requiring 17.1.2 would be okay, but haven't heard back yet. We at SAP use and document xlC 17.1.1.4 for jdk22 (use the same for jdk23) https://wiki.openjdk.org/display/Build/Supported+Build+Platforms version 17.1.1.4 is already clang15 (at least that's what the compiler output is telling me) /opt/IBM/openxlC/17.1.1/bin/ibm-clang++_r -v IBM Open XL C/C++ for AIX 17.1.1 (5725-C72, 5765-J18), version 17.1.1.4, clang version 15.0.0 (build ca7115e) Target: powerpc-ibm-aix7.2.0.0 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1888583559 From duke at openjdk.org Fri Jan 12 08:20:03 2024 From: duke at openjdk.org (Liming Liu) Date: Fri, 12 Jan 2024 08:20:03 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v22] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
> Kernel | -XX:-TransparentHugePages | -XX:+TransparentHugePages
>        | Unpatched | Patched       | Unpatched | Patched
> 4.18   | 11.30     | 11.30         | 0.25      | 0.25
> 5.13   | 0.22      | 0.22          | 3.42      | 3.42
> 6.1    | 0.27      | 0.33          | 3.54      | 0.33
Liming Liu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 29 additional commits since the last revision: - Use serial GC for the test - Use pthread instead - Remove the deletion - Fix type errors - Try to add a thread to use memory - Replace to char* when type casting - Fix the typo - Use char* instead - Fix the function arguments - Try to add a testcase to cover concurrent pretouch - ... and 19 more: https://git.openjdk.org/jdk/compare/01f5a923...6447a70e ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/f974a393..6447a70e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=21 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=20-21 Stats: 29318 lines in 1043 files changed: 19415 ins; 5250 del; 4653 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From duke at openjdk.org Fri Jan 12 08:20:03 2024 From: duke at openjdk.org (Liming Liu) Date: Fri, 12 Jan 2024 08:20:03 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v22] In-Reply-To: References: Message-ID: On Fri, 20 Oct 2023 05:41:35 GMT, Liming Liu wrote: >> src/hotspot/os/linux/os_linux.cpp line 2911: >> >>> 2909: if (::madvise(first, len, MADV_POPULATE_WRITE) == -1) { >>> 2910: int err = errno; >>> 2911: if (err == EINVAL) { // Not supported >> >> Would be nice to avoid repeated syscalls to madvise if this fails once; no reason to try again, then. > > I tested the performance of this patch on kernel 4.18 and 5.13, and found the repeat calls have no impact. So I would not change anything about this. 
It seems that a thread-local variable is cheap for recording whether madvise once failed, but os::Linux does not have a thread-local variable yet. Is it okay to have a thread-local variable for this? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1450017874 From duke at openjdk.org Fri Jan 12 08:56:29 2024 From: duke at openjdk.org (Liming Liu) Date: Fri, 12 Jan 2024 08:56:29 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v23] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
> Kernel | -XX:-TransparentHugePages | -XX:+TransparentHugePages
>        | Unpatched | Patched       | Unpatched | Patched
> 4.18   | 11.30     | 11.30         | 0.25      | 0.25
> 5.13   | 0.22      | 0.22          | 3.42      | 3.42
> 6.1    | 0.27      | 0.33          | 3.54      | 0.33
Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Add an option to be able not to use MADV_POPULATE_WRITE ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/6447a70e..89ae16bb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=22 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=21-22 Stats: 17 lines in 2 files changed: 4 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From duke at openjdk.org Fri Jan 12 08:58:49 2024 From: duke at openjdk.org (Liming Liu) Date: Fri, 12 Jan 2024 08:58:49 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v24] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
> Kernel | -XX:-TransparentHugePages | -XX:+TransparentHugePages
>        | Unpatched | Patched       | Unpatched | Patched
> 4.18   | 11.30     | 11.30         | 0.25      | 0.25
> 5.13   | 0.22      | 0.22          | 3.42      | 3.42
> 6.1    | 0.27      | 0.33          | 3.54      | 0.33
Liming Liu has updated the pull request incrementally with two additional commits since the last revision: - untabify - Correct the fallback case ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/89ae16bb..3469c0cc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=23 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=22-23 Stats: 7 lines in 1 file changed: 2 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From stefank at openjdk.org Fri Jan 12 09:00:28 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 12 Jan 2024 09:00:28 GMT Subject: RFR: 8323297: Fix incorrect placement of precompiled.hpp include lines In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 14:55:45 GMT, Stefan Karlsson wrote: > There are a few files that have include lines before the precompiled.hpp include line. I propose that we fix this. > > Testing: I'll let this run through GHA and Oracle's tier1 to see that this still compiles. Thanks for all the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17326#issuecomment-1888686373 From stefank at openjdk.org Fri Jan 12 09:00:29 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 12 Jan 2024 09:00:29 GMT Subject: Integrated: 8323297: Fix incorrect placement of precompiled.hpp include lines In-Reply-To: References: Message-ID: On Tue, 9 Jan 2024 14:55:45 GMT, Stefan Karlsson wrote: > There are a few files that have include lines before the precompiled.hpp include line. I propose that we fix this. > > Testing: I'll let this run through GHA and Oracle's tier1 to see that this still compiles. This pull request has now been integrated. 
Changeset: 7c3a39f4 Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/7c3a39f400d97a443be146d928f85aa850d3b5cb Stats: 16 lines in 5 files changed: 7 ins; 9 del; 0 mod 8323297: Fix incorrect placement of precompiled.hpp include lines Reviewed-by: kbarrett, dholmes, shade, ysr ------------- PR: https://git.openjdk.org/jdk/pull/17326 From duke at openjdk.org Fri Jan 12 09:03:48 2024 From: duke at openjdk.org (Liming Liu) Date: Fri, 12 Jan 2024 09:03:48 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v25] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
> Kernel | -XX:-TransparentHugePages | -XX:+TransparentHugePages
>        | Unpatched | Patched       | Unpatched | Patched
> 4.18   | 11.30     | 11.30         | 0.25      | 0.25
> 5.13   | 0.22      | 0.22          | 3.42      | 3.42
> 6.1    | 0.27      | 0.33          | 3.54      | 0.33
Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Reduce a return ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/3469c0cc..c66f2dfd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=24 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=23-24 Stats: 6 lines in 1 file changed: 0 ins; 2 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From ayang at openjdk.org Fri Jan 12 09:10:41 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 12 Jan 2024 09:10:41 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v12] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 15:46:45 GMT, Lei Zaakjyu wrote: >> 8234502: Merge GenCollectedHeap and SerialHeap > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > restore and rename 'GenCollectedHeap' to 'SerialHeap' Tier1-6 pass. I used `git diff --color-moved=dimmed_zebra ` to check the change inside hotspot and it's mostly code-moving, as expected. There are some still format issues (header include order and broken indentation), which can probably be fixed in followup PRs. ------------- Marked as reviewed by ayang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16927#pullrequestreview-1817728332 From duke at openjdk.org Fri Jan 12 09:19:27 2024 From: duke at openjdk.org (Liming Liu) Date: Fri, 12 Jan 2024 09:19:27 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21] In-Reply-To: References: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> Message-ID: On Thu, 11 Jan 2024 08:17:55 GMT, Thomas Stuefe wrote: >> Liming Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Use pthread instead > > test/hotspot/jtreg/runtime/os/TestTransparentHugePageUsage.java line 46: > >> 44: import java.util.regex.Matcher; >> 45: import java.util.regex.Pattern; >> 46: import jdk.test.lib.process.ProcessTools; > > Please add a comment describing what the test does. E.g. "Tests checks that a pretouched java heap appears to use THPs by checking AnonHugePages in smaps". Feel free to find a better formulation. > > Does the test fail without madvise, at least sporadically? > > So I wonder whether it would be better to run with SerialGC, to limit the pretouching to one thread. That would increase the time window needed for pretouching and give us a higher chance to observe those small pages that appear before khugepaged gets around merging them into THPs. It almost certainly fails with 512MB huge pages and without madvise when vm_page_size is 64KB, but does not likely fail with 2MB huge pages. I would change the check of usage from 0 to 524288 (512MB in KB), and then it would fail with 2MB huge pages. 
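The threshold check discussed above (AnonHugePages usage compared against 524288 kB, i.e. 512MB, instead of 0) can be sketched as follows. The actual jtreg test is written in Java and looks at the heap VMA; this C++ fragment only illustrates the idea on smaps-style text, and `sum_anon_huge_pages_kb` is a hypothetical helper that simply sums every `AnonHugePages:` entry it finds.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Sums the values of all "AnonHugePages: <n> kB" lines in smaps-style text.
// Assumption: the caller has already narrowed the text to the mapping(s)
// of interest; this sketch does not identify the Java heap by itself.
long long sum_anon_huge_pages_kb(const std::string& smaps_text) {
  std::istringstream in(smaps_text);
  std::string line;
  long long total_kb = 0;
  while (std::getline(in, line)) {
    long long kb = 0;
    // Whitespace in the format matches any run of blanks after the colon.
    if (std::sscanf(line.c_str(), "AnonHugePages: %lld kB", &kb) == 1) {
      total_kb += kb;
    }
  }
  return total_kb;
}
```

A caller would then compare the result against the chosen lower bound, for example `sum_anon_huge_pages_kb(text) >= 524288` for the 512MB case mentioned above.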
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1450105546 From stefank at openjdk.org Fri Jan 12 09:26:19 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 12 Jan 2024 09:26:19 GMT Subject: RFR: 8322957: Generational ZGC: Relocation selection must join the STS [v2] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 10:03:44 GMT, Stefan Karlsson wrote: >> The concurrent ZGC threads don't automatically participate in the safepoint protocol, which means that they can run concurrently with safepoint VM Operations. Instead they use other means to hook into the safepoint protocol whenever they need to make changes that could be racing with the various VM Operations. The most common way is to join the "suspendible thread set". For details around this see `SafepointSynchronize::begin` and the call to `Universe::heap()->safepoint_synchronize_begin()`. >> >> It turns out that the relocation selection phase was updated to use a call oop_iterate, to modify oops of some of the objects. This was done without having the GC threads join the suspendible thread set. This means that various VM Operations could run concurrently with the oop_iterate. This caused the failure described in JDK-8322957: The JFR Leak Profiler modified the object header bits, while the GC's oop_iterate function used the same bits to determine if the oop iteration over an object should be skipped. This lead to objects not being modified as they were supposed to, which lead to broken oops and asserts. >> >> The fix is quite small and could be limited to the lines added to [src/hotspot/share/gc/z/zRelocationSet.cpp](https://github.com/openjdk/jdk/compare/master...stefank:jdk:8322957_sts_with_relocation_selection?expand=1#diff-883b7a72f757c1c5331769ad4a5c763335d0267ee33a0bc06896fa16d89ea58f). However, to lower the risk of reintroducing a bug like this again, we've added extra verification code. 
Some of the infrastructure to get the correct verification is placed outside of the GC code, and that's why this PR is sent to the hotspot-dev list. >> >> This has been tested with the reproducer of the original bug + tier1-7 on linux-x64-debug. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Fix release builds Tier1-7 passes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17368#issuecomment-1888729354 From kbarrett at openjdk.org Fri Jan 12 09:27:22 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 12 Jan 2024 09:27:22 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: On Fri, 12 Jan 2024 07:49:17 GMT, Matthias Baesken wrote: > We at SAP use and document xlC 17.1.1.4 for jdk22 (use the same for jdk23) https://wiki.openjdk.org/display/Build/Supported+Build+Platforms > > version 17.1.1.4 is already clang15 (at least that's what the compiler output is telling me) > > /opt/IBM/openxlC/17.1.1/bin/ibm-clang++_r -v IBM Open XL C/C++ for AIX 17.1.1 (5725-C72, 5765-J18), version 17.1.1.4, clang version 15.0.0 (build ca7115e) Target: powerpc-ibm-aix7.2.0.0 My mistake, you are correct. 17.1.0 seems to be clang 13, 17.1.1 seems to be clang 15, and 17.1.2 seems to be clang 17. All of those are based on the documented value of the `__VERSION__` macro's string value. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1888731423 From duke at openjdk.org Fri Jan 12 09:30:53 2024 From: duke at openjdk.org (Liming Liu) Date: Fri, 12 Jan 2024 09:30:53 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v26] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). 
> > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
> Kernel | -XX:-TransparentHugePages | -XX:+TransparentHugePages
>        | Unpatched | Patched       | Unpatched | Patched
> 4.18   | 11.30     | 11.30         | 0.25      | 0.25
> 5.13   | 0.22      | 0.22          | 3.42      | 3.42
> 6.1    | 0.27      | 0.33          | 3.54      | 0.33
Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Make it fail with 2MB huge pages 2MB huge pages are easier to form. After becoming shredded, there would still be a small number of 2MB huge pages. Thus this commit enlarges the boundary of usage from 0 to the half of the heap. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/c66f2dfd..97fd9c60 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=25 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=24-25 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From stefank at openjdk.org Fri Jan 12 09:35:27 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 12 Jan 2024 09:35:27 GMT Subject: Integrated: 8322957: Generational ZGC: Relocation selection must join the STS In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 08:49:38 GMT, Stefan Karlsson wrote: > The concurrent ZGC threads don't automatically participate in the safepoint protocol, which means that they can run concurrently with safepoint VM Operations. Instead they use other means to hook into the safepoint protocol whenever they need to make changes that could be racing with the various VM Operations. The most common way is to join the "suspendible thread set". For details around this see `SafepointSynchronize::begin` and the call to `Universe::heap()->safepoint_synchronize_begin()`. > > It turns out that the relocation selection phase was updated to use a call oop_iterate, to modify oops of some of the objects. This was done without having the GC threads join the suspendible thread set. This means that various VM Operations could run concurrently with the oop_iterate. 
This caused the failure described in JDK-8322957: The JFR Leak Profiler modified the object header bits, while the GC's oop_iterate function used the same bits to determine if the oop iteration over an object should be skipped. This lead to objects not being modified as they were supposed to, which lead to broken oops and asserts. > > The fix is quite small and could be limited to the lines added to [src/hotspot/share/gc/z/zRelocationSet.cpp](https://github.com/openjdk/jdk/compare/master...stefank:jdk:8322957_sts_with_relocation_selection?expand=1#diff-883b7a72f757c1c5331769ad4a5c763335d0267ee33a0bc06896fa16d89ea58f). However, to lower the risk of reintroducing a bug like this again, we've added extra verification code. Some of the infrastructure to get the correct verification is placed outside of the GC code, and that's why this PR is sent to the hotspot-dev list. > > This has been tested with the reproducer of the original bug + tier1-7 on linux-x64-debug. This pull request has now been integrated. 
Changeset: ba23025c Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/ba23025cd8a9c1af37afea6444ce5ea2ff41e5af Stats: 168 lines in 14 files changed: 127 ins; 20 del; 21 mod 8322957: Generational ZGC: Relocation selection must join the STS Co-authored-by: Axel Boldt-Christmas Reviewed-by: eosterlund, aboldtch ------------- PR: https://git.openjdk.org/jdk/pull/17368 From stuefe at openjdk.org Fri Jan 12 09:50:30 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 12 Jan 2024 09:50:30 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21] In-Reply-To: References: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> Message-ID: <4T3a0kGRUX3kJUTNdl7CDzFEVsTJLm9QOK1wnDyhCWI=.30916e31-9323-459b-af38-ac1b39b784e5@github.com> On Fri, 12 Jan 2024 09:15:47 GMT, Liming Liu wrote: >> test/hotspot/jtreg/runtime/os/TestTransparentHugePageUsage.java line 46: >> >>> 44: import java.util.regex.Matcher; >>> 45: import java.util.regex.Pattern; >>> 46: import jdk.test.lib.process.ProcessTools; >> >> Please add a comment describing what the test does. E.g. "Tests checks that a pretouched java heap appears to use THPs by checking AnonHugePages in smaps". Feel free to find a better formulation. >> >> Does the test fail without madvise, at least sporadically? >> >> So I wonder whether it would be better to run with SerialGC, to limit the pretouching to one thread. That would increase the time window needed for pretouching and give us a higher chance to observe those small pages that appear before khugepaged gets around merging them into THPs. > > It almost certainly fails with 512MB huge pages and without madvise when vm_page_size is 64KB, but does not likely fail with 2MB huge pages. I would change the check of usage from 0 to 524288 (512MB in KB), and then it would fail with 2MB huge pages. That is a good idea. 
We don't need perfection, just a test that has a reasonable chance of informing us when someone breaks your logic. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1450149130 From stuefe at openjdk.org Fri Jan 12 09:53:29 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 12 Jan 2024 09:53:29 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21] In-Reply-To: References: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> Message-ID: On Fri, 12 Jan 2024 06:30:03 GMT, Liming Liu wrote: >> src/hotspot/share/runtime/os.cpp line 2125: >> >>> 2123: for (char* cur = static_cast<char*>(first); /* break */; cur += page_size) { >>> 2124: Atomic::add(reinterpret_cast<int*>(cur), 0, memory_order_relaxed); >>> 2125: if (cur >= last) break; >> >> I suggest a slightly different flow, similar to how we do things in other areas: >> >> os.hpp >> >> >> private: >> bool pd_pretouch_memory(..); >> public: >> void pretouch_memory(..); >> >> >> os.cpp >> >> void os::pretouch_memory(..) { >> // Ask platform first >> if (pd_pretouch_memory(..)) { >> return; >> } >> ... do pretouching by touching >> } >> >> >> then provide a pd_pretouch for every platform; let other platforms be just a noop returning false, on Linux - if THPs are enabled and so forth, do the madvise and return true. >> >> One function less in the os namespace, and we don't call back from a pd_... function into a generic function which is unusual. > > Linux requires a different page size for touching when THP is enabled. Should I change page_size to a reference or let pd_pretouch_memory return the page size for touching rather than a false? Oh, you are right. Yes, that makes sense, return the pagesize as "touching pagesize overrides huge page size". Alternatively, return a boolean that says "touch with system page size, not with the assumed large page size". Up to you.
Also up to you if you do this as a separate return value (as in `bool os::pd_pretouch_memory(.., .., bool& use_system_page_size);`) or whether you do this via return value, e.g. as in "returns 0 if pretouching was done, >0 if not done, in which case it's the page size to use". (While writing this, I feel that a separate parameter is cleaner, but up to you). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1450153626 From ayang at openjdk.org Fri Jan 12 09:54:23 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 12 Jan 2024 09:54:23 GMT Subject: RFR: 8322383: G1: Only preserve marks on objects that are actually moved [v2] In-Reply-To: <9L5S-mj1VlGMSrAvwoI2H8pLiFIfPwVEXZRzOm538sQ=.3a015f57-b7d1-4c6e-b1d9-df7c9f83033a@github.com> References: <9L5S-mj1VlGMSrAvwoI2H8pLiFIfPwVEXZRzOm538sQ=.3a015f57-b7d1-4c6e-b1d9-df7c9f83033a@github.com> Message-ID: On Mon, 8 Jan 2024 16:41:59 GMT, Albert Mingkun Yang wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Move assert to GC-specific code > > src/hotspot/share/gc/g1/g1FullGCCompactionPoint.cpp line 106: > >> 104: // Store a forwarding pointer if the object should be moved. >> 105: if (cast_from_oop<HeapWord*>(object) != _compaction_top) { >> 106: preserved_stack()->push_if_necessary(object, object->mark()); > > Can this be made conditionally on whether the markword is NOT marked/forwarded? (IOW, move the added predicate in `markWord` here.) The rationale is to minimize changes to the shared code for G1 specific usages. My previous msg was probably unclear... I meant sth like: if (!object->is_forwarded) { preserved_stack()->push_if_necessary... } (The newly added assert here is incorrect -- fails in GHA.)
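The idea under review in this thread (preserve a mark only at the point where an object is actually forwarded, so that a prefix of live objects that stay in place costs nothing) can be illustrated with a toy model. This is not G1 code: `ToyObject`, `PreservedMarks`, `kDefaultMark` and `forward_live_objects` are all hypothetical stand-ins, and "mark differs from the default" is a simplification of the real must-be-preserved predicate.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

constexpr std::uintptr_t kDefaultMark = 0x1;  // stand-in for an unlocked, unhashed header

struct ToyObject {
  std::uintptr_t mark;       // stand-in for the header word
  std::size_t    address;    // current position in the toy heap
  std::size_t    size;
  std::size_t    forwardee;  // filled in by forward_live_objects
};

struct PreservedMarks {
  std::vector<std::pair<ToyObject*, std::uintptr_t>> entries;
  // Only non-default marks need saving; a default mark can be re-created.
  void push_if_necessary(ToyObject* obj, std::uintptr_t mark) {
    if (mark != kDefaultMark) entries.emplace_back(obj, mark);
  }
};

// Walks live objects in address order and assigns compaction targets.
// A mark is preserved only when the object really moves, because only
// then would a forwarding pointer overwrite its header.
void forward_live_objects(std::vector<ToyObject>& live,
                          std::size_t compaction_top,
                          PreservedMarks& preserved) {
  for (ToyObject& obj : live) {
    if (obj.address != compaction_top) {
      preserved.push_if_necessary(&obj, obj.mark);
    }
    obj.forwardee = compaction_top;
    compaction_top += obj.size;
  }
}
```

In this model an object with an interesting mark that already sits at the compaction top is skipped entirely, which corresponds to the "sediment" of non-moving objects mentioned in the PR description.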
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17159#discussion_r1450155749 From sspitsyn at openjdk.org Fri Jan 12 10:17:19 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 12 Jan 2024 10:17:19 GMT Subject: [jdk22] RFR: 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf In-Reply-To: References: Message-ID: On Fri, 12 Jan 2024 00:47:20 GMT, Serguei Spitsyn wrote: > Hi all, > > This pull request contains a clean backport of commit [2806adee](https://github.com/openjdk/jdk/commit/2806adee2d8cca6bc215f285888631799bd02eac) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Serguei Spitsyn on 10 Jan 2024 and was reviewed by Alex Menkov and Chris Plummer. > > Thanks! Alex, thank you for review! ------------- PR Comment: https://git.openjdk.org/jdk22/pull/64#issuecomment-1888811987 From sspitsyn at openjdk.org Fri Jan 12 10:20:22 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 12 Jan 2024 10:20:22 GMT Subject: [jdk22] Integrated: 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf In-Reply-To: References: Message-ID: On Fri, 12 Jan 2024 00:47:20 GMT, Serguei Spitsyn wrote: > Hi all, > > This pull request contains a clean backport of commit [2806adee](https://github.com/openjdk/jdk/commit/2806adee2d8cca6bc215f285888631799bd02eac) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Serguei Spitsyn on 10 Jan 2024 and was reviewed by Alex Menkov and Chris Plummer. > > Thanks! This pull request has now been integrated. 
Changeset: d3f18d04 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk22/commit/d3f18d0469d2eafbcfa527358c2817df24fde2c3 Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod 8321685: Missing ResourceMark in code called from JvmtiEnvBase::get_vthread_jvf Reviewed-by: amenkov Backport-of: 2806adee2d8cca6bc215f285888631799bd02eac ------------- PR: https://git.openjdk.org/jdk22/pull/64 From sspitsyn at openjdk.org Fri Jan 12 10:26:18 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 12 Jan 2024 10:26:18 GMT Subject: [jdk22] RFR: 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI In-Reply-To: <7qecwb5Uue8trLTpsy7TWsvuHwc3bGky6qIYR0vNfws=.2238f1ab-b619-4ea5-abc0-d1b74011ffa5@github.com> References: <7qecwb5Uue8trLTpsy7TWsvuHwc3bGky6qIYR0vNfws=.2238f1ab-b619-4ea5-abc0-d1b74011ffa5@github.com> Message-ID: On Fri, 12 Jan 2024 00:53:47 GMT, Serguei Spitsyn wrote: > Hi all, > > This pull request contains a clean backport of commit [aff659aa](https://github.com/openjdk/jdk/commit/aff659aaf7c73ff8eb903fd3e426e1b42ea6d95a) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Serguei Spitsyn on 21 Dec 2023 and was reviewed by David Holmes and Alan Bateman. > > Thanks! Thank you for review, David! ------------- PR Comment: https://git.openjdk.org/jdk22/pull/65#issuecomment-1888826751 From ihse at openjdk.org Fri Jan 12 10:56:25 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 12 Jan 2024 10:56:25 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v5] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 13:04:35 GMT, Julian Waters wrote: > I can't believe I missed something so obvious Don't blame yourself. No-one has noticed for the at least 3 years the code has been present. 
:) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1888874780 From ayang at openjdk.org Fri Jan 12 10:59:45 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 12 Jan 2024 10:59:45 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v12] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 15:46:45 GMT, Lei Zaakjyu wrote: >> 8234502: Merge GenCollectedHeap and SerialHeap > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > restore and rename 'GenCollectedHeap' to 'SerialHeap' The followup ticket: https://bugs.openjdk.org/browse/JDK-8323660 ------------- PR Comment: https://git.openjdk.org/jdk/pull/16927#issuecomment-1888878634 From duke at openjdk.org Fri Jan 12 10:59:47 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Fri, 12 Jan 2024 10:59:47 GMT Subject: Integrated: 8234502: Merge GenCollectedHeap and SerialHeap In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 16:47:28 GMT, Lei Zaakjyu wrote: > 8234502: Merge GenCollectedHeap and SerialHeap This pull request has now been integrated. 
Changeset: 7dc9dd6f Author: Lei Zaakjyu Committer: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/7dc9dd6fdf500bb5156983097bc399d286407afb Stats: 3116 lines in 21 files changed: 1509 ins; 1579 del; 28 mod 8234502: Merge GenCollectedHeap and SerialHeap Reviewed-by: ayang, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/16927 From sspitsyn at openjdk.org Fri Jan 12 12:00:22 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 12 Jan 2024 12:00:22 GMT Subject: [jdk22] Integrated: 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI In-Reply-To: <7qecwb5Uue8trLTpsy7TWsvuHwc3bGky6qIYR0vNfws=.2238f1ab-b619-4ea5-abc0-d1b74011ffa5@github.com> References: <7qecwb5Uue8trLTpsy7TWsvuHwc3bGky6qIYR0vNfws=.2238f1ab-b619-4ea5-abc0-d1b74011ffa5@github.com> Message-ID: On Fri, 12 Jan 2024 00:53:47 GMT, Serguei Spitsyn wrote: > Hi all, > > This pull request contains a clean backport of commit [aff659aa](https://github.com/openjdk/jdk/commit/aff659aaf7c73ff8eb903fd3e426e1b42ea6d95a) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Serguei Spitsyn on 21 Dec 2023 and was reviewed by David Holmes and Alan Bateman. > > Thanks! This pull request has now been integrated. 
Changeset: 71a05bf0 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk22/commit/71a05bf03f4789f04cdba205c4fd3dc6d2dd0a65 Stats: 12 lines in 1 file changed: 0 ins; 12 del; 0 mod 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI Reviewed-by: dholmes Backport-of: aff659aaf7c73ff8eb903fd3e426e1b42ea6d95a ------------- PR: https://git.openjdk.org/jdk22/pull/65 From rkennke at openjdk.org Fri Jan 12 12:02:37 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 12 Jan 2024 12:02:37 GMT Subject: RFR: 8322383: G1: Only preserve marks on objects that are actually moved [v3] In-Reply-To: References: Message-ID: > The G1 full-GC preserves marks during marking, for all live objects in compaction region. However, not all live objects do actually move. In particular, the start of a compaction chain may have a sediment of all-live objects which would not move, and thus don't need to have their marks preserved. > The problem can easily be solved by preserving marks during forwarding. That also seems a more natural place to do that. 
> > Testing: > - [x] hotspot_gc > - [x] tier1 > - [ ] tier2 Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Cleanup - Move forwarded-predicate into G1 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17159/files - new: https://git.openjdk.org/jdk/pull/17159/files/32625570..d570e046 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17159&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17159&range=01-02 Stats: 4 lines in 2 files changed: 1 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/17159.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17159/head:pull/17159 PR: https://git.openjdk.org/jdk/pull/17159 From ayang at openjdk.org Fri Jan 12 12:09:20 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 12 Jan 2024 12:09:20 GMT Subject: RFR: 8322383: G1: Only preserve marks on objects that are actually moved [v3] In-Reply-To: References: Message-ID: On Fri, 12 Jan 2024 12:02:37 GMT, Roman Kennke wrote: >> The G1 full-GC preserves marks during marking, for all live objects in compaction region. However, not all live objects do actually move. In particular, the start of a compaction chain may have a sediment of all-live objects which would not move, and thus don't need to have their marks preserved. >> The problem can easily be solved by preserving marks during forwarding. That also seems a more natural place to do that. >> >> Testing: >> - [x] hotspot_gc >> - [x] tier1 >> - [ ] tier2 > > Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: > > - Cleanup > - Move forwarded-predicate into G1 Marked as reviewed by ayang (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/17159#pullrequestreview-1818090942 From rkennke at openjdk.org Fri Jan 12 12:14:19 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 12 Jan 2024 12:14:19 GMT Subject: RFR: 8322383: G1: Only preserve marks on objects that are actually moved [v3] In-Reply-To: References: <9L5S-mj1VlGMSrAvwoI2H8pLiFIfPwVEXZRzOm538sQ=.3a015f57-b7d1-4c6e-b1d9-df7c9f83033a@github.com> Message-ID: On Fri, 12 Jan 2024 09:52:08 GMT, Albert Mingkun Yang wrote: >> src/hotspot/share/gc/g1/g1FullGCCompactionPoint.cpp line 106: >> >>> 104: // Store a forwarding pointer if the object should be moved. >>> 105: if (cast_from_oop(object) != _compaction_top) { >>> 106: preserved_stack()->push_if_necessary(object, object->mark()); >> >> Can this be made conditionally on whether the markword is NOT marked/forwarded? (IOW, move the added predicate in `markWord` here.) The rationale is to minimize changes to the shared code for G1 specific usages. > > My previous msg was probably unclear... I meant sth like: > > > if (!object->is_forwarded) { > preserved_stack()->push_if_necessary... > } > > > (The newly added assert here is incorrect -- fails in GHA.) Oh I see! This makes more sense, yes. I made that change and cleaned up non G1 code. BTW: I am working on new/alternative full-GCs for Serial, G1 and Shenandoah that don't store forwarding pointers in the mark-word at all, and thus avoid all that preserved-headers stuff (and together with the OM-world work that some Oracle engineers are working on, get rid of header displacement altogether). 
The SerialGC version is almost ready in Lilliput: https://github.com/openjdk/lilliput/pull/122 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17159#discussion_r1450351340 From aboldtch at openjdk.org Fri Jan 12 14:37:11 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 12 Jan 2024 14:37:11 GMT Subject: RFR: 8319773: Avoid inflating monitors when installing hash codes for LM_LIGHTWEIGHT [v12] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 15:34:34 GMT, Axel Boldt-Christmas wrote: >> LM_LIGHTWEIGHT only uses the lock bits for its locking. This leaves the hashCode bits free when a monitor is not inflated. So instead of inflating when installing the hashCode on a fast locked object it can simply use the hashCode bits in the markWord. >> >> The mark word transitions Unlocked (0b01) <=> Locked (0b00) are done by retrying the CAS if it fails due to non-lock bit changes. >> The mark word transitions Monitor (0b10) <=> Locked/Unlocked (0b0X) are the same as before, inflation already handles hash codes. This change does not interact with the mark word if it is in a Monitor (0b10) state, so the strong CAS which is used for deflation are still valid, and will not fail to any other reason than the cooperative race to help transition the mark word during deflation. >> >> This is dependent on JDK-8319778 simply because JDK-8319797 is dependent on both this and JDK-8319778. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 18 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/pr/16602' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319778 > - Fix copy paste typo. > - Update src/hotspot/share/opto/library_call.cpp > > Co-authored-by: Tobias Hartmann > - Add retry CAS comment > - Use is_neutral over is_unlocked > - Merge remote-tracking branch 'upstream_jdk/pr/16602' into JDK-8319773 > - ... and 8 more: https://git.openjdk.org/jdk/compare/dfa488c1...a83ad377 Ran tier1-5 all Oracle platforms with both LM_LEGACY and LM_LIGHTWEIGHT after the merge. Will go ahead and integrate this. I'll create an RFE for the arm32 hashcode optimisation as well as for the re-inflate race. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16603#issuecomment-1889370923 From aboldtch at openjdk.org Fri Jan 12 14:37:13 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 12 Jan 2024 14:37:13 GMT Subject: Integrated: 8319773: Avoid inflating monitors when installing hash codes for LM_LIGHTWEIGHT In-Reply-To: References: Message-ID: <95mMp9TNst2ryk54ZeAmsmazEbH3mDGd_JrCt6Dnbos=.efc02f6a-92ec-4aa3-ba14-ec5ed569e2d6@github.com> On Fri, 10 Nov 2023 12:18:29 GMT, Axel Boldt-Christmas wrote: > LM_LIGHTWEIGHT only uses the lock bits for its locking. This leaves the hashCode bits free when a monitor is not inflated. So instead of inflating when installing the hashCode on a fast locked object it can simply use the hashCode bits in the markWord. > > The mark word transitions Unlocked (0b01) <=> Locked (0b00) are done by retrying the CAS if it fails due to non-lock bit changes. 
> The mark word transitions Monitor (0b10) <=> Locked/Unlocked (0b0X) are the same as before, inflation already handles hash codes. This change does not interact with the mark word if it is in a Monitor (0b10) state, so the strong CAS which is used for deflation are still valid, and will not fail to any other reason than the cooperative race to help transition the mark word during deflation. > > This is dependent on JDK-8319778 simply because JDK-8319797 is dependent on both this and JDK-8319778. This pull request has now been integrated. Changeset: 65a06727 Author: Axel Boldt-Christmas URL: https://git.openjdk.org/jdk/commit/65a0672791f868556776fc435b37319ed69f7c84 Stats: 76 lines in 4 files changed: 29 ins; 23 del; 24 mod 8319773: Avoid inflating monitors when installing hash codes for LM_LIGHTWEIGHT Reviewed-by: rkennke, dcubed, thartmann ------------- PR: https://git.openjdk.org/jdk/pull/16603 From stuefe at openjdk.org Fri Jan 12 17:37:20 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 12 Jan 2024 17:37:20 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates [v3] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 09:17:35 GMT, Aleksey Shipilev wrote: >> We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. >> >> There are a few interesting conversions along the way: >> 1. `intptr_t` -> `uint32_t` (this method) >> 2. `uint32_t` -> `int32_t` (argument conversion for `movl`) >> 3. 
`int32_t` -> `uint32_t` (in `emit_int32`) >> >> I believe these are safe after `is_uimm32` check, but please check (sic) me on this. >> >> Note that x86_64 matcher already does similar thing for immediates: >> >> >> // Long Immediate 32-bit unsigned >> operand immUL32() >> %{ >> predicate(n->get_long() == (unsigned int) (n->get_long())); >> match(ConL); >> ... >> %} >> >> instruct loadConUL32(rRegL dst, immUL32 src) >> %{ >> ... >> format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} >> ins_encode %{ >> __ movl($dst$$Register, $src$$constant); >> %} >> %} >> >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> >> Code sizes for `Hello World`, `-Xcomp`: >> >> >> # Before >> tier1 nmethod code size : 426208 bytes >> tier2 nmethod code size : 462880 bytes >> tier3 nmethod code size : 889992 bytes >> tier4 nmethod code size : 1244448 bytes >> >> # After >> tier1 nmethod code size : 425768 bytes (-0.1%) >> tier2 nmethod code size : 462400 bytes (-0.1%) >> tier3 nmethod code size : 882072 bytes (-0.8%) >> tier4 nmethod code size : 1236448 bytes (-0.6%) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Revert "Just do checked_cast" > > This reverts commit 3f94218b46b6b0492ffcc24404b7bb5546b3318a. Tests on my issue are green with your variant of movptr ------------- PR Comment: https://git.openjdk.org/jdk/pull/17343#issuecomment-1889698975 From amenkov at openjdk.org Fri Jan 12 20:44:33 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 12 Jan 2024 20:44:33 GMT Subject: Integrated: JDK-8318563: GetClassFields should not use random access to field In-Reply-To: References: Message-ID: On Wed, 13 Dec 2023 21:32:50 GMT, Alex Menkov wrote: > FieldStream/FilteredFieldStream classes from reflectionUtils.hpp iterate class fields in the reverse order and use field indexes to access instead of forward iteration. 
This is performance ineffective (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). > The change introduces new class FilteredJavaFieldStream as a replacement for FilteredFieldStream. > It uses the same FilteredField/FilteredFieldsMap stuff as FilteredJavaFieldStream does. > > FieldStream/FilteredFieldStream are still used by heap walking API, will be cleaned by [JDK-8317636](https://bugs.openjdk.org/browse/JDK-8317636) > > Testing: > - tier1..3 > - all tests which calls GetClassFields: open/test/hotspot/jtreg/serviceability/jvmti,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/IterateThroughHeap,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/IsSynthetic > including > - test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007.java - tests that GetClassFields returns fields in correct order; > - test/hotspot/jtreg/serviceability/jvmti/GetClassFields/FilteredFields/FilteredFieldsTest.java - test that GetClassFields filters out field like reflection. This pull request has now been integrated. 
Changeset: 84cf4cb3 Author: Alex Menkov URL: https://git.openjdk.org/jdk/commit/84cf4cb350331aac147fdf4c6d130cdf5448c987 Stats: 52 lines in 2 files changed: 35 ins; 8 del; 9 mod 8318563: GetClassFields should not use random access to field Reviewed-by: sspitsyn, cjplummer, fparain ------------- PR: https://git.openjdk.org/jdk/pull/17094 From duke at openjdk.org Sat Jan 13 02:27:28 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Sat, 13 Jan 2024 02:27:28 GMT Subject: RFR: 8323693: Update some copyright announcements in the new files created in 8234502 Message-ID: see 'https://github.com/openjdk/jdk/pull/17398#pullrequestreview-1819022919' ------------- Commit messages: - update copyright Changes: https://git.openjdk.org/jdk/pull/17412/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17412&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323693 Stats: 18 lines in 18 files changed: 0 ins; 0 del; 18 mod Patch: https://git.openjdk.org/jdk/pull/17412.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17412/head:pull/17412 PR: https://git.openjdk.org/jdk/pull/17412 From cjplummer at openjdk.org Sun Jan 14 02:56:18 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Sun, 14 Jan 2024 02:56:18 GMT Subject: RFR: 8323693: Update some copyright announcements in the new files created in 8234502 In-Reply-To: References: Message-ID: On Sat, 13 Jan 2024 02:21:37 GMT, Lei Zaakjyu wrote: > see 'https://github.com/openjdk/jdk/pull/17398#pullrequestreview-1819022919' Looks good. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17412#pullrequestreview-1820135652 From duke at openjdk.org Sun Jan 14 07:00:21 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Sun, 14 Jan 2024 07:00:21 GMT Subject: RFR: 8323693: Update some copyright announcements in the new files created in 8234502 In-Reply-To: References: Message-ID: On Sat, 13 Jan 2024 02:21:37 GMT, Lei Zaakjyu wrote: > see 'https://github.com/openjdk/jdk/pull/17398#pullrequestreview-1819022919' This patch seems trivial. I wonder if it needs two reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17412#issuecomment-1890866200 From dholmes at openjdk.org Sun Jan 14 22:17:50 2024 From: dholmes at openjdk.org (David Holmes) Date: Sun, 14 Jan 2024 22:17:50 GMT Subject: [jdk22] RFR: 8323243: JNI invocation of an abstract instance method corrupts the stack Message-ID: Hi all, This pull request contains a backport of commit [71d9a83d](https://github.com/openjdk/jdk/commit/71d9a83dece7eb4bdb6ffdd9caf14a1348045ce0) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by David Holmes on 14 Jan 2024 and was reviewed by Coleen Phillimore and Aleksey Shipilev. Thanks! 
------------- Commit messages: - Backport 71d9a83dece7eb4bdb6ffdd9caf14a1348045ce0 Changes: https://git.openjdk.org/jdk22/pull/73/files Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=73&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323243 Stats: 159 lines in 4 files changed: 159 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk22/pull/73.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/73/head:pull/73 PR: https://git.openjdk.org/jdk22/pull/73 From dholmes at openjdk.org Mon Jan 15 02:09:21 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 15 Jan 2024 02:09:21 GMT Subject: RFR: 8323693: Update some copyright announcements in the new files created in 8234502 In-Reply-To: References: Message-ID: On Sat, 13 Jan 2024 02:21:37 GMT, Lei Zaakjyu wrote: > see 'https://github.com/openjdk/jdk/pull/17398#pullrequestreview-1819022919' Yes this could be considered trivial, but Reviewed anyway. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17412#pullrequestreview-1820494931 From david.holmes at oracle.com Mon Jan 15 02:48:53 2024 From: david.holmes at oracle.com (David Holmes) Date: Mon, 15 Jan 2024 12:48:53 +1000 Subject: static_cast(0) vs do while In-Reply-To: References: Message-ID: <14081f63-6fe3-4f81-8248-fd1518bf6fad@oracle.com> Hi Julian, On 12/01/2024 3:24 pm, Julian Waters wrote: > Hi all, > > In my personal fork of HotSpot I have the following commit > https://github.com/TheShermanTanker/jdk/commit/54131b70d40a88ab4176d23821f4c32044c0043d > which replaces some occurrences of do while with a discarding > static_cast (static_cast(0) is a nop), to avoid inefficiencies > in the compiled code for debug mode, when optimizations are turned off > (Namely a compare and jump back to the start of the loop on a > condition that is always false). Is this minor optimization worth > committing upstream to HotSpot? 
It should also mean that the compiled > code is clearer when the methods containing asserts are disassembled. Sorry that just looks weird to me - I'm thinking "what on earth is a cast doing here?". `do { ... } while (0)` is a long standing idiomatic way to write a multi-line macro so that it can be used as a statement. I would think any compiler worth its salt would see the loop never repeats and just discard it. Cheers, David > best regards, > Julian From duke at openjdk.org Mon Jan 15 05:06:18 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Mon, 15 Jan 2024 05:06:18 GMT Subject: RFR: 8323693: Update some copyright announcements in the new files created in 8234502 In-Reply-To: References: Message-ID: <4CHz4gTyf8uM5dokgRsdrN3IFgMpZ3xsIy9Wb8MijGg=.c43f25cc-0e29-40dd-b24c-aaff7bde9169@github.com> On Sat, 13 Jan 2024 02:21:37 GMT, Lei Zaakjyu wrote: > see 'https://github.com/openjdk/jdk/pull/17398#pullrequestreview-1819022919' thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17412#issuecomment-1891307949 From duke at openjdk.org Mon Jan 15 06:45:54 2024 From: duke at openjdk.org (Liming Liu) Date: Mon, 15 Jan 2024 06:45:54 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v27] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
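> | Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
> |--------|-------|-------|------|------|
> | 4.18   | 11.30 | 11.30 | 0.25 | 0.25 |
> | 5.13   | 0.22  | 0.22  | 3.42 | 3.42 |
> | 6.1    | 0.27  | 0.33  | 3.54 | 0.33 |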
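The pretouch strategy described in the quoted PR text (ask the kernel to populate the range with `madvise(MADV_POPULATE_WRITE)` where supported, otherwise touch each page) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the pull request: `pretouch_range` is a hypothetical name, the fallback uses the GCC/Clang `__atomic_fetch_add` builtin as a stand-in for the atomic-add-0 touch mentioned in the PR title, and the fallback value 23 for `MADV_POPULATE_WRITE` is only there for older libc headers that lack the constant.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23  // Linux UAPI value; older libc headers may not define it
#endif

// Hypothetical helper, not the HotSpot implementation.
// `start` is assumed to be page-aligned and `bytes` a multiple of the page size.
// Returns true if the kernel populated the range via madvise, false if the
// per-page fallback was used (for example on kernels older than 5.14).
bool pretouch_range(void* start, size_t bytes) {
  if (madvise(start, bytes, MADV_POPULATE_WRITE) == 0) {
    return true;
  }
  // Fallback: write-touch one word per page without changing its value.
  const long page = sysconf(_SC_PAGESIZE);
  assert(page > 0);
  char* base = static_cast<char*>(start);
  for (size_t off = 0; off < bytes; off += static_cast<size_t>(page)) {
    __atomic_fetch_add(reinterpret_cast<int*>(base + off), 0, __ATOMIC_RELAXED);
  }
  return false;
}
```

Either path leaves the memory contents unchanged; the difference the PR measures is how the pages get faulted in, which is what affects transparent huge page formation.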
Liming Liu has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 35 commits: - Use the suggested structure and delect MADV_POPULATE_WRITE during initialization - Make it fail with 2MB huge pages 2MB huge pages are easier to form. After becoming shredded, there would still be a small number of 2MB huge pages. Thus this commit enlarges the boundary of usage from 0 to the half of the heap. - Reduce a return - untabify - Correct the fallback case - Add an option to be able not to use MADV_POPULATE_WRITE - Use serial GC for the test - Use pthread instead - Remove the deletion - Fix type errors - ... and 25 more: https://git.openjdk.org/jdk/compare/bdee968e...cf617f21 ------------- Changes: https://git.openjdk.org/jdk/pull/15781/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=26 Stats: 245 lines in 10 files changed: 230 ins; 7 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From duke at openjdk.org Mon Jan 15 06:50:49 2024 From: duke at openjdk.org (Liming Liu) Date: Mon, 15 Jan 2024 06:50:49 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v28] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
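> | Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
> |--------|-------|-------|------|------|
> | 4.18   | 11.30 | 11.30 | 0.25 | 0.25 |
> | 5.13   | 0.22  | 0.22  | 3.42 | 3.42 |
> | 6.1    | 0.27  | 0.33  | 3.54 | 0.33 |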
Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Untabify ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/cf617f21..af5c5dc5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=27 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=26-27 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From johannes.bechberger at sap.com Mon Jan 15 06:51:13 2024 From: johannes.bechberger at sap.com (Bechberger, Johannes) Date: Mon, 15 Jan 2024 06:51:13 +0000 Subject: static_cast(0) vs do while In-Reply-To: <14081f63-6fe3-4f81-8248-fd1518bf6fad@oracle.com> References: <14081f63-6fe3-4f81-8248-fd1518bf6fad@oracle.com> Message-ID: Hi David and Julian, I just checked it with GCC (https://godbolt.org/z/qcYjn6dn9) and Clang (https://godbolt.org/z/zvfnba5Tf) in Godbolt: GCC without any optimizations discards the loop and Clang replaces it with a single jump, while MSVC (https://godbolt.org/z/saj1sreen) and ICC (https://godbolt.org/z/saj1sreen) produce a full loop. The change is therefore only a minor optimization on MSVC and ICC. Regards Johannes From: hotspot-dev on behalf of David Holmes Date: Monday, 15. January 2024 at 03:49 To: hotspot-dev at openjdk.org Subject: Re: static_cast(0) vs do while
Hi Julian, On 12/01/2024 3:24 pm, Julian Waters wrote: > Hi all, > > In my personal fork of HotSpot I have the following commit > https://github.com/TheShermanTanker/jdk/commit/54131b70d40a88ab4176d23821f4c32044c0043d > which replaces some occurrences of do while with a discarding > static_cast (static_cast(0) is a nop), to avoid inefficiencies > in the compiled code for debug mode, when optimizations are turned off > (Namely a compare and jump back to the start of the loop on a > condition that is always false). Is this minor optimization worth > committing upstream to HotSpot? It should also mean that the compiled > code is clearer when the methods containing asserts are disassembled. Sorry that just looks weird to me - I'm thinking "what on earth is a cast doing here?". `do { ... } while (0)` is a long standing idiomatic way to write a multi-line macro so that it can be used as a statement. I would think any compiler worth its salt would see the loop never repeats and just discard it. Cheers, David > best regards, > Julian
From duke at openjdk.org Mon Jan 15 07:43:35 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Mon, 15 Jan 2024 07:43:35 GMT Subject: Integrated: 8323693: Update some copyright announcements in the new files created in 8234502 In-Reply-To: References: Message-ID: On Sat, 13 Jan 2024 02:21:37 GMT, Lei Zaakjyu wrote: > see 'https://github.com/openjdk/jdk/pull/17398#pullrequestreview-1819022919' This pull request has now been integrated. Changeset: 922f8e44 Author: Lei Zaakjyu Committer: David Holmes URL: https://git.openjdk.org/jdk/commit/922f8e44eed74b79a76a3628ebd0bca144e28091 Stats: 18 lines in 18 files changed: 0 ins; 0 del; 18 mod 8323693: Update some copyright announcements in the new files created in 8234502 Reviewed-by: cjplummer, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/17412 From aboldtch at openjdk.org Mon Jan 15 07:55:32 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 15 Jan 2024 07:55:32 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v11] In-Reply-To: References: Message-ID: > Implements the runtime part of JDK-8319796. > The different CPU implementations are/will be created as dependent pull requests. > > This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enter. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two topmost entries on the lock stack to determine what to do in a monitor exit.
>
> A high level overview:
> * Locking is still performed on the mark word
>   * Unlocked (0b01) <=> Locked (0b00)
> * Monitor enter on Obj with mark word Unlocked (0b01) is the same
>   * Transition Obj's mark word Unlocked (0b01) => Locked (0b00)
>   * Push Obj onto the lock stack
>   * Success
> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack
>   * If top entry is Obj
>     * Push Obj on the lock stack
>     * Success
>   * If top entry is not Obj
>     * Inflate and call ObjectMonitor::enter
> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack
>   * If just the top entry is Obj
>     * Transition Obj's mark word Locked (0b00) => Unlocked (0b01)
>     * Pop the entry
>     * Success
>   * If both entries are Obj
>     * Pop the top entry
>     * Success
>   * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit
> * If the monitor has been inflated for object Obj which is owned by the current thread
>   * All corresponding entries for Obj are removed from the lock stack
>   * The monitor recursions are set to the number of removed entries - 1
>   * The owner is changed from anonymous to the thread
>   * The regular ObjectMonitor::action is called.
Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 32 commits: - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319797 - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 - Avoid copy from and to the same location - Fix typo - Update unstructured unlock comment - Fix bad indent after merge - ...
and 22 more: https://git.openjdk.org/jdk/compare/922f8e44...a4e372aa ------------- Changes: https://git.openjdk.org/jdk/pull/16606/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16606&range=10 Stats: 676 lines in 10 files changed: 634 ins; 7 del; 35 mod Patch: https://git.openjdk.org/jdk/pull/16606.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16606/head:pull/16606 PR: https://git.openjdk.org/jdk/pull/16606 From epeter at openjdk.org Mon Jan 15 09:00:10 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 15 Jan 2024 09:00:10 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v16] In-Reply-To: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> References: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> Message-ID: On Thu, 11 Jan 2024 13:43:25 GMT, Roland Westrelin wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed typo > > src/hotspot/share/ci/ciMethodData.cpp line 96: > >> 94: // a safepoint. We temporarily release the lock and allow >> 95: // safepoints, and revert that at the end of the scope: >> 96: MutexUnlocker mu(_mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); > > Why is this safe? @rwestrel I still believe this is safe. But maybe also ugly. I looked into making the locking more fine-grained, so that we could avoid unlocking the lock temporarily. The biggest problem is in `ciMethodData::load_remaining_extra_data`. Here we first (iteratively) clean, and then assume that we still hold the lock when we copy it for the `ciMethodData`. Hence, it seems the lock has to be held at this outer scope, but then temporarily unlocked to allow calls to `get_method` in `PrepareExtraDataClosure::finish`. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1452084643 From shade at openjdk.org Mon Jan 15 09:13:22 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Jan 2024 09:13:22 GMT Subject: RFR: 8323519: Add applications/ctw/modules to Hotspot tiered testing [v2] In-Reply-To: References: <3jbVl4E9L0TXjD3aRnQJ3Y81wqf5nnOUXOl2cI9UVns=.32290c86-9dda-4f32-927e-6f673a20ebec@github.com> Message-ID: On Thu, 11 Jan 2024 08:56:47 GMT, Aleksey Shipilev wrote: >> Noticed that `applications/ctw/modules` is missing from current `tier{1,2,3,4}` hotspot definitions, since tier4 specifically excludes all applications. That exclusion was due to potentially unprepared dependencies that are needed for testing: for example jcstress tests would fail with the default configuration. But CTW for JDK modules works well out of the box, so we can add it somewhere in high tier, for example tier3. It should be useful to catch compiler bugs early, before running `hotspot:all`. >> >> These tests take quite a bit of time (~15 mins on my M1), so I opted to add them to relevant "slow" group that is run in tier3. >> >> Additional testing: >> - [x] Checked that `tier3_compiler` runs CTW tests > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Adding directly to tier3_compiler Thanks! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17348#issuecomment-1891631744 From shade at openjdk.org Mon Jan 15 09:13:25 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Jan 2024 09:13:25 GMT Subject: Integrated: 8323519: Add applications/ctw/modules to Hotspot tiered testing In-Reply-To: <3jbVl4E9L0TXjD3aRnQJ3Y81wqf5nnOUXOl2cI9UVns=.32290c86-9dda-4f32-927e-6f673a20ebec@github.com> References: <3jbVl4E9L0TXjD3aRnQJ3Y81wqf5nnOUXOl2cI9UVns=.32290c86-9dda-4f32-927e-6f673a20ebec@github.com> Message-ID: On Wed, 10 Jan 2024 13:43:23 GMT, Aleksey Shipilev wrote: > Noticed that `applications/ctw/modules` is missing from current `tier{1,2,3,4}` hotspot definitions, since tier4 specifically excludes all applications. That exclusion was due to potentially unprepared dependencies that are needed for testing: for example jcstress tests would fail with the default configuration. But CTW for JDK modules works well out of the box, so we can add it somewhere in high tier, for example tier3. It should be useful to catch compiler bugs early, before running `hotspot:all`. > > These tests take quite a bit of time (~15 mins on my M1), so I opted to add them to relevant "slow" group that is run in tier3. > > Additional testing: > - [x] Checked that `tier3_compiler` runs CTW tests This pull request has now been integrated. 
Changeset: ba3c3bbd Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/ba3c3bbd879eaf7532663663d73e21fafc65b574 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8323519: Add applications/ctw/modules to Hotspot tiered testing Reviewed-by: xliu, kvn ------------- PR: https://git.openjdk.org/jdk/pull/17348 From stefank at openjdk.org Mon Jan 15 10:04:41 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 15 Jan 2024 10:04:41 GMT Subject: [jdk22] RFR: 8322957: Generational ZGC: Relocation selection must join the STS Message-ID: Hi all, This pull request contains a backport of commit [ba23025c](https://github.com/openjdk/jdk/commit/ba23025cd8a9c1af37afea6444ce5ea2ff41e5af) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Stefan Karlsson on 12 Jan 2024 and was reviewed by Erik Österlund and Axel Boldt-Christmas. Thanks! ------------- Commit messages: - Backport ba23025cd8a9c1af37afea6444ce5ea2ff41e5af Changes: https://git.openjdk.org/jdk22/pull/74/files Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=74&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322957 Stats: 168 lines in 14 files changed: 127 ins; 20 del; 21 mod Patch: https://git.openjdk.org/jdk22/pull/74.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/74/head:pull/74 PR: https://git.openjdk.org/jdk22/pull/74 From stefank at openjdk.org Mon Jan 15 10:22:38 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 15 Jan 2024 10:22:38 GMT Subject: RFR: 8323716: Only print ZGC Phase Switch events in hs_err files when running with ZGC Message-ID: Don't print the ZGC Phase Switch hs_err section when the JVM is run with other GCs than ZGC. I've tested this manually with `-XX:ErrorHandlerTest=3 -version` and verified that the section is logged when ZGC is used, and that it is not logged when G1 is used.
------------- Commit messages: - 8323716: Only print ZGC Phase Switch events in hs_err files when runing with ZGC Changes: https://git.openjdk.org/jdk/pull/17420/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17420&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323716 Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17420.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17420/head:pull/17420 PR: https://git.openjdk.org/jdk/pull/17420 From aboldtch at openjdk.org Mon Jan 15 10:26:23 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 15 Jan 2024 10:26:23 GMT Subject: RFR: 8323716: Only print ZGC Phase Switch events in hs_err files when running with ZGC In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 10:16:12 GMT, Stefan Karlsson wrote: > Don't print the ZGC Phase Switch hs_err section when the JVM is run with other GCs than ZGC. > > I've tested this manually with `-XX:ErrorHandlerTest=3 -version` and verified that the section is logged when ZGC is used, and that it is not logged when G1 is used. lgtm. ------------- Marked as reviewed by aboldtch (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17420#pullrequestreview-1821392329 From epeter at openjdk.org Mon Jan 15 10:26:24 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 15 Jan 2024 10:26:24 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v16] In-Reply-To: References: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> Message-ID: On Mon, 15 Jan 2024 08:57:35 GMT, Emanuel Peter wrote: >> src/hotspot/share/ci/ciMethodData.cpp line 96: >> >>> 94: // a safepoint. We temporarily release the lock and allow >>> 95: // safepoints, and revert that at the end of the scope: >>> 96: MutexUnlocker mu(_mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); >> >> Why is this safe? 
> > @rwestrel I still believe this is safe. But maybe also ugly. > > I looked into making the locking more fine-grained, so that we could avoid unlocking the lock temporarily. > The biggest problem is in `ciMethodData::load_remaining_extra_data`. Here we first (iteratively) clean, and then assume that we still hold the lock when we copy it for the `ciMethodData`. Hence, it seems the lock has to be held at this outer scope, but then temporarily unlocked to allow calls to `get_method` in `PrepareExtraDataClosure::finish`. Alternatives to make it prettier: Make `prepare_metadata` lock, and pass out an object that holds that lock, i.e. widen the scope of the `MutexLocker`. Maybe this can be done with return-value-optimization? But I'm not sure this is a great idea. Another idea @chhagedorn and I thought about was having some Locker object that you can call lock/unlock on, repeatedly. But once the Locker goes out of scope, it checks if it is in the locked state, and only unlocks then. Or maybe it asserts that it is in the locked state, and then unlocks. Because essentially we need to allow the retry-logic to unlock in between tries. But we also still need to access the `uncached_methods` array that is filled inside the locked region. I'm not sure such a refactoring is worth it.
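For illustration only, the "Locker object that you can call lock/unlock on, repeatedly" idea could look roughly like the sketch below. It is not code from this PR: `std::mutex` stands in for HotSpot's `Mutex`, and the names `RelockableLocker` and `retry_with_unlock` are hypothetical. Explicit lock/unlock calls assert on the current state, and the destructor unlocks only if the lock is still held.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical scoped locker that may be unlocked and re-locked inside its
// scope. std::mutex is only a stand-in for HotSpot's Mutex.
class RelockableLocker {
 public:
  explicit RelockableLocker(std::mutex& m) : _mutex(m), _locked(false) {
    lock();
  }

  // Leaving the scope releases the mutex only if it is currently held.
  ~RelockableLocker() {
    if (_locked) {
      _mutex.unlock();
    }
  }

  RelockableLocker(const RelockableLocker&) = delete;
  RelockableLocker& operator=(const RelockableLocker&) = delete;

  void lock() {
    assert(!_locked && "already locked by this locker");
    _mutex.lock();
    _locked = true;
  }

  void unlock() {
    assert(_locked && "not locked by this locker");
    _mutex.unlock();
    _locked = false;
  }

  bool is_locked() const { return _locked; }

 private:
  std::mutex& _mutex;
  bool _locked;
};

// Models the retry logic: each try runs under the lock; between tries the
// lock is dropped (the point where a safepoint could be allowed) and then
// re-acquired. Returns the number of tries that were made.
inline int retry_with_unlock(std::mutex& m, int tries_needed) {
  RelockableLocker locker(m);
  int tries = 0;
  while (true) {
    ++tries;                  // work done while holding the lock
    if (tries >= tries_needed) {
      break;                  // done; the lock is still held here
    }
    locker.unlock();          // give up the lock between tries
    locker.lock();            // take it again before the next try
  }
  return tries;               // the destructor releases the lock
}
```

In standard C++, `std::unique_lock` already offers much the same interface (`lock()`, `unlock()`, `owns_lock()`); whether something similar fits HotSpot's lock-rank and safepoint checks is a separate question that this sketch does not address.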
Let me know what you think @rwestrel ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1452186783 From epeter at openjdk.org Mon Jan 15 10:29:22 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 15 Jan 2024 10:29:22 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v16] In-Reply-To: References: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> Message-ID: On Thu, 11 Jan 2024 15:59:47 GMT, Emanuel Peter wrote: >> src/hotspot/share/ci/ciMethodData.cpp line 135: >> >>> 133: >>> 134: // Lock to read ProfileData, and ensure lock is not unintentionally broken by a safepoint >>> 135: MutexLocker ml(mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); >> >> Is there anyway to have MutexLocker take care of verifying that the there's no safepoint? It would be nice to replace: >> >> >> MutexLocker ml(); >> NoSafepointVerifier no_safepoint; >> >> >> by: >> >> >> MutexLocker ml(); >> >> >> only. > > @fisk @tkrodriguez what do you suggest for that? Should I create a wrapper object for it? Maybe a `MutexLockerAndNoSafepointVerifier`? Would I place that simply in `methodData.hpp`? Or make it available more widely? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1452191879 From shade at openjdk.org Mon Jan 15 11:06:33 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Jan 2024 11:06:33 GMT Subject: RFR: 8323717: Introduce test keyword for tests that need external dependencies Message-ID: Some jtreg tests require resolvable external dependencies. This resolution is delegated to JIB, which is not used in vanilla OpenJDK testing. It would be convenient to add a keyword that marks tests that require these external dependencies, so that we could exclude those tests from runs. This would allow us to run all tests in hotspot:tier4, which now excludes `applications/` specifically. 
I provisionally call this flag `external-dep`, but I am open for other suggestions. Note that some tests that pull `@Artifact`-s provide special paths that do limited testing anyway. However, there are tests which cannot run without external dependencies at all. These include at least `applications/jcstress` and `applications/scimark` tests. Ironically, I cannot run the jcstress test generator because the dependencies are lacking here. I regenerated those test using a self-built jcstress 0.16 bundle. Additional testing: - [x] `make test TEST=applications/` fails - [x] `JTREG_KEYWORDS=!external-dep make test TEST=applications/` passes, skipping most of the tests ------------- Commit messages: - Initial work Changes: https://git.openjdk.org/jdk/pull/17421/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17421&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323717 Stats: 62 lines in 32 files changed: 32 ins; 0 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/17421.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17421/head:pull/17421 PR: https://git.openjdk.org/jdk/pull/17421 From epeter at openjdk.org Mon Jan 15 11:57:34 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 15 Jan 2024 11:57:34 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v17] In-Reply-To: References: Message-ID: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. 
> > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Complications with ttyl** > There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress. 
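The buffering approach described under "Complications with ttyl" can be sketched in plain C++ as follows. This is a simplified model, not the PR's code: `std::ostringstream` stands in for HotSpot's `stringStream`, `std::mutex` for `extra_data_lock`, and the function names `format_entries` and `print_all_at_once` are hypothetical. The point is only the ordering: format into a private buffer while holding the data lock, then emit the finished text with a single write once that lock has been released.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Step 1: while holding the data lock, only format into a private buffer.
// No output stream (and hence no output lock) is touched in this scope.
inline std::string format_entries(std::mutex& data_lock,
                                  const std::vector<int>& entries) {
  std::ostringstream buffer;
  std::lock_guard<std::mutex> guard(data_lock);
  for (std::size_t i = 0; i < entries.size(); i++) {
    buffer << "entry[" << i << "] = " << entries[i] << '\n';
  }
  return buffer.str();
}

// Step 2: with the data lock released, hand the whole text to the output
// stream in one write call instead of line by line.
inline void print_all_at_once(std::ostream& out, const std::string& text) {
  out.write(text.data(), static_cast<std::streamsize>(text.size()));
  assert(out.good() && "single write to the output stream failed");
}
```

Whether a single write is really uninterrupted depends on the output stream implementation; the sketch only shows the separation of the two steps.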
Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: - Update src/hotspot/share/runtime/deoptimization.cpp rm empty line - change patch to deoptimization.cpp case brought up by Roland ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/e1e91741..671ead28 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=15-16 Stats: 29 lines in 1 file changed: 15 ins; 13 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From epeter at openjdk.org Mon Jan 15 11:57:36 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 15 Jan 2024 11:57:36 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v17] In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 11:55:05 GMT, Emanuel Peter wrote: >> As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. >> >> I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. >> >> I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. >> >> I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. 
Otherwise I could not take the lock from a VM thread. >> >> **Complications with ttyl** >> There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. >> >> If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. >> >> **Testing** >> Testing: tier1-3 and stress. > > Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: > > - Update src/hotspot/share/runtime/deoptimization.cpp > > rm empty line > - change patch to deoptimization.cpp case brought up by Roland src/hotspot/share/runtime/deoptimization.cpp line 2337: > 2335: NoSafepointVerifier no_safepoint; > 2336: ProfileData* pdata = nullptr; > 2337: Suggestion: ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1452284238 From epeter at openjdk.org Mon Jan 15 11:57:38 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 15 Jan 2024 11:57:38 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v16] In-Reply-To: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> References: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> Message-ID: On Thu, 11 Jan 2024 13:33:29 GMT, Roland Westrelin wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> 
>> fixed typo > > src/hotspot/share/runtime/deoptimization.cpp line 2406: > >> 2404: reprofile = true; >> 2405: } >> 2406: > > Why is it safe to move this here? @rwestrel I now took a different approach: using a conditional lock, rather than moving the code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1452282252 From tschatzl at openjdk.org Mon Jan 15 12:23:20 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 15 Jan 2024 12:23:20 GMT Subject: RFR: 8323716: Only print ZGC Phase Switch events in hs_err files when running with ZGC In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 10:16:12 GMT, Stefan Karlsson wrote: > Don't print the ZGC Phase Switch hs_err section when the JVM is run with other GCs than ZGC. > > I've tested this manually with `-XX:ErrorHandlerTest=3 -version` and verified that the section is logged when ZGC is used, and that it is not logged when G1 is used. lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17420#pullrequestreview-1821580210 From shade at openjdk.org Mon Jan 15 12:31:21 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Jan 2024 12:31:21 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates [v3] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 09:17:35 GMT, Aleksey Shipilev wrote: >> We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. 
>> >> There are a few interesting conversions along the way: >> 1. `intptr_t` -> `uint32_t` (this method) >> 2. `uint32_t` -> `int32_t` (argument conversion for `movl`) >> 3. `int32_t` -> `uint32_t` (in `emit_int32`) >> >> I believe these are safe after `is_uimm32` check, but please check (sic) me on this. >> >> Note that x86_64 matcher already does similar thing for immediates: >> >> >> // Long Immediate 32-bit unsigned >> operand immUL32() >> %{ >> predicate(n->get_long() == (unsigned int) (n->get_long())); >> match(ConL); >> ... >> %} >> >> instruct loadConUL32(rRegL dst, immUL32 src) >> %{ >> ... >> format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} >> ins_encode %{ >> __ movl($dst$$Register, $src$$constant); >> %} >> %} >> >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> >> Code sizes for `Hello World`, `-Xcomp`: >> >> >> # Before >> tier1 nmethod code size : 426208 bytes >> tier2 nmethod code size : 462880 bytes >> tier3 nmethod code size : 889992 bytes >> tier4 nmethod code size : 1244448 bytes >> >> # After >> tier1 nmethod code size : 425768 bytes (-0.1%) >> tier2 nmethod code size : 462400 bytes (-0.1%) >> tier3 nmethod code size : 882072 bytes (-0.8%) >> tier4 nmethod code size : 1236448 bytes (-0.6%) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Revert "Just do checked_cast" > > This reverts commit 3f94218b46b6b0492ffcc24404b7bb5546b3318a. I think this implicitly relies on 2-complement representation for signed types, so that signed->unsigned conversion is sane. But that is already what matcher seems to be doing. I modeled these casts in godbolt, and it made sense, and UBSan did not complain. Let's see if this rubs anyone else the wrong way. (Trying to snipe @theRealAph or @kimbarrett here.) 
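For readers who want to replay the three conversions outside of HotSpot, here is a minimal standalone model. It is a sketch under assumptions, not the patch itself: `fits_uimm32`, `fits_simm32`, `model_movl_imm32` and `model_sign_extended` are hypothetical names, `fits_uimm32` merely plays the role of the `is_uimm32` check, and `memcpy` is used for the unsigned-to-signed step so that the model does not rely on the pre-C++20 implementation-defined result of that cast.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// True when the value fits in 32 bits unsigned (the role of is_uimm32).
inline bool fits_uimm32(int64_t v) {
  return v >= 0 && v <= static_cast<int64_t>(UINT32_MAX);
}

// True when the value fits in 32 bits signed (the sign-extending form).
inline bool fits_simm32(int64_t v) {
  return v >= INT32_MIN && v <= INT32_MAX;
}

// Models the 32-bit move: the three conversions from the description,
// followed by the zero-extension to 64 bits that x86_64 performs on
// 32-bit register writes.
inline int64_t model_movl_imm32(int64_t imm) {
  assert(fits_uimm32(imm));
  uint32_t u = static_cast<uint32_t>(imm);      // 1. intptr_t -> uint32_t
  int32_t s;
  std::memcpy(&s, &u, sizeof(s));               // 2. uint32_t -> int32_t (same bits)
  uint32_t emitted = static_cast<uint32_t>(s);  // 3. int32_t -> uint32_t
  return static_cast<int64_t>(static_cast<uint64_t>(emitted));  // zero-extend
}

// For contrast: what a sign-extending 32-bit immediate would produce
// from the same low 32 bits.
inline int64_t model_sign_extended(int64_t imm) {
  uint32_t u = static_cast<uint32_t>(imm);
  int32_t s;
  std::memcpy(&s, &u, sizeof(s));
  return static_cast<int64_t>(s);               // sign-extend
}
```

In this model a value such as `0x80000000` survives the zero-extending path unchanged, while the sign-extending path turns it into a negative number, which is why such immediates pass the unsigned check but not the signed one.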
------------- PR Comment: https://git.openjdk.org/jdk/pull/17343#issuecomment-1892083078 From epeter at openjdk.org Mon Jan 15 12:35:59 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 15 Jan 2024 12:35:59 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v18] In-Reply-To: References: Message-ID: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Complications with ttyl** > There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. 
To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: NoSafepointMutexLocker ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/671ead28..78a2cdb6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=17 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=16-17 Stats: 45 lines in 10 files changed: 15 ins; 15 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From epeter at openjdk.org Mon Jan 15 12:35:59 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 15 Jan 2024 12:35:59 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v16] In-Reply-To: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> References: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> Message-ID: On Thu, 11 Jan 2024 13:41:11 GMT, Roland Westrelin wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed typo > > src/hotspot/share/ci/ciMethodData.cpp line 135: > >> 133: >> 134: // Lock to read ProfileData, and ensure lock is not unintentionally broken by a safepoint >> 135: MutexLocker ml(mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); > > Is there anyway to have MutexLocker take care of verifying that the there's no safepoint? It would be nice to replace: > > > MutexLocker ml(); > NoSafepointVerifier no_safepoint; > > > by: > > > MutexLocker ml(); > > > only. 
@rwestrel I introduced a `NoSafepointMutexLocker`, which is composed of both a `ConditionalMutexLocker` and a `NoSafepointVerifier`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1452318640 From epeter at openjdk.org Mon Jan 15 12:35:59 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 15 Jan 2024 12:35:59 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v16] In-Reply-To: References: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> Message-ID: On Mon, 15 Jan 2024 12:29:20 GMT, Emanuel Peter wrote: >> src/hotspot/share/ci/ciMethodData.cpp line 135: >> >>> 133: >>> 134: // Lock to read ProfileData, and ensure lock is not unintentionally broken by a safepoint >>> 135: MutexLocker ml(mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); >> >> Is there anyway to have MutexLocker take care of verifying that the there's no safepoint? It would be nice to replace: >> >> >> MutexLocker ml(); >> NoSafepointVerifier no_safepoint; >> >> >> by: >> >> >> MutexLocker ml(); >> >> >> only. > > @rwestrel I introduced a `NoSafepointMutexLocker`, which is composed of both a `ConditionalMutexLocker` and a `NoSafepointVerifier`. But in this specific instance I keep them separate, because I need to pass the `NoSafepointVerifier` down, so that I can pass it to the `PauseNoSafepointVerifier`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1452320637 From shade at openjdk.org Mon Jan 15 12:39:16 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Jan 2024 12:39:16 GMT Subject: RFR: 8323716: Only print ZGC Phase Switch events in hs_err files when running with ZGC In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 10:16:12 GMT, Stefan Karlsson wrote: > Don't print the ZGC Phase Switch hs_err section when the JVM is run with other GCs than ZGC. 
> > I've tested this manually with `-XX:ErrorHandlerTest=3 -version` and verified that the section is logged when ZGC is used, and that it is not logged when G1 is used. Looks OK. I had to go and look that `_zgc_phase_switch` is initialized to `nullptr` otherwise. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17420#pullrequestreview-1821605309 From aturbanov at openjdk.org Mon Jan 15 13:33:26 2024 From: aturbanov at openjdk.org (Andrey Turbanov) Date: Mon, 15 Jan 2024 13:33:26 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: <3m2_CQE-NHOCN20Z4LbosqwihcUCVopTgycXADInLEI=.25f797e8-e620-4f10-9da0-245a890c41de@github.com> On Thu, 11 Jan 2024 23:06:32 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark Score Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x >> StringIndexOf.success 9.186 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x >> 
StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Merge branch 'openjdk:master' into indexof > - Merge branch 'openjdk:master' into indexof > - Addressing review comments. > - Fix for JDK-8321599 > - Support UU IndexOf > - Only use optimization when EnableX86ECoreOpts is true > - Fix whitespace > - Merge branch 'openjdk:master' into indexof > - Comments; added exhaustive-ish test > - Subtracting 0x10 twice. > - ... and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2 test/jdk/java/lang/StringBuffer/IndexOf.java line 220: > 218: > 219: for (int x = 0; x < 1000000; x++) { > 220: if(make_new) { Suggestion: if (make_new) { test/jdk/java/lang/StringBuffer/IndexOf.java line 262: > 260: } > 261: > 262: if(make_new) Suggestion: if (make_new) test/jdk/java/lang/StringBuffer/IndexOf.java line 295: > 293: } > 294: > 295: if(make_new) testIndex = getRandomIndex(-100, 100); Suggestion: if (make_new) testIndex = getRandomIndex(-100, 100); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1452380458 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1452380633 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1452380799 From aboldtch at openjdk.org Mon Jan 15 15:45:22 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 15 Jan 2024 15:45:22 GMT Subject: [jdk22] RFR: 8322957: Generational ZGC: Relocation selection must join the STS In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 09:57:54 GMT, Stefan Karlsson wrote: > Hi all, > > This pull request contains a backport of commit [ba23025c](https://github.com/openjdk/jdk/commit/ba23025cd8a9c1af37afea6444ce5ea2ff41e5af) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. 
> > The commit being backported was authored by Stefan Karlsson on 12 Jan 2024 and was reviewed by Erik Österlund and Axel Boldt-Christmas. > > Thanks! lgtm. ------------- Marked as reviewed by aboldtch (Reviewer). PR Review: https://git.openjdk.org/jdk22/pull/74#pullrequestreview-1821928751 From shade at openjdk.org Mon Jan 15 16:13:34 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Jan 2024 16:13:34 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots Message-ID: Since recent work to improve `tier4` performance, we actually test `tier{1,2,3,4}` often, which includes all the tests in the current tree. It would be more convenient to just have the `all` test group/alias, so that we can do `make test TEST=all`. This also gives a parallelism / run time benefit, as we do not wait for tests in each tier to complete before moving to the next tier. Sample run on out-of-the-box Linux x86_64 fastdebug is below. For some environments one also needs to supply a few keywords like `!printer` to skip tests that cannot complete without failure due to misconfiguration. I left the keywords as is to show how a failing run would look. There is also an existing shortcut in the build system that allows running this with `make test-all`. % make test TEST=all Test selection 'all', will run: * jtreg:test/hotspot/jtreg:all * jtreg:test/jdk:all * jtreg:test/langtools:all * jtreg:test/jaxp:all * jtreg:test/lib-test:all (...about 6 hours later...)
============================== Test summary ============================== TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg:all 6731 6702 29 0 << >> jtreg:test/jdk:all 9962 9951 11 0 << jtreg:test/langtools:all 4469 4469 0 0 jtreg:test/jaxp:all 513 513 0 0 jtreg:test/lib-test:all 32 32 0 0 ============================== TEST FAILURE ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/17422/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17422&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323515 Stats: 41 lines in 5 files changed: 34 ins; 5 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17422.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17422/head:pull/17422 PR: https://git.openjdk.org/jdk/pull/17422 From tschatzl at openjdk.org Mon Jan 15 16:27:23 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 15 Jan 2024 16:27:23 GMT Subject: RFR: 8322383: G1: Only preserve marks on objects that are actually moved [v3] In-Reply-To: References: Message-ID: On Fri, 12 Jan 2024 12:02:37 GMT, Roman Kennke wrote: >> The G1 full-GC preserves marks during marking, for all live objects in compaction region. However, not all live objects do actually move. In particular, the start of a compaction chain may have a sediment of all-live objects which would not move, and thus don't need to have their marks preserved. >> The problem can easily be solved by preserving marks during forwarding. That also seems a more natural place to do that. >> >> Testing: >> - [x] hotspot_gc >> - [x] tier1 >> - [ ] tier2 > > Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: > > - Cleanup > - Move forwarded-predicate into G1 Marked as reviewed by tschatzl (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/17159#pullrequestreview-1821993066 From shade at openjdk.org Mon Jan 15 16:32:21 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Jan 2024 16:32:21 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v2] In-Reply-To: References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> Message-ID: On Mon, 8 Jan 2024 19:57:41 GMT, Dean Long wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Inline new_ic_stub > > Marked as reviewed by dlong (Reviewer). Thanks @dean-long! Any other takers? Notably, I would like opinions of platform maintainers: @theRealAph, @TheRealMDoerr, @RealFYang, @bulasevich. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17277#issuecomment-1892476720 From mdoerr at openjdk.org Mon Jan 15 16:41:21 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 15 Jan 2024 16:41:21 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v2] In-Reply-To: <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> Message-ID: On Mon, 8 Jan 2024 09:47:36 GMT, Aleksey Shipilev wrote: >> This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. 
>> >> Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. >> >> Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: >> - ARM32: 32 -> 64 bytes :( >> - AArch64: 128 -> 64 bytes :) >> - x86_64: 64 -> 64 bytes :| >> - x86_32: 32 -> 64 bytes :( >> - PPC64: 512 -> 128 bytes :)) >> - RISC-V: 128 -> 64 bytes :) >> - S390X: 128 -> 256 bytes :( >> - Zero: >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Inline new_ic_stub It's probably good, but are you aware of the plans to remove the ICStubs completely? @fisk may have an opinion on it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17277#issuecomment-1892490410 From shade at openjdk.org Mon Jan 15 16:46:20 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Jan 2024 16:46:20 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v2] In-Reply-To: References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> Message-ID: <-9ROjQcXvKT0D1nh73qJDBDfFhiuBk7OVbqQGTpb8EE=.b8190e90-a2e0-41f0-874b-c58c05bca55e@github.com> On Mon, 15 Jan 2024 16:38:35 GMT, Martin Doerr wrote: > It's probably good, but are you aware of the plans to remove the ICStubs completely? @fisk may have an opinion on it. Yes, I did x86_32 version for Erik. I think we want something we can support in JDK update releases. I think the ICStub removal would not be backportable at all. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17277#issuecomment-1892497527 From mdoerr at openjdk.org Mon Jan 15 16:49:21 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 15 Jan 2024 16:49:21 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v2] In-Reply-To: <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> Message-ID: On Mon, 8 Jan 2024 09:47:36 GMT, Aleksey Shipilev wrote: >> This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. >> >> Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. 
>> >> Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: >> - ARM32: 32 -> 64 bytes :( >> - AArch64: 128 -> 64 bytes :) >> - x86_64: 64 -> 64 bytes :| >> - x86_32: 32 -> 64 bytes :( >> - PPC64: 512 -> 128 bytes :)) >> - RISC-V: 128 -> 64 bytes :) >> - S390X: 128 -> 256 bytes :( >> - Zero: >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Inline new_ic_stub Ok. I think it's good for backports. I'll put it in our test queue. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17277#issuecomment-1892502261 From mli at openjdk.org Mon Jan 15 18:14:45 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Jan 2024 18:14:45 GMT Subject: RFR: 8318227: RISC-V: C2 ConvHF2F [v4] In-Reply-To: References: Message-ID: <_RKrLr7tgyN-c2vi91Fzri1bzEaSb1yCdZn6CEYdIf4=.dba311f6-80ac-4dae-ab60-faa9d9bb3d09@github.com> > Hi, > Can you review the patch to add ConvHF2F intrinsic to JDK for riscv? > Thanks! > > (By latest kernel patch, `#define RISCV_HWPROBE_EXT_ZFH (1 << 27)` > https://lore.kernel.org/lkml/20231114141256.126749-11-cleger@rivosinc.com/) > > ## Test > ### Functionality > #### hotspot tests > test/hotspot/jtreg/compiler/intrinsics/ > test/hotspot/jtreg/compiler/c2/irTests > > #### jdk tests > test/jdk/java/lang/Float/Binary16Conversion*.java > > ### Performance > tested on licheepi. > > #### with UseZfh enabled & stub out-of-band > > Benchmark (size) Mode Cnt Score Error Units > Fp16ConversionBenchmark.float16ToFloat 2048 avgt 10 3493.376 ± 18.631 ns/op > Fp16ConversionBenchmark.float16ToFloatMemory 2048 avgt 10 19.819 ± 0.193 ns/op > > > #### with UseZfh enabled only > (i.e.
enable the intrinsic) > > Benchmark (size) Mode Cnt Score Error Units > Fp16ConversionBenchmark.float16ToFloat 2048 avgt 10 4659.796 ± 13.262 ns/op > Fp16ConversionBenchmark.float16ToFloatMemory 2048 avgt 10 22.957 ± 0.098 ns/op > > > #### with UseZfh disabled > (i.e. disable the intrinsic) > > Benchmark (size) Mode Cnt Score Error Units > Fp16ConversionBenchmark.float16ToFloat 2048 avgt 10 22930.591 ± 72.595 ns/op > Fp16ConversionBenchmark.float16ToFloatMemory 2048 avgt 10 25.970 ± 0.063 ns/op Hamlin Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - merge with master - Fix pipeline cost in ad; Add comments - optimize perf with stub out-of-line - update RISCV_HWPROBE_EXT_ZFH value - Initial commit ------------- Changes: https://git.openjdk.org/jdk/pull/16802/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16802&range=03 Stats: 88 lines in 13 files changed: 88 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16802.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16802/head:pull/16802 PR: https://git.openjdk.org/jdk/pull/16802 From mli at openjdk.org Mon Jan 15 18:25:30 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Jan 2024 18:25:30 GMT Subject: RFR: 8318227: RISC-V: C2 ConvHF2F [v5] In-Reply-To: References: Message-ID: > Hi, > Can you review the patch to add ConvHF2F intrinsic to JDK for riscv? > Thanks! > > (By latest kernel patch, `#define RISCV_HWPROBE_EXT_ZFH (1 << 27)` > https://lore.kernel.org/lkml/20231114141256.126749-11-cleger@rivosinc.com/) > > ## Test > ### Functionality > #### hotspot tests > test/hotspot/jtreg/compiler/intrinsics/ > test/hotspot/jtreg/compiler/c2/irTests > > #### jdk tests > test/jdk/java/lang/Float/Binary16Conversion*.java > > ### Performance > tested on licheepi.
> > #### with UseZfh enabled & stub out-of-band > > Benchmark (size) Mode Cnt Score Error Units > Fp16ConversionBenchmark.float16ToFloat 2048 avgt 10 3493.376 ± 18.631 ns/op > Fp16ConversionBenchmark.float16ToFloatMemory 2048 avgt 10 19.819 ± 0.193 ns/op > > > #### with UseZfh enabled only > (i.e. enable the intrinsic) > > Benchmark (size) Mode Cnt Score Error Units > Fp16ConversionBenchmark.float16ToFloat 2048 avgt 10 4659.796 ± 13.262 ns/op > Fp16ConversionBenchmark.float16ToFloatMemory 2048 avgt 10 22.957 ± 0.098 ns/op > > > #### with UseZfh disabled > (i.e. disable the intrinsic) > > Benchmark (size) Mode Cnt Score Error Units > Fp16ConversionBenchmark.float16ToFloat 2048 avgt 10 22930.591 ± 72.595 ns/op > Fp16ConversionBenchmark.float16ToFloatMemory 2048 avgt 10 25.970 ± 0.063 ns/op Hamlin Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: - merge master - merge with master - Fix pipeline cost in ad; Add comments - optimize perf with stub out-of-line - update RISCV_HWPROBE_EXT_ZFH value - Initial commit ------------- Changes: https://git.openjdk.org/jdk/pull/16802/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16802&range=04 Stats: 88 lines in 13 files changed: 88 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16802.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16802/head:pull/16802 PR: https://git.openjdk.org/jdk/pull/16802 From mli at openjdk.org Mon Jan 15 18:36:32 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Jan 2024 18:36:32 GMT Subject: RFR: 8318227: RISC-V: C2 ConvHF2F [v6] In-Reply-To: References: Message-ID: <64moZwzgji8A8C_sm3mmEVNx4fGQqGqy_9McsaHMvr4=.7b04bd80-7843-469e-bbee-5d51c0cfbb11@github.com> > Hi, > Can you review the patch to add ConvHF2F intrinsic to JDK for riscv? > Thanks!
> > (By latest kernel patch, `#define RISCV_HWPROBE_EXT_ZFH (1 << 27)` > https://lore.kernel.org/lkml/20231114141256.126749-11-cleger@rivosinc.com/) > > ## Test > ### Functionality > #### hotspot tests > test/hotspot/jtreg/compiler/intrinsics/ > test/hotspot/jtreg/compiler/c2/irTests > > #### jdk tests > test/jdk/java/lang/Float/Binary16Conversion*.java > > ### Performance > tested on licheepi. > > #### with UseZfh enabled & stub out-of-band > > Benchmark (size) Mode Cnt Score Error Units > Fp16ConversionBenchmark.float16ToFloat 2048 avgt 10 3493.376 ± 18.631 ns/op > Fp16ConversionBenchmark.float16ToFloatMemory 2048 avgt 10 19.819 ± 0.193 ns/op > > > #### with UseZfh enabled only > (i.e. enable the intrinsic) > > Benchmark (size) Mode Cnt Score Error Units > Fp16ConversionBenchmark.float16ToFloat 2048 avgt 10 4659.796 ± 13.262 ns/op > Fp16ConversionBenchmark.float16ToFloatMemory 2048 avgt 10 22.957 ± 0.098 ns/op > > > #### with UseZfh disabled > (i.e. disable the intrinsic) > > Benchmark (size) Mode Cnt Score Error Units > Fp16ConversionBenchmark.float16ToFloat 2048 avgt 10 22930.591 ± 72.595 ns/op > Fp16ConversionBenchmark.float16ToFloatMemory 2048 avgt 10 25.970 ±
0.063 ns/op Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: split hw probe code from the patch ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16802/files - new: https://git.openjdk.org/jdk/pull/16802/files/f2b4434a..f35f4e9b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16802&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16802&range=04-05 Stats: 4 lines in 1 file changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16802.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16802/head:pull/16802 PR: https://git.openjdk.org/jdk/pull/16802 From mli at openjdk.org Mon Jan 15 18:44:28 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Jan 2024 18:44:28 GMT Subject: RFR: 8318227: RISC-V: C2 ConvHF2F [v3] In-Reply-To: References: <13Ot4D45ppGcgnXjlGP1xrYEcZ8LejbI5cxjRruUD4c=.4cd4ca6f-8e4f-4679-9706-59a86d867b6f@github.com> Message-ID: On Sat, 30 Dec 2023 17:58:08 GMT, Vladimir Kempik wrote: > But you were testing it on licheepi, which has an old 5.10 kernel, so perhaps Zfh autodetection can be added later You're right. I split the Zfh probe code out of the patch and will add it later in [JDK-8323748](https://bugs.openjdk.org/browse/JDK-8323748) @VladimirKempik @merykitty @RealFYang Thanks for your reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16802#issuecomment-1892631958 PR Comment: https://git.openjdk.org/jdk/pull/16802#issuecomment-1892633510 From mli at openjdk.org Mon Jan 15 18:44:29 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Jan 2024 18:44:29 GMT Subject: Integrated: 8318227: RISC-V: C2 ConvHF2F In-Reply-To: References: Message-ID: On Thu, 23 Nov 2023 17:13:47 GMT, Hamlin Li wrote: > Hi, > Can you review the patch to add ConvHF2F intrinsic to JDK for riscv? > Thanks!
> > (By latest kernel patch, `#define RISCV_HWPROBE_EXT_ZFH (1 << 27)` > https://lore.kernel.org/lkml/20231114141256.126749-11-cleger@rivosinc.com/) > > ## Test > ### Functionality > #### hotspot tests > test/hotspot/jtreg/compiler/intrinsics/ > test/hotspot/jtreg/compiler/c2/irTests > > #### jdk tests > test/jdk/java/lang/Float/Binary16Conversion*.java > > ### Performance > tested on licheepi. > > #### with UseZfh enabled & stub out-of-band > > Benchmark (size) Mode Cnt Score Error Units > Fp16ConversionBenchmark.float16ToFloat 2048 avgt 10 3493.376 ± 18.631 ns/op > Fp16ConversionBenchmark.float16ToFloatMemory 2048 avgt 10 19.819 ± 0.193 ns/op > > > #### with UseZfh enabled only > (i.e. enable the intrinsic) > > Benchmark (size) Mode Cnt Score Error Units > Fp16ConversionBenchmark.float16ToFloat 2048 avgt 10 4659.796 ± 13.262 ns/op > Fp16ConversionBenchmark.float16ToFloatMemory 2048 avgt 10 22.957 ± 0.098 ns/op > > > #### with UseZfh disabled > (i.e. disable the intrinsic) > > Benchmark (size) Mode Cnt Score Error Units > Fp16ConversionBenchmark.float16ToFloat 2048 avgt 10 22930.591 ± 72.595 ns/op > Fp16ConversionBenchmark.float16ToFloatMemory 2048 avgt 10 25.970 ± 0.063 ns/op This pull request has now been integrated.
Changeset: b3634722 Author: Hamlin Li URL: https://git.openjdk.org/jdk/commit/b3634722655901b8d3e43dd1f8aa2b4487509a34 Stats: 84 lines in 12 files changed: 84 ins; 0 del; 0 mod 8318227: RISC-V: C2 ConvHF2F Reviewed-by: fyang ------------- PR: https://git.openjdk.org/jdk/pull/16802 From eosterlund at openjdk.org Mon Jan 15 20:04:22 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 15 Jan 2024 20:04:22 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v2] In-Reply-To: <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> Message-ID: <0jokUqH419N8j2StppufH363EwYPkje2eY65uhWs8Lw=.b36483d3-7384-46b3-b945-f1c8485fff07@github.com> On Mon, 8 Jan 2024 09:47:36 GMT, Aleksey Shipilev wrote: >> This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. >> >> Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. 
>> >> Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: >> - ARM32: 32 -> 64 bytes :( >> - AArch64: 128 -> 64 bytes :) >> - x86_64: 64 -> 64 bytes :| >> - x86_32: 32 -> 64 bytes :( >> - PPC64: 512 -> 128 bytes :)) >> - RISC-V: 128 -> 64 bytes :) >> - S390X: 128 -> 256 bytes :( >> - Zero: >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Inline new_ic_stub Seems like a reasonable band-aid to backport. Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17277#pullrequestreview-1822204984 From dholmes at openjdk.org Mon Jan 15 22:41:24 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 15 Jan 2024 22:41:24 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 11:05:09 GMT, Aleksey Shipilev wrote: > Since recent work to improve `tier4` performance, we actually test `tier{1,2,3,4}` often, which includes all the tests in current tree. It would be more convenient to just have the `all` test group/alias, so that we can do `make test TEST=all`. This also gives a parallelism / run time benefit, as we do not wait for tests in each tier to complete before moving to next tier. > > Sample run on out-of-the-box Linux x86_64 fastdebug is below. For some environments one also needs to supply a few keywords like `!printer` to skip tests that cannot complete without failure due to misconfiguration. I left the keywords as is to show how would a failing run look. There is also an existing shortcut in build system that allows to run this with `make test-all`. 
> > > % make test TEST=all > > Test selection 'all', will run: > * jtreg:test/hotspot/jtreg:all > * jtreg:test/jdk:all > * jtreg:test/langtools:all > * jtreg:test/jaxp:all > * jtreg:test/lib-test:all > > (...about 6 hours later...) > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR >>> jtreg:test/hotspot/jtreg:all 6731 6702 29 0 << >>> jtreg:test/jdk:all 9962 9951 11 0 << > jtreg:test/langtools:all 4469 4469 0 0 > jtreg:test/jaxp:all 513 513 0 0 > jtreg:test/lib-test:all 32 32 0 0 > ============================== > TEST FAILURE Okay - change is harmless with no ongoing maintenance cost. test/jdk/TEST.groups line 28: > 26: # > 27: > 28: all = \ Why no `jdk_all` definition in this case? ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17422#pullrequestreview-1822313872 PR Review Comment: https://git.openjdk.org/jdk/pull/17422#discussion_r1452781088 From duke at openjdk.org Tue Jan 16 02:47:44 2024 From: duke at openjdk.org (Yude Lin) Date: Tue, 16 Jan 2024 02:47:44 GMT Subject: RFR: 8323273: AArch64: Strengthen CompressedClassPointers initialization check for base Message-ID: Summary: Add a platform-dependent check for CompressedClassSpaceBaseAddress; Remove the "reserve anywhere" attempt after the initial mapping attempt failed---this is rarely used and will likely fail anyway, because the accepted mapping is very restricted on aarch64; Additional assertions after initialization. 
Passed hotspot/jtreg/:tier1 on fastdebug ------------- Commit messages: - 8323273: AArch64: Strengthen CompressedClassPointers initialization check for base Changes: https://git.openjdk.org/jdk/pull/17437/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17437&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323273 Stats: 44 lines in 6 files changed: 34 ins; 3 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/17437.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17437/head:pull/17437 PR: https://git.openjdk.org/jdk/pull/17437 From jbhateja at openjdk.org Tue Jan 16 06:21:25 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 16 Jan 2024 06:21:25 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 23:06:32 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. 
The benchmark numbers: >> >> >> Benchmark Score Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x >> StringIndexOf.success 9.186 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x >> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Merge branch 'openjdk:master' into indexof > - Merge branch 'openjdk:master' into indexof > - Addressing review comments. > - Fix for JDK-8321599 > - Support UU IndexOf > - Only use optimization when EnableX86ECoreOpts is true > - Fix whitespace > - Merge branch 'openjdk:master' into indexof > - Comments; added exhaustive-ish test > - Subtracting 0x10 twice. > - ... 
and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2 label add hotspot-compiler-dev ------------- PR Comment: https://git.openjdk.org/jdk/pull/16753#issuecomment-1893133426 From fyang at openjdk.org Tue Jan 16 07:14:19 2024 From: fyang at openjdk.org (Fei Yang) Date: Tue, 16 Jan 2024 07:14:19 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v4] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 09:14:05 GMT, Hamlin Li wrote: >> Hi, >> Can you review this patch to implement SHA-1 intrinsic for riscv? >> Thanks! >> >> >> ## Test >> >> ### Functionality >> >> tests under `test/hotspot/jtreg/compiler/intrinsics/sha` >> tests found via `find test/jdk -iname "*SHA1*.java"` >> >> ### Performance >> >> tested on `T-HEAD Light Lichee Pi 4A` >> >> JMH_PARAMS="-f 1 -wi 10 -i 20" // for every loop of jmh test >> >> benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`, more specifically `TESTS="MessageDigests.digest MessageDigests.getAndDigest MessageDigestBench.digest"` >> >> >> // After >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 20 1845.446 ± 27.052 ns/op >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 20 181455.350 ± 532.258 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 20 2447.674 ± 10.239 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 20 182896.083 ± 1242.774 ns/op >> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 20 11599227.792 ± 121442.390 ns/op >> // Before >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 20 2352.475 ± 11.198 ns/op >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 20 188495.684 ± 1467.942 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 20 2437.347 ±
6.398 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 20 196086.570 ± 1140.998 ns/op >> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 20 12362160.119 ± 38788.109 ns/op >> >> >> **getAndDigest when size == 64** >> The data is not stable for test getAndDigest when size == 64, which I think is introduced by j.s.MessageDigest.getInstance itself, which we don't touch in this patch. >> Check more details at [1](ht... > > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: > > - remove tp/gp > - refine code FYI: The performance numbers seem more stable on other platforms such as the Unmatched board (JMH AverageTime mode): Before: MessageDigests.digest SHA-1 64 DEFAULT avgt 15 3974.419 ± 28.954 ns/op MessageDigests.digest SHA-1 16384 DEFAULT avgt 15 411073.165 ± 3731.988 ns/op MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 15 7136.679 ± 480.850 ns/op MessageDigests.getAndDigest SHA-1 16384 DEFAULT avgt 15 429881.929 ± 1265.110 ns/op MessageDigests.digest SHA-1 64 DEFAULT avgt 15 3993.060 ± 6.265 ns/op MessageDigests.digest SHA-1 16384 DEFAULT avgt 15 410724.751 ± 2075.018 ns/op MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 15 7085.596 ± 496.358 ns/op MessageDigests.getAndDigest SHA-1 16384 DEFAULT avgt 15 430184.356 ± 1052.236 ns/op MessageDigests.digest SHA-1 64 DEFAULT avgt 15 4016.232 ± 48.074 ns/op MessageDigests.digest SHA-1 16384 DEFAULT avgt 15 417735.231 ± 7001.640 ns/op MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 15 7114.528 ± 504.775 ns/op MessageDigests.getAndDigest SHA-1 16384 DEFAULT avgt 15 438041.321 ± 20056.313 ns/op After: MessageDigests.digest SHA-1 64 DEFAULT avgt 15 3685.514 ± 5.401 ns/op MessageDigests.digest SHA-1 16384 DEFAULT avgt 15 364406.355 ± 217.797 ns/op MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 15 5427.864 ± 41.520 ns/op MessageDigests.getAndDigest SHA-1 16384 DEFAULT avgt 15 367995.806 ±
228.853 ns/op MessageDigests.digest SHA-1 64 DEFAULT avgt 15 3681.851 ± 6.591 ns/op MessageDigests.digest SHA-1 16384 DEFAULT avgt 15 364433.610 ± 226.146 ns/op MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 15 5483.575 ± 46.445 ns/op MessageDigests.getAndDigest SHA-1 16384 DEFAULT avgt 15 367713.143 ± 348.944 ns/op MessageDigests.digest SHA-1 64 DEFAULT avgt 15 3686.556 ± 6.273 ns/op MessageDigests.digest SHA-1 16384 DEFAULT avgt 15 364631.822 ± 1265.576 ns/op MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 15 5496.395 ± 66.473 ns/op MessageDigests.getAndDigest SHA-1 16384 DEFAULT avgt 15 367870.983 ± 296.836 ns/op ------------- PR Comment: https://git.openjdk.org/jdk/pull/17130#issuecomment-1893178691 From sroy at openjdk.org Tue Jan 16 08:33:39 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 16 Jan 2024 08:33:39 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v8] In-Reply-To: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: > The J2SE agent does not start and throws an error when it tries to find the shared library ibm_16_am. > After searching for ibm_16_am.so, the JVM agent throws an error as dll_load fails. It fails to identify the shared archive file ibm_16_am.a on AIX. > Hence we are providing a function which will additionally search for the .a file on AIX when the search for the .so file fails. Suchismith Roy has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: - Fix merge conflicts - Spaces fix - Restore lines - Remove trailing spaces. - Change return type - Change dll load function signature that does dlopen - Remove AIX macros - Add wrapper function to check extension before dlopen - merge pr/16920 - cosmetic changes - ...
and 14 more: https://git.openjdk.org/jdk/compare/36f4b34f...6a5ce4a2 ------------- Changes: https://git.openjdk.org/jdk/pull/16604/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=07 Stats: 28 lines in 2 files changed: 27 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16604/head:pull/16604 PR: https://git.openjdk.org/jdk/pull/16604 From sroy at openjdk.org Tue Jan 16 08:36:49 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 16 Jan 2024 08:36:49 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v9] In-Reply-To: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: > The J2SE agent does not start and throws an error when it tries to find the shared library ibm_16_am. > After searching for ibm_16_am.so, the JVM agent throws an error as dll_load fails. It fails to identify the shared archive file ibm_16_am.a on AIX. > Hence we are providing a function which will additionally search for the .a file on AIX when the search for the .so file fails.
Suchismith Roy has updated the pull request incrementally with three additional commits since the last revision: - Update porting_aix.cpp - Update porting_aix.cpp - Update os_aix.cpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16604/files - new: https://git.openjdk.org/jdk/pull/16604/files/6a5ce4a2..212f16be Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=07-08 Stats: 6 lines in 2 files changed: 0 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16604/head:pull/16604 PR: https://git.openjdk.org/jdk/pull/16604 From sroy at openjdk.org Tue Jan 16 08:46:27 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 16 Jan 2024 08:46:27 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v7] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: <34N1MQU3CaBHa22aH-xp8234QcVaRlFZDZ6oeKXZpqo=.9630fc7a-623b-472b-980b-2cf3d0848fc0@github.com> On Wed, 20 Dec 2023 13:29:05 GMT, Joachim Kern wrote: >> Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: >> >> Spaces fix > > src/hotspot/os/aix/os_aix.cpp line 1168: > >> 1166: int extension_length = 3; >> 1167: char* file_path = NEW_C_HEAP_ARRAY(char, buffer_length + extension_length + 1, mtInternal); >> 1168: strncpy(file_path,filename, buffer_length + 1); > > Why not use > `char* file_path = os::strdup (filename);` > which would replace lines 1167+1168 > and use the corresponding > `os::free (file_path);` > at the end? Ok. Is there any performance advantage to using that?
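For readers following the exchange above, here is a minimal standalone sketch of the path manipulation under discussion: take a library path ending in `.so` and derive the `.a` variant to try as a fallback. It uses portable C++ with `std::string` instead of HotSpot's `NEW_C_HEAP_ARRAY` or `os::strdup`, and the function name `with_archive_extension` is hypothetical; this is an illustration under those assumptions, not the code from the pull request.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the PR): if `filename` ends in ".so",
// return the same path with ".a" in its place; otherwise return the
// path unchanged. std::string owns the buffer, so no manual sizing,
// copying, or freeing is needed.
std::string with_archive_extension(const char* filename) {
    assert(filename != nullptr);
    const std::string so_ext = ".so";
    const std::string a_ext = ".a";
    std::string path(filename);
    if (path.size() >= so_ext.size() &&
        path.compare(path.size() - so_ext.size(), so_ext.size(), so_ext) == 0) {
        // Drop the trailing ".so" and append ".a".
        return path.substr(0, path.size() - so_ext.size()) + a_ext;
    }
    return path;
}
```

Whichever allocation style is used, the point of the review comment still applies: a duplicate-then-edit approach avoids computing buffer lengths by hand, at the cost of having to release the copy once the load attempt is done.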
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16604#discussion_r1453094259 From shade at openjdk.org Tue Jan 16 08:54:46 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 16 Jan 2024 08:54:46 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v2] In-Reply-To: References: Message-ID: > Since recent work to improve `tier4` performance, we actually test `tier{1,2,3,4}` often, which includes all the tests in current tree. It would be more convenient to just have the `all` test group/alias, so that we can do `make test TEST=all`. This also gives a parallelism / run time benefit, as we do not wait for tests in each tier to complete before moving to next tier. > > Sample run on out-of-the-box Linux x86_64 fastdebug is below. For some environments one also needs to supply a few keywords like `!printer` to skip tests that cannot complete without failure due to misconfiguration. I left the keywords as is to show how would a failing run look. There is also an existing shortcut in build system that allows to run this with `make test-all`. > > > % make test TEST=all > > Test selection 'all', will run: > * jtreg:test/hotspot/jtreg:all > * jtreg:test/jdk:all > * jtreg:test/langtools:all > * jtreg:test/jaxp:all > * jtreg:test/lib-test:all > > (...about 6 hours later...) 
> > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR >>> jtreg:test/hotspot/jtreg:all 6731 6702 29 0 << >>> jtreg:test/jdk:all 9962 9951 11 0 << > jtreg:test/langtools:all 4469 4469 0 0 > jtreg:test/jaxp:all 513 513 0 0 > jtreg:test/lib-test:all 32 32 0 0 > ============================== > TEST FAILURE Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: jdk_all and lib_test_all groups ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17422/files - new: https://git.openjdk.org/jdk/pull/17422/files/7f6797b6..78f5f9bd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17422&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17422&range=00-01 Stats: 8 lines in 2 files changed: 8 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17422.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17422/head:pull/17422 PR: https://git.openjdk.org/jdk/pull/17422 From shade at openjdk.org Tue Jan 16 08:54:49 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 16 Jan 2024 08:54:49 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v2] In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 22:37:36 GMT, David Holmes wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> jdk_all and lib_test_all groups > > test/jdk/TEST.groups line 28: > >> 26: # >> 27: >> 28: all = \ > > Why no `jdk_all` definition in this case? Tried not to introduce new `*_all` groups here. `jdk_all` would be the same as `jdk:all`, TBH. But we still can do it for symmetry reasons, see new commit. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17422#discussion_r1453098855 From alanb at openjdk.org Tue Jan 16 08:54:49 2024 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 16 Jan 2024 08:54:49 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v2] In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 08:47:38 GMT, Aleksey Shipilev wrote: >> test/jdk/TEST.groups line 28: >> >>> 26: # >>> 27: >>> 28: all = \ >> >> Why no `jdk_all` definition in this case? > > Tried not to introduce new `*_all` groups here. `jdk_all` would be the same as `jdk:all`, TBH. But we still can do it for symmetry reasons, see new commit. "all" looks okay but the comment "Catch-all" suggests something else, shouldn't be "All tests"? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17422#discussion_r1453103766 From shade at openjdk.org Tue Jan 16 09:01:35 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 16 Jan 2024 09:01:35 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v3] In-Reply-To: References: Message-ID: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> > Since recent work to improve `tier4` performance, we actually test `tier{1,2,3,4}` often, which includes all the tests in current tree. It would be more convenient to just have the `all` test group/alias, so that we can do `make test TEST=all`. This also gives a parallelism / run time benefit, as we do not wait for tests in each tier to complete before moving to next tier. > > Sample run on out-of-the-box Linux x86_64 fastdebug is below. For some environments one also needs to supply a few keywords like `!printer` to skip tests that cannot complete without failure due to misconfiguration. I left the keywords as is to show how would a failing run look. There is also an existing shortcut in build system that allows to run this with `make test-all`. 
> > > % make test TEST=all > > Test selection 'all', will run: > * jtreg:test/hotspot/jtreg:all > * jtreg:test/jdk:all > * jtreg:test/langtools:all > * jtreg:test/jaxp:all > * jtreg:test/lib-test:all > > (...about 6 hours later...) > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR >>> jtreg:test/hotspot/jtreg:all 6731 6702 29 0 << >>> jtreg:test/jdk:all 9962 9951 11 0 << > jtreg:test/langtools:all 4469 4469 0 0 > jtreg:test/jaxp:all 513 513 0 0 > jtreg:test/lib-test:all 32 32 0 0 > ============================== > TEST FAILURE Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Catch-all -> All tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17422/files - new: https://git.openjdk.org/jdk/pull/17422/files/78f5f9bd..def2f39b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17422&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17422&range=01-02 Stats: 5 lines in 5 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/17422.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17422/head:pull/17422 PR: https://git.openjdk.org/jdk/pull/17422 From shade at openjdk.org Tue Jan 16 09:01:36 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 16 Jan 2024 09:01:36 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v3] In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 08:52:03 GMT, Alan Bateman wrote: >> Tried not to introduce new `*_all` groups here. `jdk_all` would be the same as `jdk:all`, TBH. But we still can do it for symmetry reasons, see new commit. > > "all" looks okay but the comment "Catch-all" suggests something else, shouldn't be "All tests"? Yeah, we can do "All tests" instead. See new commit. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17422#discussion_r1453113607 From mdoerr at openjdk.org Tue Jan 16 09:26:23 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 16 Jan 2024 09:26:23 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v2] In-Reply-To: <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> Message-ID: On Mon, 8 Jan 2024 09:47:36 GMT, Aleksey Shipilev wrote: >> This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. >> >> Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. 
>> >> Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: >> - ARM32: 32 -> 64 bytes :( >> - AArch64: 128 -> 64 bytes :) >> - x86_64: 64 -> 64 bytes :| >> - x86_32: 32 -> 64 bytes :( >> - PPC64: 512 -> 128 bytes :)) >> - RISC-V: 128 -> 64 bytes :) >> - S390X: 128 -> 256 bytes :( >> - Zero: >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Inline new_ic_stub Test results are good. Thumbs up from my side! ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17277#pullrequestreview-1822876189 From alanb at openjdk.org Tue Jan 16 09:39:19 2024 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 16 Jan 2024 09:39:19 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v3] In-Reply-To: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> References: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> Message-ID: On Tue, 16 Jan 2024 09:01:35 GMT, Aleksey Shipilev wrote: >> Since recent work to improve `tier4` performance, we actually test `tier{1,2,3,4}` often, which includes all the tests in current tree. It would be more convenient to just have the `all` test group/alias, so that we can do `make test TEST=all`. This also gives a parallelism / run time benefit, as we do not wait for tests in each tier to complete before moving to next tier. >> >> Sample run on out-of-the-box Linux x86_64 fastdebug is below. For some environments one also needs to supply a few keywords like `!printer` to skip tests that cannot complete without failure due to misconfiguration. 
I left the keywords as is to show how would a failing run look. There is also an existing shortcut in build system that allows to run this with `make test-all`. >> >> >> % make test TEST=all >> >> Test selection 'all', will run: >> * jtreg:test/hotspot/jtreg:all >> * jtreg:test/jdk:all >> * jtreg:test/langtools:all >> * jtreg:test/jaxp:all >> * jtreg:test/lib-test:all >> >> (...about 6 hours later...) >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >>>> jtreg:test/hotspot/jtreg:all 6731 6702 29 0 << >>>> jtreg:test/jdk:all 9962 9951 11 0 << >> jtreg:test/langtools:all 4469 4469 0 0 >> jtreg:test/jaxp:all 513 513 0 0 >> jtreg:test/lib-test:all 32 32 0 0 >> ============================== >> TEST FAILURE > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Catch-all -> All tests Marked as reviewed by alanb (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17422#pullrequestreview-1822902993 From aph at openjdk.org Tue Jan 16 09:47:22 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 16 Jan 2024 09:47:22 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v2] In-Reply-To: <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> Message-ID: <-ZJVFNiG3peSqCUN3dfzBgywiQ6TYtn0IycqP41nOZU=.a5f8bd14-b658-451e-998b-7c592537c589@github.com> On Mon, 8 Jan 2024 09:47:36 GMT, Aleksey Shipilev wrote: >> This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. 
The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. >> >> Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. >> >> Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: >> - ARM32: 32 -> 64 bytes :( >> - AArch64: 128 -> 64 bytes :) >> - x86_64: 64 -> 64 bytes :| >> - x86_32: 32 -> 64 bytes :( >> - PPC64: 512 -> 128 bytes :)) >> - RISC-V: 128 -> 64 bytes :) >> - S390X: 128 -> 256 bytes :( >> - Zero: >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Inline new_ic_stub src/hotspot/share/code/icBuffer.hpp line 69: > 67: // cache coherency on some architectures to order the updates to ICStub and setting > 68: // the destination to the ICStub. Note that cache lines size might be larger than > 69: // CodeEntryAlignment that is a normal alignment for CodeBlobs. This is a paragraph that raises more questions than it answers. The relationship between ordering and instruction cache line coherency is pretty tenuous even on x86 these days, isn't it? I'd not say "For extra correctness/safety," but "To be cautious,". 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17277#discussion_r1453172823 From shade at openjdk.org Tue Jan 16 09:56:46 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 16 Jan 2024 09:56:46 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v3] In-Reply-To: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> Message-ID: > This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. > > Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. > > Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: > - ARM32: 32 -> 64 bytes :( > - AArch64: 128 -> 64 bytes :) > - x86_64: 64 -> 64 bytes :| > - x86_32: 32 -> 64 bytes :( > - PPC64: 512 -> 128 bytes :)) > - RISC-V: 128 -> 64 bytes :) > - S390X: 128 -> 256 bytes :( > - Zero: > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` > - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last revision: - Cautious - Merge branch 'master' into JDK-8321137-reconsider-icstub-align - Inline new_ic_stub - Work ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17277/files - new: https://git.openjdk.org/jdk/pull/17277/files/b0875b50..d799cbb0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17277&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17277&range=01-02 Stats: 32555 lines in 760 files changed: 19477 ins; 9568 del; 3510 mod Patch: https://git.openjdk.org/jdk/pull/17277.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17277/head:pull/17277 PR: https://git.openjdk.org/jdk/pull/17277 From shade at openjdk.org Tue Jan 16 09:56:49 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 16 Jan 2024 09:56:49 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v2] In-Reply-To: <-ZJVFNiG3peSqCUN3dfzBgywiQ6TYtn0IycqP41nOZU=.a5f8bd14-b658-451e-998b-7c592537c589@github.com> References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> <4gnEP4XoKdgxnTrNXnfi9v9lVxCyp6GWy8-vf3-j29A=.2cf25147-1d36-443c-b70a-3a05fe6ddcca@github.com> <-ZJVFNiG3peSqCUN3dfzBgywiQ6TYtn0IycqP41nOZU=.a5f8bd14-b658-451e-998b-7c592537c589@github.com> Message-ID: <7jdYLOdsXtNouaFbc-x00cJNgkwutpJo5aY5fMj6C7U=.50e0fd08-1c1d-401e-8d45-a64b8cc769e0@github.com> On Tue, 16 Jan 2024 09:44:09 GMT, Andrew Haley wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Inline new_ic_stub > > src/hotspot/share/code/icBuffer.hpp line 69: > >> 67: // cache coherency on some architectures to order the updates to ICStub and setting >> 68: // the destination to the ICStub. Note that cache lines size might be larger than >> 69: // CodeEntryAlignment that is a normal alignment for CodeBlobs. > > This is a paragraph that raises more questions than it answers. 
The relationship between ordering and instruction cache line coherency is pretty tenuous even on x86 these days, isn't it? I'd not say "For extra correctness/safety," but "To be cautious,". Sure. How does the new comment look? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17277#discussion_r1453184007 From fbredberg at openjdk.org Tue Jan 16 10:03:29 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Tue, 16 Jan 2024 10:03:29 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction Message-ID: The Java option OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none". However, some CPUs change the default SpinPause instruction to something better if the user hasn't used the OnSpinWaitInst option. For instance if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs. This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb". Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64.
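For readers who want to see where a spin-wait hint shows up at the Java level, here is a minimal sketch of a busy-wait loop using `Thread.onSpinWait()`. This is only an illustration: the class and method names (`SpinWaitSketch`, `spinUntilSet`) are hypothetical, and the thread above does not say how `Thread.onSpinWait()` relates to the VM-internal `SpinPause()` or to the OnSpinWaitInst option on any particular build, so treat any such connection as an assumption to verify.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration class; not part of the JDK or of the PR under review.
class SpinWaitSketch {

    // Busy-waits until the flag becomes true and returns how many times it spun.
    // Thread.onSpinWait() is a hint to the runtime that the caller is spinning;
    // which CPU instruction (if any) it becomes is up to the JVM and platform.
    static long spinUntilSet(AtomicBoolean flag) {
        long spins = 0;
        while (!flag.get()) {
            Thread.onSpinWait();
            spins++;
        }
        return spins;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean ready = new AtomicBoolean(false);

        // A second thread flips the flag after a short delay.
        Thread setter = new Thread(() -> {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            ready.set(true);
        });

        setter.start();
        long spins = spinUntilSet(ready);
        setter.join();
        System.out.println("Spun " + spins + " times before the flag was set");
    }
}
```

The spin count depends entirely on the machine and scheduling, so a loop like this is useful for eyeballing behaviour but is not a benchmark; the review replies later in the thread make the same point about microbenchmarks versus whole applications.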
------------- Commit messages: - 8322535: Change default AArch64 SpinPause instruction Changes: https://git.openjdk.org/jdk/pull/17430/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17430&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322535 Stats: 20 lines in 2 files changed: 17 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/17430.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17430/head:pull/17430 PR: https://git.openjdk.org/jdk/pull/17430 From ayang at openjdk.org Tue Jan 16 10:06:24 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 16 Jan 2024 10:06:24 GMT Subject: [jdk22] RFR: 8322987: Remove gc/stress/gclocker/TestGCLocker* since they always fail with OOME In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 14:55:00 GMT, Thomas Schatzl wrote: > Hi all, > > please review this backport of [JDK-8322987](https://bugs.openjdk.org/browse/JDK-8322987) and [JDK-8323508](https://bugs.openjdk.org/browse/JDK-8323508) to jdk22. > > The second CR is a bugfix for the first one, and I did not want to risk of CI failures because of pushing them separately. > > Thanks, > Thomas Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk22/pull/54#pullrequestreview-1822960785 From shade at openjdk.org Tue Jan 16 10:10:17 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 16 Jan 2024 10:10:17 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: Message-ID: <_rFHZuBK9QoPk8x-ByZoSjXRAwqJkauaI9M_DlKstxE=.da51dee4-16ba-40ae-9766-4370aab4fe3e@github.com> On Mon, 15 Jan 2024 16:25:08 GMT, Fredrik Bredberg wrote: > The Java option OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none".
> > However, some CPUs change the default SpinPause instruction to something better if the user hasn't used the OnSpinWaitInst option. For instance if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs. > > This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb". > > Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64. Attn @eastig. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1893429607 From aph at openjdk.org Tue Jan 16 10:21:19 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 16 Jan 2024 10:21:19 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 16:25:08 GMT, Fredrik Bredberg wrote: > The Java option OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none". > > However, some CPUs change the default SpinPause instruction to something better if the user hasn't used the OnSpinWaitInst option. For instance if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs. > > This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb". > > Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64. ISB isn't really the right thing for this. Sure, it causes a delay, but the extent of the delay depends on what else the processor is doing.
In some cases an ISB can work well, in other cases not. Some micro benchmarks show a great improvement with ISB. It doesn't depend only on the target hardware, but on the application. Sure, in some cases an ISB is going to be exactly right, but on others it might be too much. ------------- PR Review: https://git.openjdk.org/jdk/pull/17430#pullrequestreview-1822989370 From jbhateja at openjdk.org Tue Jan 16 10:26:28 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 16 Jan 2024 10:26:28 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 23:06:32 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark Score Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x >> StringIndexOf.success 9.186 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x >> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 
0.987938803 > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Merge branch 'openjdk:master' into indexof > - Merge branch 'openjdk:master' into indexof > - Addressing review comments. > - Fix for JDK-8321599 > - Support UU IndexOf > - Only use optimization when EnableX86ECoreOpts is true > - Fix whitespace > - Merge branch 'openjdk:master' into indexof > - Comments; added exhaustive-ish test > - Subtracting 0x10 twice. > - ... and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2 src/hotspot/share/opto/library_call.cpp line 1228: > 1226: result = _gvn.transform(new ProjNode(call, TypeFunc::Parms)); > 1227: } else { > 1228: result = make_indexOf_node(src_start, src_count, tgt_start, tgt_count, The existing routines emit IR to handle the following special cases: tgt_cnt > src_cnt returns -1; tgt_cnt == 0 returns 0. Should we not preserve those checks before calling the stub? As of now these checks are part of the stub, and doing them in JIT code would save call overhead. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453223658 From stefank at openjdk.org Tue Jan 16 10:33:28 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 Jan 2024 10:33:28 GMT Subject: Integrated: 8323716: Only print ZGC Phase Switch events in hs_err files when running with ZGC In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 10:16:12 GMT, Stefan Karlsson wrote: > Don't print the ZGC Phase Switch hs_err section when the JVM is run with other GCs than ZGC. > > I've tested this manually with `-XX:ErrorHandlerTest=3 -version` and verified that the section is logged when ZGC is used, and that it is not logged when G1 is used. This pull request has now been integrated.
Changeset: 59062402 Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/59062402b9c5ed5612a13c1c40eb22cf1b97c41a Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod 8323716: Only print ZGC Phase Switch events in hs_err files when running with ZGC Reviewed-by: aboldtch, tschatzl, shade ------------- PR: https://git.openjdk.org/jdk/pull/17420 From stefank at openjdk.org Tue Jan 16 10:33:28 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 Jan 2024 10:33:28 GMT Subject: RFR: 8323716: Only print ZGC Phase Switch events in hs_err files when running with ZGC In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 10:16:12 GMT, Stefan Karlsson wrote: > Don't print the ZGC Phase Switch hs_err section when the JVM is run with other GCs than ZGC. > > I've tested this manually with `-XX:ErrorHandlerTest=3 -version` and verified that the section is logged when ZGC is used, and that it is not logged when G1 is used. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17420#issuecomment-1893465200 From jkern at openjdk.org Tue Jan 16 10:48:32 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 16 Jan 2024 10:48:32 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v7] In-Reply-To: <34N1MQU3CaBHa22aH-xp8234QcVaRlFZDZ6oeKXZpqo=.9630fc7a-623b-472b-980b-2cf3d0848fc0@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <34N1MQU3CaBHa22aH-xp8234QcVaRlFZDZ6oeKXZpqo=.9630fc7a-623b-472b-980b-2cf3d0848fc0@github.com> Message-ID: On Tue, 16 Jan 2024 08:43:34 GMT, Suchismith Roy wrote: >> src/hotspot/os/aix/os_aix.cpp line 1168: >> >>> 1166: int extension_length = 3; >>> 1167: char* file_path = NEW_C_HEAP_ARRAY(char, buffer_length + extension_length + 1, mtInternal); >>> 1168: strncpy(file_path,filename, buffer_length + 1); >> >> Why not using >> `char* file_path = os::strdup (filename);` >> which would replace lines 
1167+1168 >> and use the corresponding >> `os::free (file_path);` >> at the end > > Ok. Any performance advantage to using that? No, I do not believe that it has a performance advantage, but I think it is simpler to understand. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16604#discussion_r1453249951 From sjohanss at openjdk.org Tue Jan 16 11:00:21 2024 From: sjohanss at openjdk.org (Stefan Johansson) Date: Tue, 16 Jan 2024 11:00:21 GMT Subject: [jdk22] RFR: 8322987: Remove gc/stress/gclocker/TestGCLocker* since they always fail with OOME In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 14:55:00 GMT, Thomas Schatzl wrote: > Hi all, > > please review this backport of [JDK-8322987](https://bugs.openjdk.org/browse/JDK-8322987) and [JDK-8323508](https://bugs.openjdk.org/browse/JDK-8323508) to jdk22. > > The second CR is a bugfix for the first one, and I did not want to risk of CI failures because of pushing them separately. > > Thanks, > Thomas Marked as reviewed by sjohanss (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk22/pull/54#pullrequestreview-1823065188 From stefank at openjdk.org Tue Jan 16 11:20:28 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 Jan 2024 11:20:28 GMT Subject: [jdk22] RFR: 8322957: Generational ZGC: Relocation selection must join the STS In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 09:57:54 GMT, Stefan Karlsson wrote: > Hi all, > > This pull request contains a backport of commit [ba23025c](https://github.com/openjdk/jdk/commit/ba23025cd8a9c1af37afea6444ce5ea2ff41e5af) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Stefan Karlsson on 12 Jan 2024 and was reviewed by Erik Österlund and Axel Boldt-Christmas. > > Thanks! Thanks for the review!
------------- PR Comment: https://git.openjdk.org/jdk22/pull/74#issuecomment-1893539948 From stefank at openjdk.org Tue Jan 16 11:20:29 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 Jan 2024 11:20:29 GMT Subject: [jdk22] Integrated: 8322957: Generational ZGC: Relocation selection must join the STS In-Reply-To: References: Message-ID: <4wvWwTvawS3-RKLC4kUfRLTe9cpwasOJgqBtcYrJtMc=.2de1a783-2f89-4cfc-9558-d7137026af8a@github.com> On Mon, 15 Jan 2024 09:57:54 GMT, Stefan Karlsson wrote: > Hi all, > > This pull request contains a backport of commit [ba23025c](https://github.com/openjdk/jdk/commit/ba23025cd8a9c1af37afea6444ce5ea2ff41e5af) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Stefan Karlsson on 12 Jan 2024 and was reviewed by Erik Österlund and Axel Boldt-Christmas. > > Thanks! This pull request has now been integrated. Changeset: a91569dd Author: Stefan Karlsson URL: https://git.openjdk.org/jdk22/commit/a91569dd20ab7dd0c30d6693b94210994500d8cd Stats: 168 lines in 14 files changed: 127 ins; 20 del; 21 mod 8322957: Generational ZGC: Relocation selection must join the STS Reviewed-by: aboldtch Backport-of: ba23025cd8a9c1af37afea6444ce5ea2ff41e5af ------------- PR: https://git.openjdk.org/jdk22/pull/74 From smonteith at openjdk.org Tue Jan 16 11:23:19 2024 From: smonteith at openjdk.org (Stuart Monteith) Date: Tue, 16 Jan 2024 11:23:19 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 16:25:08 GMT, Fredrik Bredberg wrote: > The Java option OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none". > > However, some CPUs change the default SpinPause instruction to something better if the user hasn't used the OnSpinWaitInst option.
For instance if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs. > > This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb". > > Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64. For the most part, "YIELD" is probably going to be equivalent to a "NOP". Unless there is a demonstrable reason for this change, I would leave it as it is. With regards to the change, do you have a suite of benchmark data that demonstrates this is a benefit on Apple Silicon? Otherwise, as @theRealAph says, microbenchmarks can demonstrate a benefit from ISBs, but applications overall won't necessarily show any benefit. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1893545918 From jbhateja at openjdk.org Tue Jan 16 11:30:26 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 16 Jan 2024 11:30:26 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 23:06:32 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2.
The benchmark numbers: >> >> >> Benchmark Score Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x >> StringIndexOf.success 9.186 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x >> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Merge branch 'openjdk:master' into indexof > - Merge branch 'openjdk:master' into indexof > - Addressing review comments. > - Fix for JDK-8321599 > - Support UU IndexOf > - Only use optimization when EnableX86ECoreOpts is true > - Fix whitespace > - Merge branch 'openjdk:master' into indexof > - Comments; added exhaustive-ish test > - Subtracting 0x10 twice. > - ... and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2 src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1528: > 1526: #endif > 1527: > 1528: __ subptr(rsp, 0xf0); Can we spill them into XMMs, to save costly stack operations?
src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1544: > 1542: // if (k == 0) { > 1543: // return 0; > 1544: // } Kindly use meaningful variable and label names. It will ease the review process and maintenance. src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1545: > 1543: // return 0; > 1544: // } > 1545: __ movq(r12, rcx); Check for K == 0 should use rsi. src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1551: > 1549: __ movq(r15, rsi); > 1550: __ movq(r11, rdi); > 1551: __ cmpq(rsi, 0x20); All comparisons are with 32-bit int values; cmpq -> cmpl may save emitting the REX encoding prefix (no need for setting REX.W). src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1552: > 1550: __ movq(r11, rdi); > 1551: __ cmpq(rsi, 0x20); > 1552: __ jb(L_small_string); All the comparisons against needle / haystack lengths are signed integer comparisons, so jb should be replaced by jl. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453226797 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453227987 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453245805 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453250207 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453294109 From tschatzl at openjdk.org Tue Jan 16 11:35:27 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 16 Jan 2024 11:35:27 GMT Subject: [jdk22] RFR: 8322987: Remove gc/stress/gclocker/TestGCLocker* since they always fail with OOME In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 10:03:57 GMT, Albert Mingkun Yang wrote: >> Hi all, >> >> please review this backport of [JDK-8322987](https://bugs.openjdk.org/browse/JDK-8322987) and [JDK-8323508](https://bugs.openjdk.org/browse/JDK-8323508) to jdk22. >> >> The second CR is a bugfix for the first one, and I did not want to risk of CI failures because of pushing them separately.
>> >> Thanks, >> Thomas > > Marked as reviewed by ayang (Reviewer). Thanks @albertnetymk @kstefanj for your reviews ------------- PR Comment: https://git.openjdk.org/jdk22/pull/54#issuecomment-1893563178 From tschatzl at openjdk.org Tue Jan 16 11:35:29 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 16 Jan 2024 11:35:29 GMT Subject: [jdk22] Integrated: 8322987: Remove gc/stress/gclocker/TestGCLocker* since they always fail with OOME In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 14:55:00 GMT, Thomas Schatzl wrote: > Hi all, > > please review this backport of [JDK-8322987](https://bugs.openjdk.org/browse/JDK-8322987) and [JDK-8323508](https://bugs.openjdk.org/browse/JDK-8323508) to jdk22. > > The second CR is a bugfix for the first one, and I did not want to risk of CI failures because of pushing them separately. > > Thanks, > Thomas This pull request has now been integrated. Changeset: 4034787c Author: Thomas Schatzl URL: https://git.openjdk.org/jdk22/commit/4034787ccbb90ae66cc945d9868f2c186d14af14 Stats: 449 lines in 8 files changed: 0 ins; 449 del; 0 mod 8322987: Remove gc/stress/gclocker/TestGCLocker* since they always fail with OOME 8323508: Remove TestGCLockerWithShenandoah.java line from TEST.groups Reviewed-by: ayang, sjohanss ------------- PR: https://git.openjdk.org/jdk22/pull/54 From jbhateja at openjdk.org Tue Jan 16 12:04:21 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 16 Jan 2024 12:04:21 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 23:06:32 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. 
The benchmark numbers: >> >> >> Benchmark Score Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x >> StringIndexOf.success 9.186 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x >> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Merge branch 'openjdk:master' into indexof > - Merge branch 'openjdk:master' into indexof > - Addressing review comments. > - Fix for JDK-8321599 > - Support UU IndexOf > - Only use optimization when EnableX86ECoreOpts is true > - Fix whitespace > - Merge branch 'openjdk:master' into indexof > - Comments; added exhaustive-ish test > - Subtracting 0x10 twice. > - ... 
and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2 label /add hotspot-compiler-dev ------------- PR Comment: https://git.openjdk.org/jdk/pull/16753#issuecomment-1893605792 From jbhateja at openjdk.org Tue Jan 16 12:12:34 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 16 Jan 2024 12:12:34 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 23:06:32 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark Score Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x >> StringIndexOf.success 9.186 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x >> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 22 commits:
>
>  - Merge branch 'openjdk:master' into indexof
>  - Merge branch 'openjdk:master' into indexof
>  - Addressing review comments.
>  - Fix for JDK-8321599
>  - Support UU IndexOf
>  - Only use optimization when EnableX86ECoreOpts is true
>  - Fix whitespace
>  - Merge branch 'openjdk:master' into indexof
>  - Comments; added exhaustive-ish test
>  - Subtracting 0x10 twice.
>  - ... and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2

src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 197:

> 195:   __ bind(L_small_string);
> 196:   __ cmpq(r15, 0x20);
> 197:   __ ja(L_small_string2);

ja should be replaced by jg.

src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1526:

> 1524:   __ movq(rdx, r8);
> 1525:   __ movq(rcx, r9);
> 1526: #endif

Can we spill them into XMMs to save costly stack operations?

src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1545:

> 1543: //    return 0;
> 1544: //  }
> 1545:   __ movq(r12, rcx);

Kindly use meaningful variable and label names. It will ease the review process and maintenance.

src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1551:

> 1549:   __ movq(r15, rsi);
> 1550:   __ movq(r11, rdi);
> 1551:   __ cmpq(rsi, 0x20);

All comparisons are with 32-bit int values; replacing cmpq with cmpl may save emitting the REX encoding prefix (no need to set REX.W).

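To illustrate the signedness point raised in the comments above (ja vs. jg, and likewise jb vs. jl), here is a minimal C++ sketch, not taken from the PR, of how an unsigned condition and a signed condition can disagree on the same 32-bit operands. The helper names are hypothetical. The two only differ when an operand has its sign bit set, which should not occur for valid Java string lengths.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.

// Models an unsigned condition such as x86 `ja` after a compare:
// both operands are reinterpreted as unsigned 32-bit values.
bool above_unsigned(int32_t len, int32_t limit) {
  return static_cast<uint32_t>(len) > static_cast<uint32_t>(limit);
}

// Models a signed condition such as x86 `jg` after the same compare.
bool greater_signed(int32_t len, int32_t limit) {
  return len > limit;
}
```

For non-negative inputs both helpers agree; for a negative `len` the unsigned form treats it as a very large value and takes the "above" branch, while the signed form does not.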
src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1552:

> 1550:   __ movq(r11, rdi);
> 1551:   __ cmpq(rsi, 0x20);
> 1552:   __ jb(L_small_string);

All the comparisons against needle length are signed integer comparisons, so jb should be replaced by jl.

src/hotspot/share/opto/library_call.cpp line 1206:

> 1204:
> 1205:   Node* result = nullptr;
> 1206:   bool do_intrinsic =

Name change suggestion: do_intrinsic -> call_opt_stub

src/hotspot/share/opto/library_call.cpp line 1229:

> 1227:   } else {
> 1228:     result = make_indexOf_node(src_start, src_count, tgt_start, tgt_count,
> 1229:                                result_rgn, result_phi, ae);

The existing routines emit IR to handle the following special cases:
  tgt_cnt > src_cnt: return -1
  tgt_cnt == 0: return 0
Should we not preserve those checks before calling the stub? As of now these checks are part of the stub; doing them in JIT code would save the call overhead.

src/hotspot/share/opto/runtime.cpp line 1347:

> 1345:   fields[argp++] = TypeInt::INT;        // needle length
> 1346:   fields[argp++] = TypePtr::NOTNULL;    // haystack array
> 1347:   fields[argp++] = TypeInt::INT;        // haystack length

Do we need to swap the comments? The first two arguments correspond to value (haystack) as per the Java-side intrinsic signature.
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/StringLatin1.java#L348 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453304911 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453332647 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453333045 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453333555 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453333878 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453338427 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453338718 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453329079 From jbhateja at openjdk.org Tue Jan 16 13:29:24 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 16 Jan 2024 13:29:24 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: <0XxCusssrDiiKzXBfdsY1XHkv9T6mJwJe7dwFz5Uy-I=.3325e496-5bf1-4a79-8969-e28e018b77db@github.com> On Thu, 11 Jan 2024 23:06:32 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. 
The benchmark numbers: >> >> >> Benchmark Score Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x >> StringIndexOf.success 9.186 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x >> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Merge branch 'openjdk:master' into indexof > - Merge branch 'openjdk:master' into indexof > - Addressing review comments. > - Fix for JDK-8321599 > - Support UU IndexOf > - Only use optimization when EnableX86ECoreOpts is true > - Fix whitespace > - Merge branch 'openjdk:master' into indexof > - Comments; added exhaustive-ish test > - Subtracting 0x10 twice. > - ... 
and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2 src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 417: > 415: __ cmpl(Address(rbx, r15, Address::times_1, -0x14), rax); > 416: __ jne(L_top_loop_1); > 417: __ jmp(L_0x406019); For cases which are multiple of 4 bytes we can use VMASKMOVPS (conditional moves) and VPTEST. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453425855 From jbhateja at openjdk.org Tue Jan 16 13:32:25 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 16 Jan 2024 13:32:25 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 23:06:32 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark Score Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x >> StringIndexOf.success 9.186 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x >> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 
1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Merge branch 'openjdk:master' into indexof > - Merge branch 'openjdk:master' into indexof > - Addressing review comments. > - Fix for JDK-8321599 > - Support UU IndexOf > - Only use optimization when EnableX86ECoreOpts is true > - Fix whitespace > - Merge branch 'openjdk:master' into indexof > - Comments; added exhaustive-ish test > - Subtracting 0x10 twice. > - ... and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2 src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 470: > 468: __ jne(L_top_loop_1); > 469: __ jmp(L_0x406019); > 470: For 16 bytes we can directly use [V]PTEST instruction to save multiple loads and compares. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1453429803 From eastigeevich at openjdk.org Tue Jan 16 14:01:21 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 16 Jan 2024 14:01:21 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 10:18:26 GMT, Andrew Haley wrote: > ISB isn't really the right thing for this. Sure, it causes a delay, but the extent of the delay depends on what else the processor is doing. In some cases an ISB can work well, in other cases not. Some micro benchmarks show a great improvement with ISB. It doesn't depend only on the target hardware, but on the application. Sure, in some cases an ISB is going to be exactly right, but on others it might be too much. BTW, In Armv8.7-A/Armv9.2-A we have WFE/WFI with timeouts which is supported by Cortex-X4, A720 and A520. Available implementations of them are MediaTek Dimensity 9300 and Qualcomm Snapdragon 8 Gen 3. It would be possible to benchmark WFE-based implementation. 

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1893795630

From eastigeevich at openjdk.org Tue Jan 16 14:28:18 2024
From: eastigeevich at openjdk.org (Evgeny Astigeevich)
Date: Tue, 16 Jan 2024 14:28:18 GMT
Subject: RFR: 8322535: Change default AArch64 SpinPause instruction
In-Reply-To:
References:
Message-ID:

On Mon, 15 Jan 2024 16:25:08 GMT, Fredrik Bredberg wrote:

> The Java option OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none".
>
> However, some CPUs change the default SpinPause instruction to something better if the user hasn't used the OnSpinWaitInst option. For instance, if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs.
>
> This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb".
>
> Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64.

lgtm

-------------

Marked as reviewed by eastigeevich (Committer).

PR Review: https://git.openjdk.org/jdk/pull/17430#pullrequestreview-1823563625

From eastigeevich at openjdk.org Tue Jan 16 14:28:19 2024
From: eastigeevich at openjdk.org (Evgeny Astigeevich)
Date: Tue, 16 Jan 2024 14:28:19 GMT
Subject: RFR: 8322535: Change default AArch64 SpinPause instruction
In-Reply-To:
References:
Message-ID:

On Tue, 16 Jan 2024 10:18:26 GMT, Andrew Haley wrote:

>> The Java option OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none".
>>
>> However, some CPUs change the default SpinPause instruction to something better if the user hasn't used the OnSpinWaitInst option. For instance, if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs.
>>
>> This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb".
>>
>> Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64.
>
> ISB isn't really the right thing for this. Sure, it causes a delay, but the extent of the delay depends on what else the processor is doing. In some cases an ISB can work well, in other cases not. Some micro benchmarks show a great improvement with ISB.
> It doesn't depend only on the target hardware, but on the application. Sure, in some cases an ISB is going to be exactly right, but on others it might be too much.

> For the most part, "YIELD" is probably going to be equivalent to a "NOP". Unless there is a demonstrable reason for this change, I would leave it as it is. With regards to the change, do you have a suite of benchmark data that demonstrates this is a benefit on Apple Silicon? Otherwise, as @theRealAph says, microbenchmarks can demonstrate a benefit from ISBs, but applications overall won't necessarily show any benefit.

I agree it would be interesting to see whether desktop applications get any improvements from ISB. We observed good performance improvements in our customers' cloud applications.

BTW, ISB gets spread:
https://github.com/DLTcollab/sse2neon/blob/master/sse2neon.h#L4812C14-L4812C14
https://github.com/rust-lang/rust/commit/c064b6560b7ce0adeb9bbf5d7dcf12b1acb0c807
https://github.com/simd-everywhere/simde/blob/14311d60539303ca8bad6204dcbc6a29f51b0e09/simde/x86/sse2.h#L4770

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1893851938

From aph at openjdk.org Tue Jan 16 14:57:22 2024
From: aph at openjdk.org (Andrew Haley)
Date: Tue, 16 Jan 2024 14:57:22 GMT
Subject: RFR: 8322535: Change default AArch64 SpinPause instruction
In-Reply-To:
References:
Message-ID:

On Tue, 16 Jan 2024 10:18:26 GMT, Andrew Haley wrote:

>> The Java option OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none".
>>
>> However, some CPUs change the default SpinPause instruction to something better if the user hasn't used the OnSpinWaitInst option. For instance, if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs.
>>
>> This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb".
>>
>> Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64.
>
> ISB isn't really the right thing for this. Sure, it causes a delay, but the extent of the delay depends on what else the processor is doing. In some cases an ISB can work well, in other cases not. Some micro benchmarks show a great improvement with ISB.
> It doesn't depend only on the target hardware, but on the application. Sure, in some cases an ISB is going to be exactly right, but on others it might be too much.
>
> For the most part, "YIELD" is probably going to be equivalent to a "NOP". Unless there is a demonstrable reason for this change, I would leave it as it is. With regards to the change, do you have a suite of benchmark data that demonstrates this is a benefit on Apple Silicon? Otherwise, as @theRealAph says, microbenchmarks can demonstrate a benefit from ISBs, but applications overall won't necessarily show any benefit.
>
> I agree it would be interesting to see whether desktop applications get any improvements from ISB. We observed good performance improvements in our customers' cloud applications.

Your customers are running cloud apps on Apple M1/M2?

> BTW, ISB gets spread: https://github.com/DLTcollab/sse2neon/blob/master/sse2neon.h#L4812C14-L4812C14 [rust-lang/rust at c064b65](https://github.com/rust-lang/rust/commit/c064b6560b7ce0adeb9bbf5d7dcf12b1acb0c807) https://github.com/simd-everywhere/simde/blob/14311d60539303ca8bad6204dcbc6a29f51b0e09/simde/x86/sse2.h#L4770

Yeah, I know. What I don't know is how much of a cargo cult this is. Apple M1 etc. have very large reorder buffers, and serializing all instructions may not be the best plan.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1893909840

From thartmann at openjdk.org Tue Jan 16 15:35:26 2024
From: thartmann at openjdk.org (Tobias Hartmann)
Date: Tue, 16 Jan 2024 15:35:26 GMT
Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v7]
In-Reply-To:
References:
Message-ID:

On Fri, 15 Dec 2023 18:14:00 GMT, Cesar Soares Lucas wrote:

>> ### Description
>>
>> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges.
>>
>> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP.
>>
>> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing.
>>
>> ### Benchmarking
>>
>> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case.
>> **Note 2:** Margin of error was negligible.
>>
>> | Benchmark                            | No RAM (ms/op)   | Yes RAM (ms/op)   |
>> |--------------------------------------|------------------|-------------------|
>> | TestTrapAfterMerge                   | 19.515           | 13.386            |
>> | TestArgEscape                        | 33.165           | 33.254            |
>> | TestCallTwoSide                      | 70.547           | 69.427            |
>> | TestCmpAfterMerge                    | 16.400           | 2.984             |
>> | TestCmpMergeWithNull_Second          | 27.204           | 27.293            |
>> | TestCmpMergeWithNull                 | 8.248            | 4.920             |
>> | TestCondAfterMergeWithAllocate       | 12.890           | 5.252             |
>> | TestCondAfterMergeWithNull           | 6.265            | 5.078             |
>> | TestCondLoadAfterMerge               | 12.713           | 5.163             |
>> | TestConsecutiveSimpleMerge           | 30.863           | 4.068             |
>> | TestDoubleIfElseMerge                | 16.069           | 2.444             |
>> | TestEscapeInCallAfterMerge           | 23.111           | 22.924            |
>> | TestGlobalEscape                     | 14.459           | 14.425            |
>> | TestIfElseInLoop                     | 246.061          | 42.786            |
>> | TestLoadAfterLoopAlias               | 45.808           | 45.812            |
>> ...
>
> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix broken build.

Let's wait with this until [JDK-8287061](https://bugs.openjdk.org/browse/JDK-8287061) is stable. We just found another issue [JDK-8322854](https://bugs.openjdk.org/browse/JDK-8322854).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15825#issuecomment-1893985816

From eastigeevich at openjdk.org Tue Jan 16 15:53:20 2024
From: eastigeevich at openjdk.org (Evgeny Astigeevich)
Date: Tue, 16 Jan 2024 15:53:20 GMT
Subject: RFR: 8322535: Change default AArch64 SpinPause instruction
In-Reply-To:
References:
Message-ID:

On Tue, 16 Jan 2024 14:54:42 GMT, Andrew Haley wrote:

> Your customers are running cloud apps on Apple M1/M2?

In theory they could. M1, M2 and M2 Pro instances are available in cloud. However, I am not aware of any such cases.

> Yeah, I know. What I don't know is how much of a cargo cult this is. Apple M1 etc. have very large reorder buffers, and serializing all instructions may not be the best plan.

I hope hardware engineers will notice these improper uses of ISB and will either implement YIELD or something equivalent to it. YIELD could be an alias of a new instruction.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1894020670

From mdoerr at openjdk.org Tue Jan 16 16:15:25 2024
From: mdoerr at openjdk.org (Martin Doerr)
Date: Tue, 16 Jan 2024 16:15:25 GMT
Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v7]
In-Reply-To:
References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com>
Message-ID:

On Fri, 5 Jan 2024 12:10:59 GMT, Martin Doerr wrote:

> I have tried to build jextract (https://github.com/openjdk/jextract/tree/jdk22) with LLVM (https://github.com/llvm/llvm-project/releases/download/llvmorg-16.0.4/clang+llvm-16.0.4-powerpc64-ibm-aix-7.2.tar.xz). I noticed that llvm mainly consists of .a files. So, I think we need to support that for FFI compatibility with other libraries and open source projects.

Seems like this change is not sufficient for that. `clang` is compiled to `libclang.a` on AIX, but `libclang.so` on linux.
I'm getting "System error: Exec format error" when trying to load `libclang.a` via `System.loadLibrary(libName);`.
So the question remains: Are .a files really supposed to be dynamically loadable on AIX? If so, where is that documented?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1894060171

From aph at openjdk.org Tue Jan 16 16:54:20 2024
From: aph at openjdk.org (Andrew Haley)
Date: Tue, 16 Jan 2024 16:54:20 GMT
Subject: RFR: 8322535: Change default AArch64 SpinPause instruction
In-Reply-To:
References:
Message-ID:

On Tue, 16 Jan 2024 15:50:56 GMT, Evgeny Astigeevich wrote:

>> Your customers are running cloud apps on Apple M1/M2?
>
> In theory they could. M1, M2 and M2 Pro instances are available in cloud. However, I am not aware of any such cases.

Right.

>> Yeah, I know. What I don't know is how much of a cargo cult this is. Apple M1 etc. have very large reorder buffers, and serializing all instructions may not be the best plan.
>
> I hope hardware engineers will notice these improper uses of ISB and will either implement YIELD or something equivalent to it. YIELD could be an alias of a new instruction.

I think the problem is that the right amount of time to spin for is application dependent. It also depends on things like the way the memory coherence system works, which is architecturally very different on Apple designs.

We could do something less violent than ISB for SpinPause. We could execute a bunch of UDIV instructions with a loop-carried dependency, or cycle an xor-shift generator. That could be made to delay for any number of clock cycles, so we can delay without the side effects of an ISB.

We could try to measure how many cycles ISB takes in the "good" cases and design a delay that takes as long as an ISB without disrupting everything else.
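
As a rough illustration of the xor-shift idea above, the following C++ sketch cycles a 64-bit xor-shift generator a caller-chosen number of times. Each step consumes the result of the previous one, so the loop forms a loop-carried dependency chain. The function name, the shift constants (13, 7, 17, a commonly used xorshift64 triple) and the iteration count are assumptions made for illustration. This is not HotSpot's SpinPause implementation, and the number of cycles per iteration would have to be measured on each CPU.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: run `iterations` dependent xor-shift steps.
// The final state is returned so the work is observable to the caller.
uint64_t xorshift_delay(uint64_t state, unsigned iterations) {
  assert(state != 0 && "xor-shift state must be non-zero");
  for (unsigned i = 0; i < iterations; i++) {
    // Each statement depends on the one before it.
    state ^= state << 13;
    state ^= state >> 7;
    state ^= state << 17;
  }
  return state;
}
```

A caller would tune `iterations` against a measured target delay, for example the cost of an ISB in the cases where ISB performs well.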

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1894131403

From henryjen at openjdk.org Tue Jan 16 17:46:21 2024
From: henryjen at openjdk.org (Henry Jen)
Date: Tue, 16 Jan 2024 17:46:21 GMT
Subject: RFR: Merge bf7bd9a16c172bcb5ea6b24717a0429e12e2e3d1
Message-ID:

CPU24_01 fixes.

-------------

Commit messages:
 - 8317547: Enhance TLS connection support
 - 8314307: Improve loop handling
 - 8318588: Windows build failure after JDK-8314468 due to ambiguous call
 - 8314468: Improve Compiler loops
 - 8317331: Solaris build failed with "declaration can not follow a statement (E_DECLARATION_IN_CODE)"
 - 8314295: Enhance verification of verifier
 - 8308204: Enhanced certificate processing

The merge commit only contains trivial merges, so no merge-specific webrevs have been generated.

Changes: https://git.openjdk.org/jdk/pull/17448/files
  Stats: 736 lines in 16 files changed: 476 ins; 65 del; 195 mod
  Patch: https://git.openjdk.org/jdk/pull/17448.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17448/head:pull/17448

PR: https://git.openjdk.org/jdk/pull/17448

From eastigeevich at openjdk.org Tue Jan 16 17:57:22 2024
From: eastigeevich at openjdk.org (Evgeny Astigeevich)
Date: Tue, 16 Jan 2024 17:57:22 GMT
Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v2]
In-Reply-To:
References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com>
Message-ID:

On Sat, 6 Jan 2024 14:05:33 GMT, Boris Ulasevich wrote:

>> The change simplifies the CodeCache::initialize_heaps segment memory split logic while preserving the existing layout:
>>
>> if (!non_nmethod_set && !profiled_set && !non_profiled_set) {
>>   ...
>> } else if (!non_nmethod_set || !profiled_set || !non_profiled_set) {
>>   if (non_profiled_set) {
>>     if (!profiled_set) {
>>       ...
>>     }
>>   } else if (profiled_set) {
>>     ...
>>   } else if (non_nmethod_set) {
>>     ...
>>   }
>> }
>>
>> -->
>>
>> if (!profiled.set && !non_profiled.set) {
>>   ..
>> }
>> if (profiled.set && !non_profiled.set) {
>>   ..
>> }
>> if (!profiled.set && non_profiled.set) {
>>   ..
>> }
>> if (!non_nmethod.set && profiled.set && non_profiled.set) {
>>   ..
>> }
>>
>>
>> With this change, PrintFlagsFinal shows the actual segment sizes (not an intermediate value before alignment), and the segments completely fill the ReservedCodeCacheSize (no wasted page due to final down alignment).
>
> Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>
>   cleanup & test udpdate

src/hotspot/share/code/codeCache.hpp line 469:

> 467: typedef CodeBlobIterator AllCodeBlobsIterator;
> 468:
> 469: struct CodeCacheSegment {

Would this create a confusion?
`CodeHeap` consists of blocks and names them `segments`. See `src/hotspot/share/memory/heap.hpp` and `src/hotspot/share/memory/heap.cpp`.
There is `CodeCache::allocated_segments()` which returns the total number of segments (memory blocks) code heaps use. Also there is `CodeCacheSegmentSize` which defines the size of a memory block for a code heap.

------------- Commit messages: - 8317547: Enhance TLS connection support - 8314307: Improve loop handling - 8318588: Windows build failure after JDK-8314468 due to ambiguous call - 8314468: Improve Compiler loops - 8317331: Solaris build failed with "declaration can not follow a statement (E_DECLARATION_IN_CODE)" - 8314295: Enhance verification of verifier - 8308204: Enhanced certificate processing The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. Changes: https://git.openjdk.org/jdk22/pull/83/files Stats: 736 lines in 16 files changed: 476 ins; 65 del; 195 mod Patch: https://git.openjdk.org/jdk22/pull/83.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/83/head:pull/83 PR: https://git.openjdk.org/jdk22/pull/83 From iklam at openjdk.org Tue Jan 16 18:27:23 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 16 Jan 2024 18:27:23 GMT Subject: [jdk22] RFR: 8323243: JNI invocation of an abstract instance method corrupts the stack In-Reply-To: References: Message-ID: On Sun, 14 Jan 2024 22:12:17 GMT, David Holmes wrote: > Hi all, > > This pull request contains a backport of commit [71d9a83d](https://github.com/openjdk/jdk/commit/71d9a83dece7eb4bdb6ffdd9caf14a1348045ce0) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by David Holmes on 14 Jan 2024 and was reviewed by Coleen Phillimore and Aleksey Shipilev. > > Thanks! Approving it as the backport is clean. ------------- Marked as reviewed by iklam (Reviewer). 
PR Review: https://git.openjdk.org/jdk22/pull/73#pullrequestreview-1824562745 From eastigeevich at openjdk.org Tue Jan 16 18:27:23 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 16 Jan 2024 18:27:23 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v2] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: On Sat, 6 Jan 2024 14:05:33 GMT, Boris Ulasevich wrote: >> The change simplifies the CodeCache::initialize_heaps segment memory split logic while preserving the existing layout: >> >> if (!non_nmethod_set && !profiled_set && !non_profiled_set) { >> ... >> } else if (!non_nmethod_set || !profiled_set || !non_profiled_set) { >> if (non_profiled_set) { >> if (!profiled_set) { >> ... >> } >> } else if (profiled_set) { >> ... >> } else if (non_nmethod_set) { >> ... >> } >> } >> >> --> >> >> if (!profiled.set && !non_profiled.set) { >> .. >> } >> if (profiled.set && !non_profiled.set) { >> .. >> } >> if (!profiled.set && non_profiled.set) { >> .. >> } >> if (!non_nmethod.set && profiled.set && non_profiled.set) { >> .. >> } >> >> >> With this change, PrintFlagsFinal shows the actual segment sizes (not an intermediate value before alignment), and the segments completely fill the ReservedCodeCacheSize (no wasted page due to final down alignment). > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > cleanup & test udpdate src/hotspot/share/code/codeCache.hpp line 120: > 118: return (cache_size > known_segments_size + min_size) ? (cache_size - known_segments_size) : min_size; > 119: } > 120: Do we need them in `CodeCache` class and in the hpp file? Why not to have them as static functions in the cpp file? 
In such a case there will be no need to expose the `CodeCacheSegment` struct. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1453828217 From eastigeevich at openjdk.org Tue Jan 16 18:39:22 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 16 Jan 2024 18:39:22 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v2] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: On Tue, 16 Jan 2024 17:55:01 GMT, Evgeny Astigeevich wrote: >> Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> cleanup & test udpdate > > src/hotspot/share/code/codeCache.hpp line 469: > >> 467: typedef CodeBlobIterator AllCodeBlobsIterator; >> 468: >> 469: struct CodeCacheSegment { > > Would this create a confusion? > `CodeHeap` consists of blocks and names them `segments`. See `src/hotspot/share/memory/heap.hpp` and `src/hotspot/share/memory/heap.cpp`. > There is `CodeCache::allocated_segments()` which returns the total number of segments (memory blocks) code heaps use. Also there is `CodeCacheSegmentSize` which defines the size of a memory block for a code heap. Maybe `CodeHeapInfo` would be better?
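For readers of the archive, the suggestion earlier in this thread (keep the sizing helpers as file-local functions in the .cpp file so the helper struct does not appear in the header) might look roughly like the sketch below. This is standalone C++ and not HotSpot code; the name `remaining_size`, the `CodeHeapInfo` layout, and the use of an unnamed namespace are assumptions made for illustration. Only the conditional expression is taken from the quoted line 118.

```cpp
// Sketch only: standalone C++, not HotSpot code. All names are illustrative.
#include <cstddef>

namespace {  // internal linkage: nothing here is visible outside this .cpp file

// Hypothetical per-heap record, modelled on the `size` and `set` members
// that appear in the quoted PR description.
struct CodeHeapInfo {
  std::size_t size;  // requested or computed size in bytes
  bool set;          // true if the size was given explicitly
};

// Space left for an unsized heap once the known sizes are taken out of the
// cache, but never less than min_size (same expression as the quoted line 118).
std::size_t remaining_size(std::size_t cache_size,
                           std::size_t known_segments_size,
                           std::size_t min_size) {
  return (cache_size > known_segments_size + min_size)
             ? (cache_size - known_segments_size)
             : min_size;
}

// Gives a heap whose size was not set explicitly whatever space remains.
void size_if_unset(CodeHeapInfo& heap, std::size_t cache_size,
                   std::size_t known_segments_size, std::size_t min_size) {
  if (!heap.set) {
    heap.size = remaining_size(cache_size, known_segments_size, min_size);
  }
}

}  // namespace
```

With this layout the header would only need to declare `CodeCache::initialize_heaps`, and the struct could be renamed without touching any other file.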
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1453843195 From mli at openjdk.org Tue Jan 16 18:51:23 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 16 Jan 2024 18:51:23 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v4] In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 07:10:23 GMT, Fei Yang wrote: >> Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: >> >> - remove tp/gp >> - refine code > > FYI: The performance numbers seem more stable on other platforms like Unmatched board (JMH AverageTime mode): > > > Before: > MessageDigests.digest SHA-1 64 DEFAULT avgt 15 3974.419 ± 28.954 ns/op > MessageDigests.digest SHA-1 16384 DEFAULT avgt 15 411073.165 ± 3731.988 ns/op > MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 15 7136.679 ± 480.850 ns/op > MessageDigests.getAndDigest SHA-1 16384 DEFAULT avgt 15 429881.929 ± 1265.110 ns/op > > MessageDigests.digest SHA-1 64 DEFAULT avgt 15 3993.060 ± 6.265 ns/op > MessageDigests.digest SHA-1 16384 DEFAULT avgt 15 410724.751 ± 2075.018 ns/op > MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 15 7085.596 ± 496.358 ns/op > MessageDigests.getAndDigest SHA-1 16384 DEFAULT avgt 15 430184.356 ± 1052.236 ns/op > > MessageDigests.digest SHA-1 64 DEFAULT avgt 15 4016.232 ± 48.074 ns/op > MessageDigests.digest SHA-1 16384 DEFAULT avgt 15 417735.231 ± 7001.640 ns/op > MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 15 7114.528 ± 504.775 ns/op > MessageDigests.getAndDigest SHA-1 16384 DEFAULT avgt 15 438041.321 ± 20056.313 ns/op > > After: > MessageDigests.digest SHA-1 64 DEFAULT avgt 15 3685.514 ± 5.401 ns/op > MessageDigests.digest SHA-1 16384 DEFAULT avgt 15 364406.355 ± 217.797 ns/op > MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 15 5427.864 ± 41.520 ns/op > MessageDigests.getAndDigest SHA-1 16384 DEFAULT avgt 15 367995.806 ± 228.853 ns/op > > MessageDigests.digest SHA-1 64 DEFAULT avgt 15 3681.851 ±
6.591 ns/op > MessageDigests.digest SHA-1 16384 DEFAULT avgt 15 364433.610 ± 226.146 ns/op > MessageDigests.getAndDigest SHA-1 64 DEFAULT avgt 15 5483.575 ± 46.445 ns/op > MessageDigests.getAndDigest SHA-1 16384 DEFAULT avgt 15 367713.143 ± 348.944 ns/op > > MessageDigests.digest SHA-1 64 DEFAULT avgt 15 3686.556 ± 6.273 ns/op > MessageDigests.digest SHA-1 16384 DEFAULT avgt 15 ... @RealFYang Thanks for testing. Yes, I also found that the test results on VF 2 are more stable than on Lichee Pi 4A. I'm not sure whether we need to investigate the performance jitter on Lichee Pi 4A further; it does not seem to be related to this implementation. What do you think? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17130#issuecomment-1894324589 From eastigeevich at openjdk.org Tue Jan 16 18:53:21 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 16 Jan 2024 18:53:21 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v2] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: On Sat, 6 Jan 2024 14:05:33 GMT, Boris Ulasevich wrote: >> The change simplifies the CodeCache::initialize_heaps segment memory split logic while preserving the existing layout: >> >> if (!non_nmethod_set && !profiled_set && !non_profiled_set) { >> ... >> } else if (!non_nmethod_set || !profiled_set || !non_profiled_set) { >> if (non_profiled_set) { >> if (!profiled_set) { >> ... >> } >> } else if (profiled_set) { >> ... >> } else if (non_nmethod_set) { >> ... >> } >> } >> >> --> >> >> if (!profiled.set && !non_profiled.set) { >> .. >> } >> if (profiled.set && !non_profiled.set) { >> .. >> } >> if (!profiled.set && non_profiled.set) { >> .. >> } >> if (!non_nmethod.set && profiled.set && non_profiled.set) { >> ..
>> } >> >> >> With this change, PrintFlagsFinal shows the actual segment sizes (not an intermediate value before alignment), and the segments completely fill the ReservedCodeCacheSize (no wasted page due to final down alignment). > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > cleanup & test udpdate src/hotspot/share/code/codeCache.cpp line 232: > 230: // segment size ever if it was set explicitly. > 231: non_profiled.size += profiled.size; > 232: // Profiled segment is not available, forcibly set size to 0 Profiled code heap is not available, ... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1453861871 From henryjen at openjdk.org Tue Jan 16 19:05:44 2024 From: henryjen at openjdk.org (Henry Jen) Date: Tue, 16 Jan 2024 19:05:44 GMT Subject: RFR: Merge bf7bd9a16c172bcb5ea6b24717a0429e12e2e3d1 [v2] In-Reply-To: References: Message-ID: > CPU24_01 fixes. Henry Jen has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: - Merge branch 'openjdk:master' into cpu2401 - 8317547: Enhance TLS connection support Reviewed-by: ahgross, rhalade, weijun, valeriep - 8314307: Improve loop handling Co-authored-by: Christian Hagedorn Co-authored-by: Roland Westrelin Co-authored-by: Emanuel Peter Reviewed-by: mschoene, rhalade, thartmann, epeter - 8318588: Windows build failure after JDK-8314468 due to ambiguous call Reviewed-by: epeter - 8314468: Improve Compiler loops Co-authored-by: Dean Long Reviewed-by: rhalade, mschoene, iveresov, kvn - 8317331: Solaris build failed with "declaration can not follow a statement (E_DECLARATION_IN_CODE)" Backport-of: 852276d1f833d49802693f2a5a82ba6eb2722de6 - 8314295: Enhance verification of verifier Reviewed-by: mschoene, rhalade, dholmes, dlong - 8308204: Enhanced certificate processing Reviewed-by: mschoene, rhalade, jnimeh ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17448/files - new: https://git.openjdk.org/jdk/pull/17448/files/bf7bd9a1..e4e0d987 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17448&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17448&range=00-01 Stats: 484 lines in 21 files changed: 304 ins; 157 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/17448.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17448/head:pull/17448 PR: https://git.openjdk.org/jdk/pull/17448 From coleenp at openjdk.org Tue Jan 16 21:17:00 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 16 Jan 2024 21:17:00 GMT Subject: RFR: 8317440: Lock rank checking fails when code root set is modified with the Servicelock held after JDK-8315503 In-Reply-To: References: Message-ID: On Thu, 5 Oct 2023 17:19:35 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that fixes lock ranking after recent changes to the code root set, now using a CHT. 
> > The issue came up because the lock rank of the CHT lock has been larger than the rank of the Servicethread_lock where it is possible that code roots can be added. > > The suggested solution is to fix up the lock rankings to work; actually this PR contains two variants: > 1) one that statically sets the lock ranks of the CHT lock (and the ThreadSMR_lock that can be used during CHT operation) to something smaller than Servicethread_lock. > 2) one that allows setting of the CHT lock rank via parameter as well (the last commit changed the code to variant 1). > > The other lock ranking changes to Metaspace_lock and ContinuationRelativize_lock are simply undos of the respective changes in [JDK-8315503](https://bugs.openjdk.org/browse/JDK-8315503). > > Testing: tier1-8 for variant 2), tier 1-7 for variant 1) > > Thanks, > Thomas Going with variant 2. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16062#issuecomment-1894522950 From kvn at openjdk.org Tue Jan 16 22:31:30 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 16 Jan 2024 22:31:30 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 23:06:32 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. 
The benchmark numbers: >> >> >> Benchmark Score Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x >> StringIndexOf.success 9.186 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x >> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Merge branch 'openjdk:master' into indexof > - Merge branch 'openjdk:master' into indexof > - Addressing review comments. > - Fix for JDK-8321599 > - Support UU IndexOf > - Only use optimization when EnableX86ECoreOpts is true > - Fix whitespace > - Merge branch 'openjdk:master' into indexof > - Comments; added exhaustive-ish test > - Subtracting 0x10 twice. > - ... 
and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2 src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4111: > 4109: if ((UseAVX == 2) && EnableX86ECoreOpts && VM_Version::supports_avx2()) { > 4110: StubRoutines::_string_indexof = generate_string_indexof(); > 4111: } What is the motivation for this extensive new code only for AVX2? 30% is nice (for some cases), but it is enabled only for AVX2 and not for AVX512, which all modern x86 CPUs have, so the code will not be used. Or is it a typo? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1454139710 From sgibbons at openjdk.org Tue Jan 16 23:53:53 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 16 Jan 2024 23:53:53 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 22:27:52 GMT, Vladimir Kozlov wrote: >> Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: >> >> - Merge branch 'openjdk:master' into indexof >> - Merge branch 'openjdk:master' into indexof >> - Addressing review comments. >> - Fix for JDK-8321599 >> - Support UU IndexOf >> - Only use optimization when EnableX86ECoreOpts is true >> - Fix whitespace >> - Merge branch 'openjdk:master' into indexof >> - Comments; added exhaustive-ish test >> - Subtracting 0x10 twice. >> - ... and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2 > > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4111: > >> 4109: if ((UseAVX == 2) && EnableX86ECoreOpts && VM_Version::supports_avx2()) { >> 4110: StubRoutines::_string_indexof = generate_string_indexof(); >> 4111: } > > What is the motivation for this extensive new code only for AVX2? 30% is nice (for some cases), but it is enabled only for AVX2 and not for AVX512, which all modern x86 CPUs have, so the code will not be used. > > Or is it a typo?
This is an acceleration for AVX2, replacing the pcmpestri instruction, which is microcoded on E-cores and has a significant performance impact. I am working on a pared-down implementation and should update this PR in a couple of days. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1454217437 From pchilanomate at openjdk.org Tue Jan 16 23:54:22 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 16 Jan 2024 23:54:22 GMT Subject: RFR: 8320302: compiler/arguments/TestC1Globals.java hits SIGSEGV in ContinuationEntry::set_enter_code Message-ID: <1Cm1SGGaZsYxM-RfPNDeAvnGijQ-taVe438SS6bZ3YI=.d64abde2-f9a0-4589-ab40-eba7c3c02992@github.com> When creating the Continuation.enterSpecial/doYield special native nmethods, we currently assume the call to nmethod::new_native_nmethod() always succeeds. If the CodeCache happens to be full, though, creating the nmethod will fail and we'll hit a SIGSEGV when trying to dereference the return value. Since this happens the first time those methods are resolved, throwing an exception at that point, implying that we cannot run virtual threads, looks odd from a user perspective. So instead I added a step to initialize the Continuation class at startup, so that failure to create those nmethods is treated early as a fatal error, as we do with any other critical resource needed by the VM. I measured the added step to the startup sequence to be ~90us, or about 0.3% of the total startup time. I verified the fix by running the test mentioned in the bug, and I also ran tiers1-5 in mach5.
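The fail-early pattern described above can be illustrated in isolation. The following is a minimal standalone C++ sketch, not the HotSpot change itself: `Stub`, `try_create_stub`, and `create_stub_or_die` are hypothetical names, and the `cache_full` flag merely stands in for a full code cache.

```cpp
// Sketch only: standalone C++, not HotSpot code. All names are hypothetical.
#include <cstdio>
#include <cstdlib>
#include <memory>

struct Stub {
  int id;
};

// Stand-in for an allocation that can fail, e.g. when a code cache is full.
std::unique_ptr<Stub> try_create_stub(bool cache_full) {
  if (cache_full) {
    return nullptr;
  }
  return std::unique_ptr<Stub>(new Stub{1});
}

// Intended to be called once during startup: a missing critical resource is
// reported immediately instead of being dereferenced later on first use.
std::unique_ptr<Stub> create_stub_or_die(bool cache_full) {
  std::unique_ptr<Stub> stub = try_create_stub(cache_full);
  if (stub == nullptr) {
    std::fprintf(stderr, "fatal: could not create required stub at startup\n");
    std::abort();
  }
  return stub;
}
```

The point of the design is that the null check lives in exactly one place, at a time when aborting is an acceptable response, so later users of the stub never see a null pointer.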
Thanks, Patricio ------------- Commit messages: - v1 Changes: https://git.openjdk.org/jdk/pull/17455/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17455&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8320302 Stats: 30 lines in 8 files changed: 22 ins; 8 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17455.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17455/head:pull/17455 PR: https://git.openjdk.org/jdk/pull/17455 From kvn at openjdk.org Wed Jan 17 00:15:52 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 17 Jan 2024 00:15:52 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 23:51:15 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4111: >> >>> 4109: if ((UseAVX == 2) && EnableX86ECoreOpts && VM_Version::supports_avx2()) { >>> 4110: StubRoutines::_string_indexof = generate_string_indexof(); >>> 4111: } >> >> What is the motivation for this extensive new code only for AVX2? 30% is nice (for some cases), but it is enabled only for AVX2 and not for AVX512, which all modern x86 CPUs have, so the code will not be used. >> >> Or is it a typo? > > This is an acceleration for AVX2, replacing the pcmpestri instruction, which is microcoded on E-cores and has a significant performance impact. I am working on a pared-down implementation and should update this PR in a couple of days. Thank you for the explanation. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1454238988 From jiangli at openjdk.org Wed Jan 17 00:22:02 2024 From: jiangli at openjdk.org (Jiangli Zhou) Date: Wed, 17 Jan 2024 00:22:02 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking Message-ID: Please review this PR with a simple solution for resolving the duplicate `Thread` symbol issue.
In the https://github.com/openjdk/jdk/pull/14808 comments, there was an alternative suggestion to redefine the symbol at build time, for example by using `-DThread=HotSpotThread`. That would not address cases where the symbols are referenced as string literals. https://github.com/openjdk/jdk/pull/14808 also discussed using a namespace for hotspot code, which can have multiple benefits/motivations. We could explore using a namespace further once there is more consensus on that approach. Contributed by Chuck Rasbold and @jianglizhou. ------------- Commit messages: - 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking Changes: https://git.openjdk.org/jdk/pull/17456/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17456&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311846 Stats: 10 lines in 4 files changed: 5 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/17456.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17456/head:pull/17456 PR: https://git.openjdk.org/jdk/pull/17456 From dholmes at openjdk.org Wed Jan 17 00:40:59 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 17 Jan 2024 00:40:59 GMT Subject: [jdk22] RFR: 8323243: JNI invocation of an abstract instance method corrupts the stack In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 18:24:44 GMT, Ioi Lam wrote: >> Hi all, >> >> This pull request contains a backport of commit [71d9a83d](https://github.com/openjdk/jdk/commit/71d9a83dece7eb4bdb6ffdd9caf14a1348045ce0) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. >> >> The commit being backported was authored by David Holmes on 14 Jan 2024 and was reviewed by Coleen Phillimore and Aleksey Shipilev. >> >> Thanks! > > Approving it as the backport is clean. Thanks @iklam !
------------- PR Comment: https://git.openjdk.org/jdk22/pull/73#issuecomment-1894742066 From dholmes at openjdk.org Wed Jan 17 00:41:01 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 17 Jan 2024 00:41:01 GMT Subject: [jdk22] Integrated: 8323243: JNI invocation of an abstract instance method corrupts the stack In-Reply-To: References: Message-ID: On Sun, 14 Jan 2024 22:12:17 GMT, David Holmes wrote: > Hi all, > > This pull request contains a backport of commit [71d9a83d](https://github.com/openjdk/jdk/commit/71d9a83dece7eb4bdb6ffdd9caf14a1348045ce0) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by David Holmes on 14 Jan 2024 and was reviewed by Coleen Phillimore and Aleksey Shipilev. > > Thanks! This pull request has now been integrated. Changeset: b40b1882 Author: David Holmes URL: https://git.openjdk.org/jdk22/commit/b40b18823b543a51a11821a0b73717642374b113 Stats: 159 lines in 4 files changed: 159 ins; 0 del; 0 mod 8323243: JNI invocation of an abstract instance method corrupts the stack Reviewed-by: iklam Backport-of: 71d9a83dece7eb4bdb6ffdd9caf14a1348045ce0 ------------- PR: https://git.openjdk.org/jdk22/pull/73 From erikj at openjdk.org Wed Jan 17 01:24:51 2024 From: erikj at openjdk.org (Erik Joelsson) Date: Wed, 17 Jan 2024 01:24:51 GMT Subject: RFR: Merge bf7bd9a16c172bcb5ea6b24717a0429e12e2e3d1 [v2] In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 19:05:44 GMT, Henry Jen wrote: >> CPU24_01 fixes. > > Henry Jen has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: > > - Merge branch 'openjdk:master' into cpu2401 > - 8317547: Enhance TLS connection support > > Reviewed-by: ahgross, rhalade, weijun, valeriep > - 8314307: Improve loop handling > > Co-authored-by: Christian Hagedorn > Co-authored-by: Roland Westrelin > Co-authored-by: Emanuel Peter > Reviewed-by: mschoene, rhalade, thartmann, epeter > - 8318588: Windows build failure after JDK-8314468 due to ambiguous call > > Reviewed-by: epeter > - 8314468: Improve Compiler loops > > Co-authored-by: Dean Long > Reviewed-by: rhalade, mschoene, iveresov, kvn > - 8317331: Solaris build failed with "declaration can not follow a statement (E_DECLARATION_IN_CODE)" > > Backport-of: 852276d1f833d49802693f2a5a82ba6eb2722de6 > - 8314295: Enhance verification of verifier > > Reviewed-by: mschoene, rhalade, dholmes, dlong > - 8308204: Enhanced certificate processing > > Reviewed-by: mschoene, rhalade, jnimeh Marked as reviewed by erikj (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17448#pullrequestreview-1826181933 From erikj at openjdk.org Wed Jan 17 01:25:54 2024 From: erikj at openjdk.org (Erik Joelsson) Date: Wed, 17 Jan 2024 01:25:54 GMT Subject: [jdk22] RFR: Merge c7f1c97312f94b6dd6398a5e98dd0c8b63db4c9b In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 16:31:32 GMT, Henry Jen wrote: > CPU24_01 fixes. Marked as reviewed by erikj (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk22/pull/83#pullrequestreview-1826185706 From henryjen at openjdk.org Wed Jan 17 01:45:03 2024 From: henryjen at openjdk.org (Henry Jen) Date: Wed, 17 Jan 2024 01:45:03 GMT Subject: Integrated: Merge bf7bd9a16c172bcb5ea6b24717a0429e12e2e3d1 In-Reply-To: References: Message-ID: <3HvjxNSJNsiVEgttXj91XvlqSDmGh_ZGJxcDh0qNzgo=.51d3fa63-788a-47db-ad1c-4db8b68aae1b@github.com> On Tue, 16 Jan 2024 16:32:27 GMT, Henry Jen wrote: > CPU24_01 fixes. This pull request has now been integrated. 
Changeset: 2063bb8f Author: Henry Jen URL: https://git.openjdk.org/jdk/commit/2063bb8ffabd6096f547ec6da979cfcf68a56ba3 Stats: 736 lines in 16 files changed: 476 ins; 65 del; 195 mod Merge Reviewed-by: erikj ------------- PR: https://git.openjdk.org/jdk/pull/17448 From henryjen at openjdk.org Wed Jan 17 01:45:28 2024 From: henryjen at openjdk.org (Henry Jen) Date: Wed, 17 Jan 2024 01:45:28 GMT Subject: [jdk22] RFR: Merge c7f1c97312f94b6dd6398a5e98dd0c8b63db4c9b [v2] In-Reply-To: References: Message-ID: > CPU24_01 fixes. Henry Jen has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. ------------- Changes: - all: https://git.openjdk.org/jdk22/pull/83/files - new: https://git.openjdk.org/jdk22/pull/83/files/c7f1c973..c7f1c973 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk22&pr=83&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk22&pr=83&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk22/pull/83.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/83/head:pull/83 PR: https://git.openjdk.org/jdk22/pull/83 From henryjen at openjdk.org Wed Jan 17 01:45:29 2024 From: henryjen at openjdk.org (Henry Jen) Date: Wed, 17 Jan 2024 01:45:29 GMT Subject: [jdk22] Integrated: Merge c7f1c97312f94b6dd6398a5e98dd0c8b63db4c9b In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 16:31:32 GMT, Henry Jen wrote: > CPU24_01 fixes. This pull request has now been integrated. 
Changeset: b2cc1890 Author: Henry Jen URL: https://git.openjdk.org/jdk22/commit/b2cc1890ff4d2e5404e153ecba5e83f1bcdd6fa7 Stats: 736 lines in 16 files changed: 476 ins; 65 del; 195 mod Merge Reviewed-by: erikj ------------- PR: https://git.openjdk.org/jdk22/pull/83 From fyang at openjdk.org Wed Jan 17 03:40:54 2024 From: fyang at openjdk.org (Fei Yang) Date: Wed, 17 Jan 2024 03:40:54 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v3] In-Reply-To: References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> Message-ID: <_o4Y-fa-_MRTwlXoH78WjpWIszSDASOjUdDi8IchLLk=.76728f9c-1526-4d97-ace5-007c7a3a71c3@github.com> On Tue, 16 Jan 2024 09:56:46 GMT, Aleksey Shipilev wrote: >> This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. >> >> Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. 
>> >> Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: >> - ARM32: 32 -> 64 bytes :( >> - AArch64: 128 -> 64 bytes :) >> - x86_64: 64 -> 64 bytes :| >> - x86_32: 32 -> 64 bytes :( >> - PPC64: 512 -> 128 bytes :)) >> - RISC-V: 128 -> 64 bytes :) >> - S390X: 128 -> 256 bytes :( >> - Zero: >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Cautious > - Merge branch 'master' into JDK-8321137-reconsider-icstub-align > - Inline new_ic_stub > - Work LGTM. Also performed tier1-3 test with fastdebug build on linux-riscv64 platform. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17277#pullrequestreview-1826417911 From roland at openjdk.org Wed Jan 17 08:35:54 2024 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 17 Jan 2024 08:35:54 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v16] In-Reply-To: References: <_VewIaCJievOloJHJsAnzRIjM9Q95MGmH0QV8n3Fwts=.93509e46-90a0-4b20-a90d-30c1295016d8@github.com> Message-ID: On Mon, 15 Jan 2024 10:23:21 GMT, Emanuel Peter wrote: >> @rwestrel I still believe this is safe. But maybe also ugly. >> >> I looked into making the locking more fine-grained, so that we could avoid unlocking the lock temporarily. >> The biggest problem is in `ciMethodData::load_remaining_extra_data`. Here we first (iteratively) clean, and then assume that we still hold the lock when we copy it for the `ciMethodData`. 
Hence, it seems the lock has to be held at this outer scope, but then temporarily unlocked to allow calls to `get_method` in `PrepareExtraDataClosure::finish`. > > Alternatives to make it prettier: > Make `prepare_metadata` lock, and pass out an object that holds that lock, i.e. widen the scope of the `MutexLocker`. Maybe this can be done with return-value-optimization? But I'm not sure this is a great idea. Another idea @chhagedorn and I thought about was having some Locker object that you can call lock/unlock on, repeatedly. But once the Locker goes out of scope, it checks if it is in the locked state, and only unlocks then. Or maybe it asserts that it is in the locked state, and then unlocks. > > Because essentially we need to allow the retry-logic to unlock in between tries. But we also still need to access the `uncached_methods` array that is filled inside the locked region. > > I'm not sure such a refactoring is worth it. Let me know what you think @rwestrel Given that this logic already existed, I think you can leave it as is, but a comment that explains why it is safe would be useful. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1454939597 From roland at openjdk.org Wed Jan 17 08:35:54 2024 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 17 Jan 2024 08:35:54 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v18] In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 12:35:59 GMT, Emanuel Peter wrote: >> As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken.
>> >> I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. >> >> I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. >> >> I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. >> >> **Complications with ttyl** >> There were a few places in printing code where we did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high rank. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. >> >> If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. >> >> **Testing** >> Testing: tier1-3 and stress. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > NoSafepointMutexLocker Looks good to me. Thanks for making the changes. Do we want to file a bug to refactor the code? ------------- Marked as reviewed by roland (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16840#pullrequestreview-1826749218 From tschatzl at openjdk.org Wed Jan 17 08:54:03 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 17 Jan 2024 08:54:03 GMT Subject: RFR: 8317440: Lock rank checking fails when code root set is modified with the Servicelock held after JDK-8315503 In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 21:14:18 GMT, Coleen Phillimore wrote: > Going with variant 2. So a new lock rank issue has been found? *sigh* ------------- PR Comment: https://git.openjdk.org/jdk/pull/16062#issuecomment-1895354216 From epeter at openjdk.org Wed Jan 17 09:26:15 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 17 Jan 2024 09:26:15 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v19] In-Reply-To: References: Message-ID: <8sr3y54p6pr5Fp4rq9DUIYUbmp92XJjxQqGcOY91kv8=.ea8c9500-476f-4fea-b4cb-debdaa2820c3@github.com> > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. 
> > **Complications with ttyl** > There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: improved comment for Roland ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/78a2cdb6..4dbfe9a7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=18 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=17-18 Stats: 7 lines in 1 file changed: 4 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From epeter at openjdk.org Wed Jan 17 09:26:17 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 17 Jan 2024 09:26:17 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v18] In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 08:33:14 GMT, Roland Westrelin wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> NoSafepointMutexLocker > > Looks good to me. Thanks for making the changes. 
Do we want to file a bug to refactor the code? @rwestrel thanks for the review! I pushed a comment improvement. What kind of refactoring are you thinking about? Just to remove the `MutexUnlocker` occurrence, or something bigger? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16840#issuecomment-1895409690 From aph at openjdk.org Wed Jan 17 09:31:51 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 17 Jan 2024 09:31:51 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v3] In-Reply-To: References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> Message-ID: On Tue, 16 Jan 2024 09:56:46 GMT, Aleksey Shipilev wrote: >> This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. >> >> Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. >> >> Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: >> - ARM32: 32 -> 64 bytes :( >> - AArch64: 128 -> 64 bytes :) >> - x86_64: 64 -> 64 bytes :| >> - x86_32: 32 -> 64 bytes :( >> - PPC64: 512 -> 128 bytes :)) >> - RISC-V: 128 -> 64 bytes :) >> - S390X: 128 -> 256 bytes :( >> - Zero: >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Cautious > - Merge branch 'master' into JDK-8321137-reconsider-icstub-align > - Inline new_ic_stub > - Work Marked as reviewed by aph (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17277#pullrequestreview-1826855404 From stuefe at openjdk.org Wed Jan 17 10:08:59 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 17 Jan 2024 10:08:59 GMT Subject: RFR: JDK-8314890: Reduce number of loads for Klass decoding in static code [v12] In-Reply-To: References: Message-ID: On Wed, 15 Nov 2023 14:50:48 GMT, Thomas Stuefe wrote: >> Small change that reduces the number of loads generated by the C++ compiler for a narrow Klass decoding operation (`CompressedKlassPointers::decode_xxx()`. >> >> Stock: three loads (with two probably sharing a cache line) - UseCompressedClassPointers, encoding base and shift. >> >> >> 8b7b62: 48 8d 05 7f 1b c3 00 lea 0xc31b7f(%rip),%rax # 14e96e8 >> 8b7b69: 0f b6 00 movzbl (%rax),%eax >> 8b7b6c: 84 c0 test %al,%al >> 8b7b6e: 0f 84 9c 00 00 00 je 8b7c10 <_ZN10HeapRegion14object_iterateEP13ObjectClosure+0x260> >> 8b7b74: 48 8d 15 05 62 c6 00 lea 0xc66205(%rip),%rdx # 151dd80 <_ZN23CompressedKlassPointers6_shiftE> >> 8b7b7b: 8b 7b 08 mov 0x8(%rbx),%edi >> 8b7b7e: 8b 0a mov (%rdx),%ecx >> 8b7b80: 48 8d 15 01 62 c6 00 lea 0xc66201(%rip),%rdx # 151dd88 <_ZN23CompressedKlassPointers5_baseE> >> 8b7b87: 48 d3 e7 shl %cl,%rdi >> 8b7b8a: 48 03 3a add (%rdx),%rdi >> >> >> Patched: one load loads all three. Since shift occupies the lowest 8 bits, compiled code uses 8bit register; ditto the UseCompressedOops flag. >> >> >> 8ba302: 48 8d 05 97 9c c2 00 lea 0xc29c97(%rip),%rax # 14e3fa0 <_ZN23CompressedKlassPointers6_comboE> >> 8ba309: 48 8b 08 mov (%rax),%rcx >> 8ba30c: f6 c5 01 test $0x1,%ch # use compressed klass pointers? 
>> 8ba30f: 0f 84 9b 00 00 00 je 8ba3b0 <_ZN10HeapRegion14object_iterateEP13ObjectClosure+0x260> >> 8ba315: 8b 7b 08 mov 0x8(%rbx),%edi >> 8ba318: 48 d3 e7 shl %cl,%rdi # shift >> 8ba31b: 66 31 c9 xor %cx,%cx # zero out lower 16 bits of base >> 8ba31e: 48 01 cf add %rcx,%rdi # add base >> 8ba321: 8b 4f 08 mov 0x8(%rdi),%ecx >> >> --- >> >> Performance measurements: >> >> G1, doing a full GC over a heap filled with 256 million live j.l.Object instances. >> >> I see a reduction of Full Pause times between 1.2% and 5%. I am unsure how reliable these numbers are since, despite my efforts (running tests on isolated CPUs etc.), the standard deviation was quite high at ±4%. Still, in general, numbers seemed to go down rather than up. >> >> --- >> >> Future extensions: >> >> This patch uses the fact that the encoding base is aligned to metaspace reser... > > Thomas Stuefe has updated the pull request incrementally with three additional commits since the last revision: > > - Update src/hotspot/share/oops/compressedKlass.cpp > > Co-authored-by: Aleksey Shipilëv > - Update src/hotspot/share/oops/compressedKlass.cpp > > Co-authored-by: Aleksey Shipilëv > - Update src/hotspot/share/oops/compressedKlass.hpp > > Co-authored-by: Aleksey Shipilëv I have not forgotten this PR, but I am currently revamping parts of it for Lilliput, which takes precedence. @rkennke > (And regarding the bitfield discussion - I would like to handle object header fields as bitfield, eventually. Not sure if it makes sense though or if we ever get there...) Yes. I wonder whether this would result in better code generated by the C++ compiler, since it could mean using e.g. word-sized moves instead of 64 bit moves with manual shifting. I may be wrong though.
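For readers following the quoted patch description, here is a minimal sketch of the "one load loads all three" idea. The layout is only inferred from the annotated disassembly quoted above (shift in the lowest 8 bits, the compressed-class-pointers flag in bit 8, and an encoding base whose lower 16 bits are zero). The `ComboSketch` type and its member names are hypothetical and are not the code in `compressedKlass.hpp`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packed word, assuming this layout:
//   bits 0..7   : shift
//   bit  8      : "use compressed class pointers" flag
//   bits 16..63 : encoding base (base must have its lower 16 bits clear)
struct ComboSketch {
    std::uint64_t bits;

    static ComboSketch make(std::uint64_t base, unsigned shift, bool use_compressed) {
        assert((base & 0xFFFFu) == 0);  // base must leave the low 16 bits free
        assert(shift < 64);
        return ComboSketch{base | (std::uint64_t(use_compressed) << 8) | shift};
    }

    bool use_compressed() const { return ((bits >> 8) & 1u) != 0; }
    unsigned shift() const { return unsigned(bits & 0xFFu); }
    std::uint64_t base() const { return bits & ~std::uint64_t(0xFFFF); }

    // Decode a narrow value: all three inputs come from the single word 'bits'.
    std::uint64_t decode(std::uint32_t narrow) const {
        return base() + (std::uint64_t(narrow) << shift());
    }
};
```

The point of the sketch is only that the flag, the shift and the base can be recovered from one 64-bit value with masks, so a decoder needs a single memory load instead of three.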
------------- PR Comment: https://git.openjdk.org/jdk/pull/15389#issuecomment-1895484689 From aph at openjdk.org Wed Jan 17 10:09:50 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 17 Jan 2024 10:09:50 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 00:14:58 GMT, Jiangli Zhou wrote: > Please review this PR with a simple solution for resolving duplicate `Thread` symbol issue. In https://github.com/openjdk/jdk/pull/14808 comments, there was an alternative suggestion to redefine the symbol at build time, such as using `-DThread=HotSpotThread`. That would not address issues when symbols were referenced as string literals. https://github.com/openjdk/jdk/pull/14808 also discussed using namespace for hotspot code, which can have multiple benefits/motivations. We could explore further using namespace with more consensus on that approach. > > Contributed by Chuck Rasbold and @jianglizhou. Hooboy, this is an ugly solution, with some nasty side effects such as confusing error messages for developers and a very confusing debugger experience. Let's try to find a solution with a smaller blast radius. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1895486108 From kbarrett at openjdk.org Wed Jan 17 10:25:54 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 17 Jan 2024 10:25:54 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 15:52:07 GMT, Kim Barrett wrote: > gcc: https://gcc.gnu.org/gcc-9/changes.html > "The C++17 implementation is no longer experimental." Bumping to gcc10 rather than gcc9 would have the benefit that we could get a work-alike for C++20 `std::is_constant_evaluated` even though we're not otherwise using C++20. Sufficiently recent versions of all of our supported compilers provide `__builtin_is_constant_evaluated`. That first shows up in gcc10.
We're already requiring versions of clang and msvc that provide it. There are a bunch of potential improvements we could make by having such a work-alike. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1895511862 From aph at openjdk.org Wed Jan 17 10:32:54 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 17 Jan 2024 10:32:54 GMT Subject: RFR: JDK-8314890: Reduce number of loads for Klass decoding in static code [v12] In-Reply-To: <8AeUTRxo3TXnyHL098Hif37yZEuq9uX-MhGufePs5r4=.d888119c-02f0-4558-96a2-780e6d9eeddf@github.com> References: <8AeUTRxo3TXnyHL098Hif37yZEuq9uX-MhGufePs5r4=.d888119c-02f0-4558-96a2-780e6d9eeddf@github.com> Message-ID: On Thu, 11 Jan 2024 19:43:17 GMT, Roman Kennke wrote: >> Thomas Stuefe has updated the pull request incrementally with three additional commits since the last revision: >> >> - Update src/hotspot/share/oops/compressedKlass.cpp >> >> Co-authored-by: Aleksey Shipilëv >> - Update src/hotspot/share/oops/compressedKlass.cpp >> >> Co-authored-by: Aleksey Shipilëv >> - Update src/hotspot/share/oops/compressedKlass.hpp >> >> Co-authored-by: Aleksey Shipilëv > > Looks good to me! Thank you! > (And regarding the bitfield discussion - I *would* like to handle object header fields as bitfield, eventually. Not sure if it makes sense though or if we ever get there...) > @rkennke > > > (And regarding the bitfield discussion - I would like to handle object header fields as bitfield, eventually. Not sure if it makes sense though or if we ever get there...) > > Yes. I wonder whether this would result in better code generated by the C++ compiler, I doubt it very much. FYI, I believe that every compiler recognizes the canonical bitfield extract `(x << n) >> m` and some e.g. AArch64 have a single-instruction for that.
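As an illustration of the shift idiom mentioned above, here is a small sketch assuming an unsigned 64-bit word. The function name `extract_bits` and its parameters `pos` and `width` are made up for this example. For a field of `width` bits starting at bit `pos`, the two shift counts are `n = 64 - (pos + width)` and `m = 64 - width`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: extract 'width' bits starting at bit 'pos' (bit 0 is the
// least significant bit) using the (x << n) >> m idiom.
// Precondition: width >= 1 and pos + width <= 64.
constexpr std::uint64_t extract_bits(std::uint64_t x, unsigned pos, unsigned width) {
    const unsigned n = 64 - (pos + width);  // push the bits above the field out of the word
    const unsigned m = 64 - width;          // bring the field down to bit 0
    return (x << n) >> m;
}

// Byte 1 of 0xABCD1234 is 0x12.
static_assert(extract_bits(0xABCD1234ULL, 8, 8) == 0x12, "middle byte");
```

With `pos = 32` and `width = 32` the left shift count is zero and the expression reduces to `x >> 32`, which is the same operation as the `value >> 32` hash accessor discussed later in this thread.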
------------- PR Comment: https://git.openjdk.org/jdk/pull/15389#issuecomment-1895522679 From ihse at openjdk.org Wed Jan 17 11:21:59 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 17 Jan 2024 11:21:59 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 13:23:45 GMT, Julian Waters wrote: >> Compile the JDK as C++17, enabling the use of all C++17 language features > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Require clang 13 in toolchain.m4 We have been stuck on a very old gcc for a long time, due to various reasons. Partly because old gcc versions were not as terrible as old versions of cl.exe, and partly to support odd linux platforms where newer gcc versions were not available. It is tempting to raise the bar to get better functionality available on all platforms. In the end, it is a balance between supporting older platforms, and getting a better common language level for the code. gcc 10 was released 3+ years ago. I guess that is good enough to consider it a reasonable new minimum. I will make a separate announcement on the build-dev list to draw attention to the fact that we want to raise minimum compiler versions, which might not be apparent from the title of this PR, to give folks a better chance at voicing concerns. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1895603901 From amitkumar at openjdk.org Wed Jan 17 12:12:15 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 17 Jan 2024 12:12:15 GMT Subject: RFR: 8315750: Update subtype check profile collection on PPC following 8308869 Message-ID: s390x Implementation for https://github.com/openjdk/jdk/pull/14375

Benchmark Result with patch:

Benchmark                                         (typePollution)  (typePollutionNotInternalType)   Mode  Cnt     Score     Error  Units
RequireNonNullCheckcastScalability.isDuplicated1            false                           false  thrpt   20  1155.409 ±  43.844  ops/us
RequireNonNullCheckcastScalability.isDuplicated1            false                            true  thrpt   20   726.923 ±  54.536  ops/us
RequireNonNullCheckcastScalability.isDuplicated1             true                           false  thrpt   20   676.462 ±  23.503  ops/us
RequireNonNullCheckcastScalability.isDuplicated1             true                            true  thrpt   20   118.650 ±   2.653  ops/us

Without Patch:

Benchmark                                         (typePollution)  (typePollutionNotInternalType)   Mode  Cnt     Score     Error  Units
RequireNonNullCheckcastScalability.isDuplicated1            false                           false  thrpt   20  1101.248 ± 103.559  ops/us
RequireNonNullCheckcastScalability.isDuplicated1            false                            true  thrpt   20   109.690 ±   3.312  ops/us
RequireNonNullCheckcastScalability.isDuplicated1             true                           false  thrpt   20   110.790 ±   7.927  ops/us
RequireNonNullCheckcastScalability.isDuplicated1             true                            true  thrpt   20   112.244 ±   6.889  ops/us

Testing: Fastdebug build + tier1 tests ------------- Commit messages: - s390 Port Changes: https://git.openjdk.org/jdk/pull/17461/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17461&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8315750 Stats: 116 lines in 4 files changed: 20 ins; 67 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/17461.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17461/head:pull/17461 PR: https://git.openjdk.org/jdk/pull/17461 From shade at openjdk.org Wed Jan 17 12:12:52 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 17 Jan 2024 12:12:52 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v3] In-Reply-To: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> References: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> Message-ID: On Tue, 16 Jan 2024 09:01:35 GMT, Aleksey Shipilev wrote: >> Since recent work to improve `tier4` performance, we actually test `tier{1,2,3,4}` often, which includes all the tests in current tree. It would be more convenient to just have the `all` test group/alias, so that we can do `make test TEST=all`.
This also gives a parallelism / run time benefit, as we do not wait for tests in each tier to complete before moving to next tier. >> >> Sample run on out-of-the-box Linux x86_64 fastdebug is below. For some environments one also needs to supply a few keywords like `!printer` to skip tests that cannot complete without failure due to misconfiguration. I left the keywords as is to show how would a failing run look. There is also an existing shortcut in build system that allows to run this with `make test-all`. >> >> >> % make test TEST=all >> >> Test selection 'all', will run: >> * jtreg:test/hotspot/jtreg:all >> * jtreg:test/jdk:all >> * jtreg:test/langtools:all >> * jtreg:test/jaxp:all >> * jtreg:test/lib-test:all >> >> (...about 6 hours later...) >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >>>> jtreg:test/hotspot/jtreg:all 6731 6702 29 0 << >>>> jtreg:test/jdk:all 9962 9951 11 0 << >> jtreg:test/langtools:all 4469 4469 0 0 >> jtreg:test/jaxp:all 513 513 0 0 >> jtreg:test/lib-test:all 32 32 0 0 >> ============================== >> TEST FAILURE > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Catch-all -> All tests Any other reviews needed for this? Nominally, this changes the test groups in langtools, so maybe @lahodaj or @biboudis want to take a look. For jaxp, @JoeWang-Java, maybe? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17422#issuecomment-1895680732 From stuefe at openjdk.org Wed Jan 17 12:37:56 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 17 Jan 2024 12:37:56 GMT Subject: RFR: JDK-8314890: Reduce number of loads for Klass decoding in static code [v12] In-Reply-To: <8AeUTRxo3TXnyHL098Hif37yZEuq9uX-MhGufePs5r4=.d888119c-02f0-4558-96a2-780e6d9eeddf@github.com> References: <8AeUTRxo3TXnyHL098Hif37yZEuq9uX-MhGufePs5r4=.d888119c-02f0-4558-96a2-780e6d9eeddf@github.com> Message-ID: On Thu, 11 Jan 2024 19:43:17 GMT, Roman Kennke wrote: >> Thomas Stuefe has updated the pull request incrementally with three additional commits since the last revision: >> >> - Update src/hotspot/share/oops/compressedKlass.cpp >> >> Co-authored-by: Aleksey Shipilëv >> - Update src/hotspot/share/oops/compressedKlass.cpp >> >> Co-authored-by: Aleksey Shipilëv >> - Update src/hotspot/share/oops/compressedKlass.hpp >> >> Co-authored-by: Aleksey Shipilëv > > Looks good to me! Thank you! > (And regarding the bitfield discussion - I *would* like to handle object header fields as bitfield, eventually. Not sure if it makes sense though or if we ever get there...) > > @rkennke > > > (And regarding the bitfield discussion - I would like to handle object header fields as bitfield, eventually. Not sure if it makes sense though or if we ever get there...) > > > > > > Yes. I wonder whether this would result in better code generated by the C++ compiler, > > I doubt it very much. FYI, I believe that every compiler recognizes the canonical bitfield extract `(x << n) >> m` and some e.g. AArch64 have a single-instruction for that. That's not what I meant. I meant that if the compiler recognizes that the extracted data is word-sized and resides at a word boundary, it could use a word-sized load instead of loading a 64-bit value and then shifting or bitfield-extracting. But it seems to do that already.
I tried three variants in godbolt and all give me the same word-sized loads (GCC ARM64):

    struct MarkWord_1 {
        unsigned long long value;
        unsigned get_hash() const { return value >> 32; }
    };

    struct MarkWord_2 {
        unsigned other_stuff;
        unsigned hash;
        unsigned get_hash() const { return hash; }
    };

    struct MarkWord_3 {
        unsigned other_stuff: 32;
        unsigned hash: 32;
    };

    unsigned get_hash_1(struct MarkWord_1* p) { return p->get_hash(); }
    unsigned get_hash_2(struct MarkWord_2* p) { return p->get_hash(); }
    unsigned get_hash_3(struct MarkWord_3* p) { return p->hash; }

results in

    get_hash_1(MarkWord_1*):
        ldr w0, [x0, #4]
        ret
    get_hash_2(MarkWord_2*):
        ldr w0, [x0, #4]
        ret
    get_hash_3(MarkWord_3*):
        ldr w0, [x0, #4]
        ret

So, bitfields are good for code readability but probably won't improve code quality. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15389#issuecomment-1895721227 From shade at openjdk.org Wed Jan 17 12:42:55 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 17 Jan 2024 12:42:55 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 13:23:45 GMT, Julian Waters wrote: >> Compile the JDK as C++17, enabling the use of all C++17 language features > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Require clang 13 in toolchain.m4 For me, there is a huge question on portable JDK builds, which are usually built with the lowest GCC toolchain possible to avoid GLIBC incompatibilities. I am pretty sure currently built portable builds are _not_ riding as high as GCC 10. Looks like the majority of C++17 features were implemented in GCC 6 and GCC 7: https://gcc.gnu.org/projects/cxx-status.html#cxx17, and as Kim mentions, C++17 is no longer experimental for GCC 9. So maybe we should not rush GCC 10, and at most have GCC 9 as minimum.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1895729165 From fbredberg at openjdk.org Wed Jan 17 12:46:53 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 17 Jan 2024 12:46:53 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: Message-ID: <_rjhqI6cYZAWvwIvX76-ERmCCxL42ij_FnFFbFap30k=.be2f9c27-1a09-4e0a-8f4e-be7ccba6f597@github.com> On Mon, 15 Jan 2024 16:25:08 GMT, Fredrik Bredberg wrote: > The Java option OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none". > > However some CPUs change the default SpinPause instruction to something better if the user hasn't used the OnSpinWaitInst option. For instance if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs. > > This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb". > > Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64. When I was browsing the interweb I saw that it's not uncommon to use isb instead of yield while spinning on AArch64. Before jumping on the bandwagon I created a test program to measure how long it takes to issue a large number of instructions from several threads running in parallel. I tested nop, yield and isb on Apple's M1, M2 and M3 CPUs. The yield instruction doesn't take longer to execute than a nop instruction (in fact it takes less time than nop). However isb always takes significantly longer to run than nop or yield on all of the above mentioned Apple CPUs.
This finding combined with the fact that the JVM today uses isb as default for Neoverse CPUs, justified the use of isb on Apple's M1-M3 CPUs. But I do agree with both @theRealAph and @stooart-mon, isb is not intended for this purpose. It might create a delay that is too long for spinning purposes and applications overall won't necessarily show any benefit from isb vs yield. Maybe the most reasonable way forward is to only change the default value of OnSpinWaitInst from "none" to "yield" and NOT change it to "isb" for Apple CPUs. After all, that would make us use the "correct" spinning instruction on all AArch64 CPUs (except Neoverse). ------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1895736557 From shade at openjdk.org Wed Jan 17 12:48:57 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 17 Jan 2024 12:48:57 GMT Subject: RFR: JDK-8314890: Reduce number of loads for Klass decoding in static code [v12] In-Reply-To: References: Message-ID: On Wed, 15 Nov 2023 14:50:48 GMT, Thomas Stuefe wrote: >> Small change that reduces the number of loads generated by the C++ compiler for a narrow Klass decoding operation (`CompressedKlassPointers::decode_xxx()`. >> >> Stock: three loads (with two probably sharing a cache line) - UseCompressedClassPointers, encoding base and shift. >> >> >> 8b7b62: 48 8d 05 7f 1b c3 00 lea 0xc31b7f(%rip),%rax # 14e96e8 >> 8b7b69: 0f b6 00 movzbl (%rax),%eax >> 8b7b6c: 84 c0 test %al,%al >> 8b7b6e: 0f 84 9c 00 00 00 je 8b7c10 <_ZN10HeapRegion14object_iterateEP13ObjectClosure+0x260> >> 8b7b74: 48 8d 15 05 62 c6 00 lea 0xc66205(%rip),%rdx # 151dd80 <_ZN23CompressedKlassPointers6_shiftE> >> 8b7b7b: 8b 7b 08 mov 0x8(%rbx),%edi >> 8b7b7e: 8b 0a mov (%rdx),%ecx >> 8b7b80: 48 8d 15 01 62 c6 00 lea 0xc66201(%rip),%rdx # 151dd88 <_ZN23CompressedKlassPointers5_baseE> >> 8b7b87: 48 d3 e7 shl %cl,%rdi >> 8b7b8a: 48 03 3a add (%rdx),%rdi >> >> >> Patched: one load loads all three. 
Since shift occupies the lowest 8 bits, compiled code uses 8bit register; ditto the UseCompressedOops flag. >> >> >> 8ba302: 48 8d 05 97 9c c2 00 lea 0xc29c97(%rip),%rax # 14e3fa0 <_ZN23CompressedKlassPointers6_comboE> >> 8ba309: 48 8b 08 mov (%rax),%rcx >> 8ba30c: f6 c5 01 test $0x1,%ch # use compressed klass pointers? >> 8ba30f: 0f 84 9b 00 00 00 je 8ba3b0 <_ZN10HeapRegion14object_iterateEP13ObjectClosure+0x260> >> 8ba315: 8b 7b 08 mov 0x8(%rbx),%edi >> 8ba318: 48 d3 e7 shl %cl,%rdi # shift >> 8ba31b: 66 31 c9 xor %cx,%cx # zero out lower 16 bits of base >> 8ba31e: 48 01 cf add %rcx,%rdi # add base >> 8ba321: 8b 4f 08 mov 0x8(%rdi),%ecx >> >> --- >> >> Performance measurements: >> >> G1, doing a full GC over a heap filled with 256 million live j.l.Object instances. >> >> I see a reduction of Full Pause times between 1.2% and 5%. I am unsure how reliable these numbers are since, despite my efforts (running tests on isolated CPUs etc.), the standard deviation was quite high at ±4%. Still, in general, numbers seemed to go down rather than up. >> >> --- >> >> Future extensions: >> >> This patch uses the fact that the encoding base is aligned to metaspace reser... > > Thomas Stuefe has updated the pull request incrementally with three additional commits since the last revision: > > - Update src/hotspot/share/oops/compressedKlass.cpp > > Co-authored-by: Aleksey Shipilëv > - Update src/hotspot/share/oops/compressedKlass.cpp > > Co-authored-by: Aleksey Shipilëv > - Update src/hotspot/share/oops/compressedKlass.hpp > > Co-authored-by: Aleksey Shipilëv > (And regarding the bitfield discussion - I _would_ like to handle object header fields as bitfield, eventually. Not sure if it makes sense though or if we ever get there...)
Needed to remind myself that bitfields are actually funky under C++ memory model: the adjacent bitfields of non-zero length are basically taken as a [single memory location](https://en.cppreference.com/w/cpp/language/memory_model#Memory_location), which raises all sort of questions what happens under concurrent access to them. Let's avoid walking into that mess :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/15389#issuecomment-1895739624 From shade at openjdk.org Wed Jan 17 12:52:07 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 17 Jan 2024 12:52:07 GMT Subject: RFR: 8321137: Reconsider ICStub alignment [v3] In-Reply-To: References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> Message-ID: On Tue, 16 Jan 2024 09:56:46 GMT, Aleksey Shipilev wrote: >> This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. >> >> Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. 
>> >> Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: >> - ARM32: 32 -> 64 bytes :( >> - AArch64: 128 -> 64 bytes :) >> - x86_64: 64 -> 64 bytes :| >> - x86_32: 32 -> 64 bytes :( >> - PPC64: 512 -> 128 bytes :)) >> - RISC-V: 128 -> 64 bytes :) >> - S390X: 128 -> 256 bytes :( >> - Zero: >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Cautious > - Merge branch 'master' into JDK-8321137-reconsider-icstub-align > - Inline new_ic_stub > - Work OK, thanks! I am going to merge it, and look for follow-ups, if any. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17277#issuecomment-1895742298 From shade at openjdk.org Wed Jan 17 12:52:09 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 17 Jan 2024 12:52:09 GMT Subject: Integrated: 8321137: Reconsider ICStub alignment In-Reply-To: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> References: <0EUZYQkWKTxkqUoBLat4SkZSWB9BzdnpvY1RDbk9u8k=.44c519c6-9939-4eab-9d3f-e4c0dabc992d@github.com> Message-ID: On Fri, 5 Jan 2024 11:32:03 GMT, Aleksey Shipilev wrote: > This continues from #16911. It initially started as performance optimization to compact `ICStubs`, but I think the safety arguments for fitting the `ICStub` per instruction cache line prevails. See bug and previous PR for more gory details. The footprint improvements on some architectures come as side-effect of untying the `ICStub` size from `CodeEntryAlignment` to (sometimes lower) cache line size. 
> > Note that the size of `ICStub` is important, because `ICBuffer` is small (10K by default), and its depletion causes the `ICBufferFull` safepoint. I would make a (separate) argument to bump the default `ICBuffer` size a bit to make it less important. > > Current patch affects `ICStub` size in different ways on different platforms, since current size is effectively 2x`CodeEntryAlignment` and new size is cache line size: > - ARM32: 32 -> 64 bytes :( > - AArch64: 128 -> 64 bytes :) > - x86_64: 64 -> 64 bytes :| > - x86_32: 32 -> 64 bytes :( > - PPC64: 512 -> 128 bytes :)) > - RISC-V: 128 -> 64 bytes :) > - S390X: 128 -> 256 bytes :( > - Zero: > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` > - [x] Linux AArch64 server fastdebug, `tier{1,2,3,4}` This pull request has now been integrated. Changeset: 7be9f1d0 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/7be9f1d0540907f82800e717389bc3c2da3a8805 Stats: 74 lines in 6 files changed: 39 ins; 16 del; 19 mod 8321137: Reconsider ICStub alignment Reviewed-by: dlong, eosterlund, mdoerr, fyang, aph ------------- PR: https://git.openjdk.org/jdk/pull/17277 From erikj at openjdk.org Wed Jan 17 13:47:58 2024 From: erikj at openjdk.org (Erik Joelsson) Date: Wed, 17 Jan 2024 13:47:58 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 12:39:43 GMT, Aleksey Shipilev wrote: > For me, there is a huge question on portable JDK builds, which are usually built with the lowest GCC toolchain possible to avoid GLIBC incompatibilities. I am pretty sure currently built portable builds are _not_ riding as high as GCC 10. I'm not sure if you are referring to something else, but Oracle's builds of the JDK are intended to be portable and we are currently on GCC 13.2.0. There is no need to keep the GCC version low for portable builds, just the libs in the sysroot (which is where the GLIBC dependency is decided). 
We statically link libstdc++ and libgcc to avoid incompatibilities from GCC. This is why we use "devkits", to be able to combine a modern GCC with the lowest denominator sysroot that we need for our support matrix. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1895844654 From eastigeevich at openjdk.org Wed Jan 17 15:05:51 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 17 Jan 2024 15:05:51 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: <_rjhqI6cYZAWvwIvX76-ERmCCxL42ij_FnFFbFap30k=.be2f9c27-1a09-4e0a-8f4e-be7ccba6f597@github.com> References: <_rjhqI6cYZAWvwIvX76-ERmCCxL42ij_FnFFbFap30k=.be2f9c27-1a09-4e0a-8f4e-be7ccba6f597@github.com> Message-ID: <00KH-IYiXMv5YGxHhUs-lWKBdRDt9h5iAA6aeX0JwS4=.5564cfcb-8283-486e-b7b9-558cad834331@github.com> On Wed, 17 Jan 2024 12:44:00 GMT, Fredrik Bredberg wrote: > Maybe the most reasonable way forward is to only change the default value of OnSpinWaitInst from "none" to "yield" and NOT change it to "isb" for Apple CPUs. Do we have anyone from Apple who can suggest a spin pause implementation? As no real cases for Apple CPUs exists, just microbenchmarks, choosing `isb` might be premature. IMO, even without real cases I would have chosen `isb` if it had a similar latency as the Intel `pause`. > We could execute a bunch of UDIV instructions with a loop-carried dependency, or cycle an xor-shift generator. That could be made to delay for any number of clock cycles, so we can delay without the side effects of an ISB. This approach is not power efficient. In case of Neoverse `isb` have shown to use less power than any instruction executing on the CPU back-end. If Apple CPUs have the similar `isb` behaviour, it would be a reason to use `isb`. 
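
For anyone following the thread who wants something concrete to reason about, here is a minimal sketch of a bounded spin loop with a pause hint. It is not HotSpot's `SpinPause()`: `spin_hint`, `SpinLockSketch` and `try_lock_spinning` are hypothetical names, the GCC/Clang inline-asm syntax is assumed, and the per-architecture instruction choice (`yield` on AArch64, `pause` on x86) is only an illustration, not a recommendation for any particular CPU.

```cpp
#include <atomic>

// Hypothetical helper (not HotSpot's SpinPause()): emit a spin-wait hint.
static inline void spin_hint() {
#if defined(__aarch64__)
  __asm__ volatile("yield" ::: "memory");  // replace with "isb" to compare the two
#elif defined(__x86_64__) || defined(__i386__)
  __asm__ volatile("pause" ::: "memory");
#else
  std::atomic_signal_fence(std::memory_order_seq_cst);  // compiler barrier only
#endif
}

// Hypothetical test-and-test-and-set lock with a bounded spin.
class SpinLockSketch {
  std::atomic<bool> _locked{false};

 public:
  // Make up to max_attempts attempts to take the lock, issuing one hint
  // after each failed attempt.
  bool try_lock_spinning(int max_attempts) {
    for (int i = 0; i < max_attempts; i++) {
      // Cheap relaxed read first; only do the exchange if the lock looks free.
      if (!_locked.load(std::memory_order_relaxed) &&
          !_locked.exchange(true, std::memory_order_acquire)) {
        return true;  // acquired
      }
      spin_hint();  // back off briefly before retrying
    }
    return false;  // gave up; a real caller would block or park here
  }

  void unlock() { _locked.store(false, std::memory_order_release); }
};
```

The spin is bounded on purpose: the hint is only meant to cover a short retry window, which is the case the discussion above is about.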
------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1895995770 From roland at openjdk.org Wed Jan 17 15:09:55 2024 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 17 Jan 2024 15:09:55 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v18] In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 08:33:14 GMT, Roland Westrelin wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> NoSafepointMutexLocker > > Looks good to me. Thanks for making the changes. Do we want to file a bug to refactor the code? > @rwestrel thanks for the review! I pushed a comment improvement. > > What kind of refactoring are you thinking about? Just to remove the `MutexUnlocker` occurrence, or something bigger? I was thinking maybe something along the lines of what Tom suggested: " I think the API would need to make a stronger split between preallocated records and records which might come from the extra data section. " It doesn't seem right that the API can return a pointer to something that's not safe to access unless we own a lock too. 
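
One possible shape for such an API, sketched under heavy assumptions: rather than handing out a raw pointer, the owner runs the caller's code while it holds the lock. `ExtraDataSketch` and `with_records` are hypothetical names and `std::mutex` stands in for HotSpot's `Mutex`; none of this is the actual `MethodData` interface.

```cpp
#include <mutex>
#include <vector>

// Hypothetical container: the records are only reachable while _lock is held.
class ExtraDataSketch {
  std::mutex _lock;
  std::vector<int> _records;

 public:
  // Run f on the records with the lock held and return whatever f returns.
  // No pointer or reference to _records is returned by this function itself.
  template <typename F>
  auto with_records(F&& f) -> decltype(f(_records)) {
    std::lock_guard<std::mutex> guard(_lock);
    return f(_records);
  }
};
```

A caller would write something like `data.with_records([](std::vector<int>& r) { r.push_back(1); });`. This is a convention rather than an enforcement: a callback can still leak a reference on purpose, but it can no longer do so by accident.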
------------- PR Comment: https://git.openjdk.org/jdk/pull/16840#issuecomment-1896005676 From gunnar at wagenknecht.org Wed Jan 17 15:39:45 2024 From: gunnar at wagenknecht.org (Gunnar Wagenknecht) Date: Wed, 17 Jan 2024 16:39:45 +0100 Subject: [External] : Re: Too many open files problem on MacOS 14.1 In-Reply-To: <6f1f1ab9-57cb-40da-ba8d-4e958724bc6c@oracle.com> References: <6A33F43E-7121-475C-9DB2-4081D27AD6DF@wagenknecht.org> <821ab42c-6874-4dea-917b-461074945cb4@oracle.com> <81718069-188A-48F1-85D5-CA3DB10019EA@wagenknecht.org> <6f1f1ab9-57cb-40da-ba8d-4e958724bc6c@oracle.com> Message-ID: -- Gunnar Wagenknecht gunnar at wagenknecht.org, http://guw.io/ > On Dec 23, 2023, at 16:08, daniel.daugherty at oracle.com wrote: > It looks like I cannot add your email as a watcher on this bug because > you do not have an OpenJDK account. Thanks a lot for that and I appreciate the fix that is happening. Can I ask if this fix (once it's merged) can be back ported to JDK 17 and 21? We have a large code base and all IDEs and other Java based tools working with the code base require manual adjustment to the VM args for MacOS only across all developer machines. We are still discovering issues around that. Lots of tools are still on 17 and we are in the process of moving to 21. Thanks! -Gunnar From epeter at openjdk.org Wed Jan 17 15:51:57 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 17 Jan 2024 15:51:57 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v18] In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 15:07:31 GMT, Roland Westrelin wrote: >> Looks good to me. Thanks for making the changes. Do we want to file a bug to refactor the code? > >> @rwestrel thanks for the review! I pushed a comment improvement. >> >> What kind of refactoring are you thinking about? Just to remove the `MutexUnlocker` occurrence, or something bigger?
> > I was thinking maybe something along the lines of what Tom suggested: " I think the API would need to make a stronger split between preallocated records and records which might come from the extra data section. " > It doesn't seem right that the API can return a pointer to something that's not safe to access unless we own a lock too. @rwestrel Ok, makes sense. Would you want to do this? I mostly addressed this because it is a bug, but otherwise I have my focus elsewhere. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16840#issuecomment-1896089600 From coleenp at openjdk.org Wed Jan 17 16:18:54 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 17 Jan 2024 16:18:54 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v19] In-Reply-To: <8sr3y54p6pr5Fp4rq9DUIYUbmp92XJjxQqGcOY91kv8=.ea8c9500-476f-4fea-b4cb-debdaa2820c3@github.com> References: <8sr3y54p6pr5Fp4rq9DUIYUbmp92XJjxQqGcOY91kv8=.ea8c9500-476f-4fea-b4cb-debdaa2820c3@github.com> Message-ID: On Wed, 17 Jan 2024 09:26:15 GMT, Emanuel Peter wrote: >> As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. >> >> I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. >> >> I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. >> >> I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. 
Otherwise I could not take the lock from a VM thread. >> >> **Complications with ttyl** >> There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. >> >> If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. >> >> **Testing** >> Testing: tier1-3 and stress. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > improved comment for Roland I haven't yet reviewed all of this but this mechanism seems unnecessary and I'd like to understand why this would be added. src/hotspot/share/runtime/mutexLocker.hpp line 271: > 269: NoSafepointMutexLocker(mutex, true, flag) {} > 270: }; > 271: Don't add this. The locks that are no-safepoint-check-flags have implicit NoSafepointVerifier logic when you take them out. ------------- Changes requested by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16840#pullrequestreview-1827673595 PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1455977272 From epeter at openjdk.org Wed Jan 17 16:22:55 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 17 Jan 2024 16:22:55 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v19] In-Reply-To: References: <8sr3y54p6pr5Fp4rq9DUIYUbmp92XJjxQqGcOY91kv8=.ea8c9500-476f-4fea-b4cb-debdaa2820c3@github.com> Message-ID: On Wed, 17 Jan 2024 16:15:39 GMT, Coleen Phillimore wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> improved comment for Roland > > src/hotspot/share/runtime/mutexLocker.hpp line 271: > >> 269: NoSafepointMutexLocker(mutex, true, flag) {} >> 270: }; >> 271: > > Don't add this. The locks that are no-safepoint-check-flags have implicit NoSafepointVerifier logic when you take them out. Ah ok. Thanks for the hint. I will look into this. It is a bit confusing what the locks do and do not do with SafePoints. Do they just not safepoint when trying to acquire the lock, or also verify that no safepoint is made while holding the lock? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1455987948 From coleenp at openjdk.org Wed Jan 17 16:27:00 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 17 Jan 2024 16:27:00 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v19] In-Reply-To: <8sr3y54p6pr5Fp4rq9DUIYUbmp92XJjxQqGcOY91kv8=.ea8c9500-476f-4fea-b4cb-debdaa2820c3@github.com> References: <8sr3y54p6pr5Fp4rq9DUIYUbmp92XJjxQqGcOY91kv8=.ea8c9500-476f-4fea-b4cb-debdaa2820c3@github.com> Message-ID: On Wed, 17 Jan 2024 09:26:15 GMT, Emanuel Peter wrote: >> As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. >> >> I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. >> >> I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. >> >> I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. >> >> **Complications with ttyl** >> There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. 
>> >> If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. >> >> **Testing** >> Testing: tier1-3 and stress. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > improved comment for Roland src/hotspot/share/code/compiledMethod.cpp line 698: > 696: print_code_on(&ss); > 697: print_pcs_on(&ss); > 698: tty->print("%s", ss.as_string()); // print all at once It seems like these ttyLocker changes should be checked in as a different cleanup, ie removing ttyLocker is a really good thing. Can you make these changes a separate patch? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1455998025 From coleenp at openjdk.org Wed Jan 17 16:33:55 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 17 Jan 2024 16:33:55 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v19] In-Reply-To: References: <8sr3y54p6pr5Fp4rq9DUIYUbmp92XJjxQqGcOY91kv8=.ea8c9500-476f-4fea-b4cb-debdaa2820c3@github.com> Message-ID: On Wed, 17 Jan 2024 16:20:04 GMT, Emanuel Peter wrote: >> src/hotspot/share/runtime/mutexLocker.hpp line 271: >> >>> 269: NoSafepointMutexLocker(mutex, true, flag) {} >>> 270: }; >>> 271: >> >> Don't add this. The locks that are no-safepoint-check-flags have implicit NoSafepointVerifier logic when you take them out. > > Ah ok. Thanks for the hint. I will look into this. It is a bit confusing what the locks do and do not do with SafePoints. Do they just not safepoint when trying to acquire the lock, or also verify that no safepoint is made while holding the lock? They verify that no safepoints happen while holding the lock. 
There's a counter in JavaThread. #ifdef ASSERT // Debug support for checking if code allows safepoints or not. // Safepoints in the VM can happen because of allocation, invoking a VM operation, or blocking on // mutex, or blocking on an object synchronizer (Java locking). // If _no_safepoint_count is non-zero, then an assertion failure will happen in any of // the above cases. The class NoSafepointVerifier is used to set this counter. int _no_safepoint_count; // If 0, thread allow a safepoint to happen ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1456010975 From coleenp at openjdk.org Wed Jan 17 16:33:55 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 17 Jan 2024 16:33:55 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v19] In-Reply-To: References: <8sr3y54p6pr5Fp4rq9DUIYUbmp92XJjxQqGcOY91kv8=.ea8c9500-476f-4fea-b4cb-debdaa2820c3@github.com> Message-ID: <2hhjIaAJYQujFSHyCWVmWOaCPWy2XkS-YBcEyEBbg9U=.864537c6-dcfe-435e-9ce9-4f744cdd33c1@github.com> On Wed, 17 Jan 2024 16:29:34 GMT, Coleen Phillimore wrote: >> Ah ok. Thanks for the hint. I will look into this. It is a bit confusing what the locks do and do not do with SafePoints. Do they just not safepoint when trying to acquire the lock, or also verify that no safepoint is made while holding the lock? > > They verify that no safepoints happen while holding the lock. There's a counter in JavaThread. > > > #ifdef ASSERT > // Debug support for checking if code allows safepoints or not. > // Safepoints in the VM can happen because of allocation, invoking a VM operation, or blocking on > // mutex, or blocking on an object synchronizer (Java locking). > // If _no_safepoint_count is non-zero, then an assertion failure will happen in any of > // the above cases. The class NoSafepointVerifier is used to set this counter. 
> int _no_safepoint_count; // If 0, thread allow a safepoint to happen I guess the comment should be updated to say we increment this when taking out a no-safepoint-check mutex. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1456016934 From aph at openjdk.org Wed Jan 17 16:44:53 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 17 Jan 2024 16:44:53 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: <00KH-IYiXMv5YGxHhUs-lWKBdRDt9h5iAA6aeX0JwS4=.5564cfcb-8283-486e-b7b9-558cad834331@github.com> References: <_rjhqI6cYZAWvwIvX76-ERmCCxL42ij_FnFFbFap30k=.be2f9c27-1a09-4e0a-8f4e-be7ccba6f597@github.com> <00KH-IYiXMv5YGxHhUs-lWKBdRDt9h5iAA6aeX0JwS4=.5564cfcb-8283-486e-b7b9-558cad834331@github.com> Message-ID: On Wed, 17 Jan 2024 15:02:46 GMT, Evgeny Astigeevich wrote: > > Maybe the most reasonable way forward is to only change the default value of OnSpinWaitInst from "none" to "yield" and NOT change it to "isb" for Apple CPUs. > > Do we have anyone from Apple who can suggest a spin pause implementation? As no real cases for Apple CPUs exists, just microbenchmarks, choosing `isb` might be premature. IMO, even without real cases I would have chosen `isb` if it had a similar latency as the Intel `pause`. It'd be nice if we knew what that latency was. > > We could execute a bunch of UDIV instructions with a loop-carried dependency, or cycle an xor-shift generator. That could be made to delay for any number of clock cycles, so we can delay without the side effects of an ISB. > > This approach is not power efficient. Huh? The only real use for SpinPause is to prevent bus contention when trying to acquire a lock. Chances are we only really have to spin for a few dozen cycles before retrying. It's not long enough to affect power consumption much. Are you thinking of a longer pause? > In case of Neoverse `isb` have shown to use less power than any instruction executing on the CPU back-end. 
If Apple CPUs have the similar `isb` behaviour, it would be a reason to use `isb`. OK, so now I'm really curious, given that ISB has a lot of work to do because it has to flush and restart a bunch of on-the-fly instructions. Can you provide any links for where it's been shown to use less power? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1896192808 From jiangli at openjdk.org Wed Jan 17 17:20:53 2024 From: jiangli at openjdk.org (Jiangli Zhou) Date: Wed, 17 Jan 2024 17:20:53 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 10:07:15 GMT, Andrew Haley wrote: >> Please review this PR with a simple solution for resolving the duplicate `Thread` symbol issue. In https://github.com/openjdk/jdk/pull/14808 comments, there was an alternative suggestion to redefine the symbol at build time, such as using `-DThread=HotSpotThread`. That would not address issues when symbols were referenced as string literals. https://github.com/openjdk/jdk/pull/14808 also discussed using namespace for hotspot code, which can have multiple benefits/motivations. We could explore further using namespace with more consensus on that approach. >> >> Contributed by Chuck Rasbold and @jianglizhou. > > Hooboy, this is an ugly solution, with some nasty side effects such as confusing error messages for developers and a very confusing debugger experience. Let's try to find a solution with a smaller blast radius. Hi @theRealAph Thanks for looking into this! https://github.com/openjdk/jdk/pull/14808 comments touched on several options: 1. Using a namespace, either in a smaller scope for a specific class such as `StringTable` or for all hotspot code in a global scope. Most seem to prefer using a specific namespace for all hotspot code, but there were still concerns. 2. Using #define to redefine the symbol (used in the current PR). This is a somewhat hacky solution.
It requires small changes without touching much source code for renaming. 3. Redefine symbol at build/compile time. This is similar to the above. 4. Direct rename in the source. Earlier discussions and feedback seem to prefer options that do not require large-scale changes (except the hotspot namespace solution). If acceptable to everyone, direct renaming would be the least confusing option. Any other suggestions and ideas for resolving the `Thread` issue? Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1896255274 From joehw at openjdk.org Wed Jan 17 17:41:52 2024 From: joehw at openjdk.org (Joe Wang) Date: Wed, 17 Jan 2024 17:41:52 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v3] In-Reply-To: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> References: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> Message-ID: On Tue, 16 Jan 2024 09:01:35 GMT, Aleksey Shipilev wrote: >> Since recent work to improve `tier4` performance, we actually test `tier{1,2,3,4}` often, which includes all the tests in current tree. It would be more convenient to just have the `all` test group/alias, so that we can do `make test TEST=all`. This also gives a parallelism / run time benefit, as we do not wait for tests in each tier to complete before moving to next tier. >> >> Sample run on out-of-the-box Linux x86_64 fastdebug is below. For some environments one also needs to supply a few keywords like `!printer` to skip tests that cannot complete without failure due to misconfiguration. I left the keywords as is to show how a failing run would look. There is also an existing shortcut in build system that allows running this with `make test-all`.
>> >> >> % make test TEST=all >> >> Test selection 'all', will run: >> * jtreg:test/hotspot/jtreg:all >> * jtreg:test/jdk:all >> * jtreg:test/langtools:all >> * jtreg:test/jaxp:all >> * jtreg:test/lib-test:all >> >> (...about 6 hours later...) >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >>>> jtreg:test/hotspot/jtreg:all 6731 6702 29 0 << >>>> jtreg:test/jdk:all 9962 9951 11 0 << >> jtreg:test/langtools:all 4469 4469 0 0 >> jtreg:test/jaxp:all 513 513 0 0 >> jtreg:test/lib-test:all 32 32 0 0 >> ============================== >> TEST FAILURE > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Catch-all -> All tests Thanks for the reminder. The new alias is nice, easier to run all tier tests. I often run xml-only as well that includes jaxp_all plus a small set of jaxp tests in jdk_all (test/jdk/javax/xml/jaxp). But that's just me, jdk_all already covers those tests. ------------- Marked as reviewed by joehw (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17422#pullrequestreview-1827838226 From coleenp at openjdk.org Wed Jan 17 23:02:17 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 17 Jan 2024 23:02:17 GMT Subject: RFR: 8323685: PrintSystemDictionaryAtExit has mutex rank assert Message-ID: Use variant 2 change from PR https://github.com/openjdk/jdk/pull/16062 that allows the ConcurrentHashTable to specify lock ranking for the resize lock at construction. SystemDictionary printing takes out both CHT resize lock and the ttyLocker. Tested with tier1-7, special stress test in tier7 to verify JDK-8317440 is still fixed. 
------------- Commit messages: - 8323685: PrintSystemDictionaryAtExit has mutex rank assert Changes: https://git.openjdk.org/jdk/pull/17471/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17471&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323685 Stats: 158 lines in 8 files changed: 100 ins; 45 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/17471.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17471/head:pull/17471 PR: https://git.openjdk.org/jdk/pull/17471 From coleenp at openjdk.org Wed Jan 17 23:08:49 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 17 Jan 2024 23:08:49 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 00:14:58 GMT, Jiangli Zhou wrote: > Please review this PR with a simple solution for resolving the duplicate `Thread` symbol issue. In https://github.com/openjdk/jdk/pull/14808 comments, there was an alternative suggestion to redefine the symbol at build time, such as using `-DThread=HotSpotThread`. That would not address issues when symbols were referenced as string literals. https://github.com/openjdk/jdk/pull/14808 also discussed using namespace for hotspot code, which can have multiple benefits/motivations. We could explore further using namespace with more consensus on that approach. > > Contributed by Chuck Rasbold and @jianglizhou. I was reading through the other PR for StringTable and was wondering how difficult it would be to wrap all of hotspot in `namespace hotspot {}; using namespace hotspot;`. It would need a JEP as discussed in the other PR. Alternatively, if there's an #ifdef you can use for renaming Thread to HotspotThread for static linking only, it might make this change less worrisome.
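
A minimal sketch of the two ideas mentioned here, with hypothetical names throughout (`hotspot_sketch`, `user_lib_sketch`, `STATIC_LINK_SKETCH`, `SketchThread`). It only illustrates how a namespace or a conditional rename keeps two same-named classes apart in one binary; it says nothing about how much HotSpot code would actually have to change.

```cpp
#include <string>

// Idea 1: each library keeps its own Thread inside a namespace, so the two
// classes have different qualified names and can be linked into one binary.
namespace hotspot_sketch {
class Thread {
 public:
  std::string owner() const { return "vm"; }
};
}  // namespace hotspot_sketch

namespace user_lib_sketch {
class Thread {
 public:
  std::string owner() const { return "user"; }
};
}  // namespace user_lib_sketch

// Idea 2: rename only when a (hypothetical) static-link macro is defined.
// Source code keeps saying SketchThread; with the macro set, the class the
// compiler sees is really named HotspotSketchThread.
#ifdef STATIC_LINK_SKETCH
#define SketchThread HotspotSketchThread
#endif

class SketchThread {
 public:
  int id() const { return 1; }
};
```

With idea 2 the renamed identifier is what shows up in compiler diagnostics and in the debugger, which is the kind of side effect raised earlier in the thread.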
------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1897336087 From duke at openjdk.org Thu Jan 18 06:54:20 2024 From: duke at openjdk.org (Liming Liu) Date: Thu, 18 Jan 2024 06:54:20 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21] In-Reply-To: References: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> Message-ID: On Wed, 10 Jan 2024 07:30:52 GMT, Thomas Stuefe wrote: >> Liming Liu has refreshed the contents of this pull request, and previous commits have been removed. Incremental views are not available. > > Maybe a stupid question, but if we are still worried about concurrent use of memory that is in the process of being madvised, could we not just limit this technique to initialization time? > > I would expect most uses of pretouch to go together with -Xmx = -Xms, and to happen before mutators start. Hi @tstuefe, @jdksjolen & @kimbarrett. Could you please take a look? ------------- PR Comment: https://git.openjdk.org/jdk/pull/15781#issuecomment-1897900967 From dholmes at openjdk.org Thu Jan 18 07:30:14 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 18 Jan 2024 07:30:14 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 00:14:58 GMT, Jiangli Zhou wrote: > Please review this PR with a simple solution for resolving the duplicate `Thread` symbol issue. In https://github.com/openjdk/jdk/pull/14808 comments, there was an alternative suggestion to redefine the symbol at build time, such as using `-DThread=HotSpotThread`. That would not address issues when symbols were referenced as string literals. https://github.com/openjdk/jdk/pull/14808 also discussed using namespace for hotspot code, which can have multiple benefits/motivations. We could explore further using namespace with more consensus on that approach.
> > Contributed by Chuck Rasbold and @jianglizhou. > Linking failures were observed when statically linking the launcher executable with hotspot and user native code together: So the problem is that the user native code defines Thread as well - right? So this could keep happening for name after name depending on what native code is being linked. I second what @theRealAph said! This is really ugly. The way disparate libraries just get munged into a single namespace with static linking just seems wrong to me. At a minimum this hack should only be used when doing static linking as Coleen suggested. But I'd much prefer a solution that came from the tools doing the linking. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1897940456 From dholmes at openjdk.org Thu Jan 18 07:39:18 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 18 Jan 2024 07:39:18 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 00:14:58 GMT, Jiangli Zhou wrote: > Please review this PR with a simple solution for resolving duplicate `Thread` symbol issue. In https://github.com/openjdk/jdk/pull/14808 comments, there was an alternative suggestion to redefine the symbol at build time, such as using`-DThread=HotSpotThread`. That would not address issues when symbol were references as string literals. https://github.com/openjdk/jdk/pull/14808 also discussed using namespace for hotspot code, which can have multiple benefits/motivations. We could explore further using namespace with more consensus on that approach. > > Contributed by Chuck Rasbold and @jianglizhou. Okay so now that I have context switched in the discussion from: https://github.com/openjdk/jdk/pull/14808 what happened to doing a JEP for namespaces? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1897950301 From eosterlund at openjdk.org Thu Jan 18 09:36:12 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 18 Jan 2024 09:36:12 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 16:25:08 GMT, Fredrik Bredberg wrote: > The Java option OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none". > > However some CPUs change the default SpinPause instruction to something better if the user hasn't used the OnSpinWaitInst option. For instance if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs. > > This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb". > > Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64. The main point of this PR is not to figure out what Apple HW should bind to, but rather to figure out what a good default is for unrecognized HW. The current default is "none", and the proposal is to change it to "yield". Since yield is the ISA-defined instruction for this exact purpose, I think it makes more sense to use yield instead of none. It is certainly less surprising. It's worth pointing out that new AmpereOne chips do indeed implement yield. By having the unsurprising yield instruction be the default, hopefully more HW vendors will find the motivation to follow suit and implement the instruction.
We can have horrible hacks as opt-in for machines that have ignored this, but I think figuring out what said hacks should be and on what machines is an orthogonal concern. The default should still be "yield", regardless, IMO. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1898115272 From duke at openjdk.org Thu Jan 18 11:52:12 2024 From: duke at openjdk.org (Yude Lin) Date: Thu, 18 Jan 2024 11:52:12 GMT Subject: RFR: 8323273: AArch64: Strengthen CompressedClassPointers initialization check for base In-Reply-To: References: Message-ID: On Tue, 16 Jan 2024 02:41:40 GMT, Yude Lin wrote: > Summary: > Add a platform-dependent check for CompressedClassSpaceBaseAddress; > Remove the "reserve anywhere" attempt after the initial mapping attempt failed---this is rarely used and will likely fail anyway, because the accepted mapping is very restricted on aarch64; > Additional assertions after initialization. > > Passed hotspot/jtreg/:tier1 on fastdebug Can I get a review on this patch please : ) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17437#issuecomment-1898330645 From eastigeevich at openjdk.org Thu Jan 18 13:25:15 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 18 Jan 2024 13:25:15 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: <_rjhqI6cYZAWvwIvX76-ERmCCxL42ij_FnFFbFap30k=.be2f9c27-1a09-4e0a-8f4e-be7ccba6f597@github.com> <00KH-IYiXMv5YGxHhUs-lWKBdRDt9h5iAA6aeX0JwS4=.5564cfcb-8283-486e-b7b9-558cad834331@github.com> Message-ID: On Wed, 17 Jan 2024 16:42:17 GMT, Andrew Haley wrote: > OK, so now I'm really curious, given that ISB has a lot of work to do because it has to flush and restart a bunch of on-the-fly instructions. According to our hardware engineers, `isb` doesn't need to flush anything. It can just stop fetching instructions until the backend gets idle. It's clearly faster and cheaper than mul/div instructions.
Mul/div will spin up a whole complex arithmetic unit that might otherwise be idle. Except some cases, mul/div don't really pause CPU because CPU can execute instructions around it. For example it can get more loads out into the pipeline. > Can you provide any links for where it's been shown to use less power? I don't have data I can share. Stuart (@stooart-mon) is from arm. He might ask hardware engineers as well. Maybe he knows more or might provide some data. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1898471039 From eastigeevich at openjdk.org Thu Jan 18 14:35:15 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 18 Jan 2024 14:35:15 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v2] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: On Sat, 6 Jan 2024 14:05:33 GMT, Boris Ulasevich wrote: >> The change simplifies the CodeCache::initialize_heaps segment memory split logic while preserving the existing layout: >> >> if (!non_nmethod_set && !profiled_set && !non_profiled_set) { >> ... >> } else if (!non_nmethod_set || !profiled_set || !non_profiled_set) { >> if (non_profiled_set) { >> if (!profiled_set) { >> ... >> } >> } else if (profiled_set) { >> ... >> } else if (non_nmethod_set) { >> ... >> } >> } >> >> --> >> >> if (!profiled.set && !non_profiled.set) { >> .. >> } >> if (profiled.set && !non_profiled.set) { >> .. >> } >> if (!profiled.set && non_profiled.set) { >> .. >> } >> if (!non_nmethod.set && profiled.set && non_profiled.set) { >> .. >> } >> >> >> With this change, PrintFlagsFinal shows the actual segment sizes (not an intermediate value before alignment), and the segments completely fill the ReservedCodeCacheSize (no wasted page due to final down alignment). 
> > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > cleanup & test udpdate src/hotspot/share/code/codeCache.cpp line 185: > 183: codeheap, (long long) size, (long long) required_size); > 184: err_msg title("Not enough space in %s to run VM", codeheap); > 185: err_msg message(SIZE_FORMAT "K < " SIZE_FORMAT "K", size, required_size); Missed `/ K` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1457527888 From eastigeevich at openjdk.org Thu Jan 18 15:05:15 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 18 Jan 2024 15:05:15 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v2] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: On Sat, 6 Jan 2024 14:05:33 GMT, Boris Ulasevich wrote: >> The change simplifies the CodeCache::initialize_heaps segment memory split logic while preserving the existing layout: >> >> if (!non_nmethod_set && !profiled_set && !non_profiled_set) { >> ... >> } else if (!non_nmethod_set || !profiled_set || !non_profiled_set) { >> if (non_profiled_set) { >> if (!profiled_set) { >> ... >> } >> } else if (profiled_set) { >> ... >> } else if (non_nmethod_set) { >> ... >> } >> } >> >> --> >> >> if (!profiled.set && !non_profiled.set) { >> .. >> } >> if (profiled.set && !non_profiled.set) { >> .. >> } >> if (!profiled.set && non_profiled.set) { >> .. >> } >> if (!non_nmethod.set && profiled.set && non_profiled.set) { >> .. >> } >> >> >> With this change, PrintFlagsFinal shows the actual segment sizes (not an intermediate value before alignment), and the segments completely fill the ReservedCodeCacheSize (no wasted page due to final down alignment). 
> > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > cleanup & test udpdate src/hotspot/share/code/codeCache.cpp line 181: > 179: GrowableArray* CodeCache::_allocable_heaps = new(mtCode) GrowableArray (static_cast(CodeBlobType::All), mtCode); > 180: > 181: void CodeCache::report_cache_minimal_size_error(const char *codeheap, size_t size, size_t required_size) { I suggest to have a function: static void check_min_size(... code_heap, size_t min_required_size) { if (code_heap.enabled && code_heap.size >= min_required_size) return; log_debug(codecache)("Code heap (%s) size " SIZE_FORMAT " below required minimal size " SIZE_FORMAT, code_heap.name, code_heap.size, min_required_size); err_msg title("Not enough space in %s to run VM", code_heap.name); err_msg message(SIZE_FORMAT "K < " SIZE_FORMAT "K", code_heap.size / K, min_required_size / K); vm_exit_during_initialization(title, message); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1457571407 From epeter at openjdk.org Thu Jan 18 15:21:18 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 18 Jan 2024 15:21:18 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v19] In-Reply-To: References: <8sr3y54p6pr5Fp4rq9DUIYUbmp92XJjxQqGcOY91kv8=.ea8c9500-476f-4fea-b4cb-debdaa2820c3@github.com> Message-ID: On Wed, 17 Jan 2024 16:24:20 GMT, Coleen Phillimore wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> improved comment for Roland > > src/hotspot/share/code/compiledMethod.cpp line 698: > >> 696: print_code_on(&ss); >> 697: print_pcs_on(&ss); >> 698: tty->print("%s", ss.as_string()); // print all at once > > It seems like these ttyLocker changes 
should be checked in as a different cleanup, ie removing ttyLocker is a really good thing. Can you make these changes a separate patch? Yes, I filed the RFE: [JDK-8324129](https://bugs.openjdk.org/browse/JDK-8324129) C2: Remove some ttyLocker usages in preparation for JDK-8306767 https://github.com/openjdk/jdk/pull/17486 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1457593680 From epeter at openjdk.org Thu Jan 18 15:31:24 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 18 Jan 2024 15:31:24 GMT Subject: RFR: 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 Message-ID: I'm removing some instances of `ttyLocker`. Instead of locking, I first put all the output on a `stringStream`, and then print this stream all at once, which is atomic. Removing the `ttyLocker` is nice, because it means we have less interference with other locking mechanisms, such as the `extra_data_lock` cases I have to introduce with [JDK-8306767](https://bugs.openjdk.org/browse/JDK-8306767). @coleenp wished that I do this separately, so I filed this RFE here. ------------- Commit messages: - 8324129 Changes: https://git.openjdk.org/jdk/pull/17486/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17486&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324129 Stats: 73 lines in 11 files changed: 14 ins; 2 del; 57 mod Patch: https://git.openjdk.org/jdk/pull/17486.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17486/head:pull/17486 PR: https://git.openjdk.org/jdk/pull/17486 From dchuyko at openjdk.org Thu Jan 18 15:43:35 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 18 Jan 2024 15:43:35 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v22] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). 
The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > Natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method letting compile broker to perform it taking new directives stack into account. Re-compilation helps to prevent hot methods from execution in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization.
There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 40 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Deopt osr, cleanups - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - ... 
and 30 more: https://git.openjdk.org/jdk/compare/a2b117ae...d4bc29c9 ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=21 Stats: 381 lines in 15 files changed: 348 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From epeter at openjdk.org Thu Jan 18 16:00:37 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 18 Jan 2024 16:00:37 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v20] In-Reply-To: References: Message-ID: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Complications with ttyl** > There were a few places in printing code, where we did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high rank.
Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: remove NoSafepointMutexLocker ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/4dbfe9a7..f8a81cd3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=19 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=18-19 Stats: 29 lines in 10 files changed: 0 ins; 14 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From bulasevich at openjdk.org Thu Jan 18 16:00:50 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Thu, 18 Jan 2024 16:00:50 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v3] In-Reply-To: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: > These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. 
> > Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: apply suggestions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17244/files - new: https://git.openjdk.org/jdk/pull/17244/files/d1415359..06907581 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=01-02 Stats: 58 lines in 2 files changed: 15 ins; 26 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/17244.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17244/head:pull/17244 PR: https://git.openjdk.org/jdk/pull/17244 From epeter at openjdk.org Thu Jan 18 16:06:30 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 18 Jan 2024 16:06:30 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v21] In-Reply-To: References: Message-ID: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. 
Otherwise I could not take the lock from a VM thread. > > **Complications with ttyl** > There were a few places in printing code, where we did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high rank. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: rm pause_no_safepoints ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/f8a81cd3..c408f22e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=20 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=19-20 Stats: 7 lines in 2 files changed: 0 ins; 2 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From epeter at openjdk.org Thu Jan 18 16:13:27 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 18 Jan 2024 16:13:27 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v22] In-Reply-To: References: Message-ID: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no
safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Complications with ttyl** > There were a few places in printing code, where we did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high rank. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress.
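
The buffer-then-print idea described in the quoted text can be sketched outside HotSpot roughly as follows. This is only an illustration under stated assumptions: it uses the standard `std::ostringstream` and `std::fwrite` rather than HotSpot's `stringStream` and `tty`, and the two `print_*_on` helpers are hypothetical stand-ins that merely borrow their names from the review snippet.

```cpp
#include <cassert>
#include <cstdio>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-ins for the real printing helpers. Each one appends
// to the caller-supplied stream instead of writing to the terminal.
void print_code_on(std::ostream& out) { out << "code: ...\n"; }
void print_pcs_on(std::ostream& out)  { out << "pcs: ...\n"; }

// Collect the whole report in a private buffer. No output lock is held
// here, so other locks can be taken freely while the helpers run.
std::string build_report() {
  std::ostringstream ss;
  print_code_on(ss);
  print_pcs_on(ss);
  return ss.str();
}

// Hand the finished report to the sink in a single call, so the lines of
// one report are not split across several separate print calls.
// Assumption: the sink serializes individual calls (POSIX stdio locks the
// stream for the duration of each call).
void print_report(std::FILE* sink) {
  const std::string report = build_report();
  assert(sink != nullptr);
  std::fwrite(report.data(), 1, report.size(), sink);
}
```

Calling `print_report(stdout)` from several threads would then emit each report as one unit, at the cost of holding the full text in memory before it is written.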
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: add one nsv again ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/c408f22e..3dad2936 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=21 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=20-21 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From epeter at openjdk.org Thu Jan 18 16:13:28 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 18 Jan 2024 16:13:28 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v19] In-Reply-To: References: <8sr3y54p6pr5Fp4rq9DUIYUbmp92XJjxQqGcOY91kv8=.ea8c9500-476f-4fea-b4cb-debdaa2820c3@github.com> Message-ID: On Wed, 17 Jan 2024 16:16:19 GMT, Coleen Phillimore wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> improved comment for Roland > > I haven't yet reviewed all of this but this mechanism seems unnecessary and I'd like to understand why this would be added. @coleenp thanks for the suggestions. I will first integrate this RFE: https://github.com/openjdk/jdk/pull/17486 (please review ? ) Then I will integrate and merge it to here. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16840#issuecomment-1898780229 From epeter at openjdk.org Thu Jan 18 16:13:30 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 18 Jan 2024 16:13:30 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v19] In-Reply-To: <2hhjIaAJYQujFSHyCWVmWOaCPWy2XkS-YBcEyEBbg9U=.864537c6-dcfe-435e-9ce9-4f744cdd33c1@github.com> References: <8sr3y54p6pr5Fp4rq9DUIYUbmp92XJjxQqGcOY91kv8=.ea8c9500-476f-4fea-b4cb-debdaa2820c3@github.com> <2hhjIaAJYQujFSHyCWVmWOaCPWy2XkS-YBcEyEBbg9U=.864537c6-dcfe-435e-9ce9-4f744cdd33c1@github.com> Message-ID: On Wed, 17 Jan 2024 16:31:38 GMT, Coleen Phillimore wrote: >> They verify that no safepoints happen while holding the lock. There's a counter in JavaThread. >> >> >> #ifdef ASSERT >> // Debug support for checking if code allows safepoints or not. >> // Safepoints in the VM can happen because of allocation, invoking a VM operation, or blocking on >> // mutex, or blocking on an object synchronizer (Java locking). >> // If _no_safepoint_count is non-zero, then an assertion failure will happen in any of >> // the above cases. The class NoSafepointVerifier is used to set this counter. >> int _no_safepoint_count; // If 0, thread allow a safepoint to happen > > I guess the comment should be updated to say we increment this when taking out a no-safepoint-check mutex. Ok, I removed the `NoSafepointVerifier`. It seems to work, and pass my verification that we are in a no_safepoint scope. 
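
As a rough illustration of the counter described in the quoted source comment, a scope object can bump a per-thread count on entry and drop it on exit, and any code that might safepoint can check that the count is zero. This is a simplified sketch, not the HotSpot implementation: `ThreadState`, `NoSafepointScope` and `safepoint_allowed` are hypothetical names, and the real counter exists only in debug builds.

```cpp
#include <cassert>

// Hypothetical per-thread debug state; only the counter matters here.
struct ThreadState {
  int no_safepoint_count = 0;  // 0 means a safepoint is allowed
};

// RAII scope: while at least one instance is alive for a thread,
// safepoints are considered forbidden for that thread.
class NoSafepointScope {
 public:
  explicit NoSafepointScope(ThreadState& t) : _t(t) { ++_t.no_safepoint_count; }
  ~NoSafepointScope() {
    assert(_t.no_safepoint_count > 0);  // scopes must be properly nested
    --_t.no_safepoint_count;
  }
  NoSafepointScope(const NoSafepointScope&) = delete;
  NoSafepointScope& operator=(const NoSafepointScope&) = delete;

 private:
  ThreadState& _t;
};

// Code that could block or allocate would assert this before proceeding.
bool safepoint_allowed(const ThreadState& t) {
  return t.no_safepoint_count == 0;
}
```

A lock that must never be held across a safepoint can increment the same counter when it is acquired, which makes a separate verifier object at the call site redundant.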
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1457668982 From bulasevich at openjdk.org Thu Jan 18 17:08:29 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Thu, 18 Jan 2024 17:08:29 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v4] In-Reply-To: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> > These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. > > Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: apply suggestions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17244/files - new: https://git.openjdk.org/jdk/pull/17244/files/06907581..9bd9da95 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=02-03 Stats: 12 lines in 1 file changed: 6 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17244.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17244/head:pull/17244 PR: https://git.openjdk.org/jdk/pull/17244 From kvn at openjdk.org Thu Jan 18 18:56:12 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 18 Jan 2024 18:56:12 GMT Subject: RFR: 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 In-Reply-To: References: Message-ID: On Thu, 18 Jan 2024 15:17:27 GMT, Emanuel Peter wrote: > I'm removing some instances of `ttyLocker`. Instead of locking, I first put all the output on a `stringStream`, and then print this stream all at once, which is atomic. > > Removing the `ttyLocker` is nice, because it means we have less interference with other locking mechanisms, such as the `extra_data_lock` cases I have to introduce with [JDK-8306767](https://bugs.openjdk.org/browse/JDK-8306767). > > @coleenp wished that I do this separately, so I filed this RFE here. Looks good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17486#pullrequestreview-1830366420 From jiangli at openjdk.org Thu Jan 18 18:59:15 2024 From: jiangli at openjdk.org (Jiangli Zhou) Date: Thu, 18 Jan 2024 18:59:15 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 23:06:19 GMT, Coleen Phillimore wrote: >> Please review this PR with a simple solution for resolving duplicate `Thread` symbol issue. 
In https://github.com/openjdk/jdk/pull/14808 comments, there was an alternative suggestion to redefine the symbol at build time, such as using `-DThread=HotSpotThread`. That would not address issues when the symbol is referenced in string literals. https://github.com/openjdk/jdk/pull/14808 also discussed using a namespace for hotspot code, which can have multiple benefits/motivations. We could explore further using a namespace with more consensus on that approach. >> >> Contributed by Chuck Rasbold and @jianglizhou. > > I was reading through the other PR for StringTable and was wondering how difficult it would be to wrap all of hotspot in namespace hotspot {}; using namespace hotspot; It would need a JEP as discussed in the other PR. > > Alternatively if there's a #ifdef you can use for renaming the Thread to HotspotThread for static linking only, it might make this change less worrisome. Thanks @coleenp, @dholmes-ora. For using a hotspot namespace, there are probably similar complications like the symbol usages that the current PR addresses in src/hotspot/os_cpu/linux_aarch64/threadLS_linux_aarch64.S and src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Thread.java. There might also be some complications with accessing hotspot code in JNI code. Those issues probably could be resolved relatively easily, though I haven't experimented with it. It seems that we may be converging on using a hotspot namespace? For just redefining the symbol only when doing static linking, it adds more differences between the static and non-static support. It's more useful when we can create both `.so` and `.a` from the same set of `.o` files without having to build two different `.o` files for each C/C++ source file.
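
The point about string literals can be seen in a small standalone sketch. It assumes nothing about the HotSpot build itself: the `#define` below only imitates what `-DThread=HotSpotThread` would do, and `lookup_name`, `actual_class_name` and `hotspot_sketch` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstring>

// Imitates compiling with -DThread=HotSpotThread.
#define Thread HotSpotThread

// After preprocessing this declares class HotSpotThread.
class Thread {
 public:
  int id = 0;
};

// Macro replacement does not look inside string literals, so a name that
// is looked up by string (for example from assembly or a debugger agent)
// still says "Thread" and no longer matches the renamed class.
const char* lookup_name() { return "Thread"; }

// Two-level stringification shows what the identifier really became.
#define STRINGIFY_IMPL(x) #x
#define STRINGIFY(x) STRINGIFY_IMPL(x)
const char* actual_class_name() { return STRINGIFY(Thread); }

#undef Thread

// The namespace alternative keeps the short name and qualifies it instead.
namespace hotspot_sketch {
class Thread {
 public:
  int id = 0;
};
}  // namespace hotspot_sketch
```

Here the lookup string and the real class name disagree after the rename, which is the kind of mismatch the comment above refers to; the namespace variant changes the mangled symbol without touching the source-level name.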
------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1899039171 From aph at openjdk.org Thu Jan 18 19:13:13 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 18 Jan 2024 19:13:13 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: <_rjhqI6cYZAWvwIvX76-ERmCCxL42ij_FnFFbFap30k=.be2f9c27-1a09-4e0a-8f4e-be7ccba6f597@github.com> <00KH-IYiXMv5YGxHhUs-lWKBdRDt9h5iAA6aeX0JwS4=.5564cfcb-8283-486e-b7b9-558cad834331@github.com> Message-ID: On Thu, 18 Jan 2024 13:22:05 GMT, Evgeny Astigeevich wrote: > > OK, so now I'm really curious, given that ISB has a lot of work to do because it has to flush and restart a bunch of on-the-fly instructions. > > According to our hardware engineers `isb` doesn't need to flush anything. It can just stop fetching instructions until the backend gets idle. How is that any better? I get how that might work, but that means that you have to wait for every instruction in progress to retire. And in a CPU with hundreds of instructions on the fly that's no small thing. You have a choice, either to speculate and then roll back when the ISB is actually executed, or to stop speculating for a while. The effect is the same. > It's clearly faster It's a delay. > and cheaper than mul/div instructions. Mul/div will spin up a whole complex arithmetic unit that might otherwise be idle. Except some cases, mul/div don't really pause CPU because CPU can execute instructions around it. For example it can get more loads out into the pipeline. Definitely so, yes. > Stuart (@stooart-mon) is from arm. He might ask hardware engineers as well. Maybe he knows more or might provide some data. OK. What concerns me is the blast radius of all this. It'd be nice to have some actual experiments.
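
For readers who want to try such experiments, a spin-wait loop with a selectable hint might look roughly like the sketch below. It is not the HotSpot `SpinPause()` code: the `SpinHint` enum and both functions are hypothetical, the four choices simply mirror the option values named in this thread ("none", "nop", "isb", "yield"), and on anything other than AArch64 with GCC or Clang the hint is a no-op.

```cpp
#include <atomic>
#include <cassert>

// Mirrors the four values discussed in the thread.
enum class SpinHint { None, Nop, Isb, Yield };

// Execute one pause hint. Only AArch64 with GCC/Clang inline assembly is
// covered; every other target falls through without emitting anything.
inline void spin_pause(SpinHint hint) {
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
  switch (hint) {
    case SpinHint::Nop:   __asm__ volatile("nop"   ::: "memory"); break;
    case SpinHint::Isb:   __asm__ volatile("isb"   ::: "memory"); break;
    case SpinHint::Yield: __asm__ volatile("yield" ::: "memory"); break;
    case SpinHint::None:  break;
  }
#else
  (void)hint;  // no hint emitted on other targets in this sketch
#endif
}

// Spin until the flag becomes true or the spin budget is used up.
// Returns the number of pause hints that were executed.
int spin_until_set(const std::atomic<bool>& flag, SpinHint hint, int max_spins) {
  assert(max_spins >= 0);
  int spins = 0;
  while (!flag.load(std::memory_order_acquire) && spins < max_spins) {
    spin_pause(hint);
    ++spins;
  }
  return spins;
}
```

Timing the same contended loop with each `SpinHint` value, and measuring power where possible, would give the kind of data asked for above; the sketch says nothing about which choice is better.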
------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1899058125 From dchuyko at openjdk.org Thu Jan 18 21:39:31 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 18 Jan 2024 21:39:31 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v23] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > Natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method letting compile broker to perform it taking new directives stack into account. 
Re-compilation helps to prevent hot methods from execution in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives` has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 41 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Deopt osr, cleanups - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - ...
and 31 more: https://git.openjdk.org/jdk/compare/81df265e...b2261505 ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=22 Stats: 381 lines in 15 files changed: 348 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From vlivanov at openjdk.org Thu Jan 18 21:55:36 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Thu, 18 Jan 2024 21:55:36 GMT Subject: RFR: 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 In-Reply-To: References: Message-ID: On Thu, 18 Jan 2024 15:17:27 GMT, Emanuel Peter wrote: > I'm removing some instances of `ttyLocker`. Instead of locking, I first put all the output on a `stringStream`, and then print this stream all at once, which is atomic. > > Removing the `ttyLocker` is nice, because it means we have less interference with other locking mechanisms, such as the `extra_data_lock` cases I have to introduce with [JDK-8306767](https://bugs.openjdk.org/browse/JDK-8306767). > > @coleenp wished that I do this separately, so I filed this RFE here. Looks good. ------------- Marked as reviewed by vlivanov (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17486#pullrequestreview-1830617407 From coleenp at openjdk.org Thu Jan 18 22:39:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 18 Jan 2024 22:39:34 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v8] In-Reply-To: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> Message-ID: On Wed, 10 Jan 2024 15:39:40 GMT, Axel Boldt-Christmas wrote: >> Implements the x86 port of JDK-8319796. >> >> There are two major parts for the port implementation. 
The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. >> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. >> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The x86 C2 port also has some extra oddities. >> >> The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. >> >> The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. >> >> The contended unlock was also moved to the code stub. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 11 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - top load adjustments > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Fix type > - Move inflated check in fast_locked > - Move top load > - 8319799: Recursive lightweight locking: x86 implementation > - ... and 1 more: https://git.openjdk.org/jdk/compare/730339a3...71c48af6 I like the refactoring a lot, but I have several questions that with comments can help me understand the code better. src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 104: > 102: // continuation is the slow_path. > 103: __ jmp(continuation()); > 104: } It seems like two stubs small stubs might be better since they don't really share very much (?) rather than one with two entry points and two exit points. src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 120: > 118: // The owner may be anonymous and we removed the last obj entry in > 119: // the lock-stack. This loses the information about the owner. > 120: // Write the thread to the owner field so the runtime knows the owner. I'm confused by this comment. We get here if the monitor is inflated, so we didn't remove it from the lock stack. src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 130: > 128: __ movptr(Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)), NULL_WORD); > 129: > 130: // Fence. // Instead of MFENCE we use a dummy locked add of 0 to the top-of-stack. Can you add this comment? src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 154: > 152: > 153: #ifdef _LP64 > 154: int C2HandleAnonOMOwnerStub::max_size() const { Do we need the C2HandleAnonOMOwnerStub anymore? src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 968: > 966: > 967: // Load the mark. 
> 968: movptr(mark, Address(obj, oopDesc::mark_offset_in_bytes())); Can this potentially throw NPE? Like in the c1 case right? Maybe worth a comment if so. It doesn't look like the c2 code does anything special here like the c1 code does, except this goes to the signal handler, then throws the NPE when it continues. src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 977: > 975: jcc(Assembler::notZero, inflated); > 976: > 977: // Check if lock-stack is full. Why doesn't this call MacroAssembler::lightweight_lock here? src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1013: > 1011: // Recursive. > 1012: increment(Address(tagged_monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(recursions))); > 1013: } This is sort of like the code above in fast_lock() but it's much clearer separated here. So I think this is good. This function passes in thread so works for 32 bits as well? src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1033: > 1031: stop("Fast Lock ZF != 0"); > 1032: bind(zf_correct); > 1033: #endif These are nice. src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1120: > 1118: xorptr(reg_rax, reg_rax); > 1119: orptr(reg_rax, Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(recursions))); > 1120: jcc(Assembler::notZero, check_successor); I don't know why the LP64/!LP64 paths are different. Do we not decrement recursions on 32 bit, and why wouldn't we? src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9970: > 9968: jcc(Assembler::equal, unlocked); > 9969: > 9970: bind(push_and_slow); Why do we have to push the lock object back on the lock stack? is it because it eagerly pops it off unlike existing code that checks that it can CAS the header to unlocked first? Runtime code in slow_path expects the object to be on the lock stack. This is a different order than the existing code. It doesn't look like it matters though, since lock stack is per thread. 
------------- PR Review: https://git.openjdk.org/jdk/pull/16607#pullrequestreview-1829898309 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458019840 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458041655 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458038267 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458027822 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1457642493 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1457654077 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1457968599 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1457968740 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458013056 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1457573290 From coleenp at openjdk.org Thu Jan 18 22:39:36 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 18 Jan 2024 22:39:36 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v8] In-Reply-To: References: Message-ID: On Mon, 13 Nov 2023 10:41:58 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 123: >> >>> 121: __ movptr(Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)), _thread); >>> 122: >>> 123: // succsesor null check. >> >> typo: succsesor -> successor > > Done. The typo is still in this version. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458039346 From sspitsyn at openjdk.org Thu Jan 18 23:38:26 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 18 Jan 2024 23:38:26 GMT Subject: RFR: 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static [v2] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 13:09:39 GMT, Serguei Spitsyn wrote: >> The notification method `VirtualThread.notifyJvmtiDisableSuspend` should be static. 
>> The method disables/enables suspend of the current virtual thread, a no-op if the current thread is a platform thread. It is confusing for this to be an instance method, it should be static to make it clearer that it doesn't change the target thread. >> The notification method `VirtualThread.notifyJvmtiHideFrames` also has to be static as it does not use/need the virtual thread `this` argument. >> One detail to underline is the intrinsic implementation needs to use the argument #0 instead of #1. >> >> Testing: >> - The mach5 tiers 1-6 show no regressions > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge > - 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static PING! It would be nice to get this reviewed. :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17298#issuecomment-1899386396 From dholmes at openjdk.org Fri Jan 19 02:00:28 2024 From: dholmes at openjdk.org (David Holmes) Date: Fri, 19 Jan 2024 02:00:28 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Thu, 18 Jan 2024 18:56:23 GMT, Jiangli Zhou wrote: > It seems that we may be converging on using hotspot namespace? Based on previous discussions I had been expecting to see a JEP on this after last US summer. I was surprised to see this PR pop up in this form. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1899516423 From jiangli at openjdk.org Fri Jan 19 02:11:29 2024 From: jiangli at openjdk.org (Jiangli Zhou) Date: Fri, 19 Jan 2024 02:11:29 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 01:57:58 GMT, David Holmes wrote: > > It seems that we may be converging on using hotspot namespace? > > Based on previous discussions I had been expecting to see a JEP on this after last US summer. I was surprised to see this PR pop up in this form. Ah, I see. Thanks for the clarification. I had an offline conversation with @iklam about the namespace and JEP topic during last August's JVM Language Summit. Based on the feedback from the conversation, it was not very clear if the namespace approach was broadly acceptable. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1899527918 From dholmes at openjdk.org Fri Jan 19 04:28:29 2024 From: dholmes at openjdk.org (David Holmes) Date: Fri, 19 Jan 2024 04:28:29 GMT Subject: RFR: 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 In-Reply-To: References: Message-ID: On Thu, 18 Jan 2024 15:17:27 GMT, Emanuel Peter wrote: > I'm removing some instances of `ttyLocker`. Instead of locking, I first put all the output on a `stringStream`, and then print this stream all at once, which is atomic. > > Removing the `ttyLocker` is nice, because it means we have less interference with other locking mechanisms, such as the `extra_data_lock` cases I have to introduce with [JDK-8306767](https://bugs.openjdk.org/browse/JDK-8306767). > > @coleenp wished that I do this separately, so I filed this RFE here.
src/hotspot/share/code/nmethod.hpp line 624: > 622: // print output in opt build for disassembler library > 623: void print_relocations() PRODUCT_RETURN; > 624: void print_pcs() { print_pcs_on(tty); } This is a very common pattern, so I was wondering why you got rid of it? src/hotspot/share/interpreter/bytecodeTracer.cpp line 195: > 193: s.set_interval(from, to); > 194: > 195: ttyLocker ttyl; // keep the following output coherent The previous method using ttyLocker states: // The ttyLocker also prevents races between two threads // trying to use the single instance of BytecodePrinter. Is that also a concern in this method? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17486#discussion_r1458328787 PR Review Comment: https://git.openjdk.org/jdk/pull/17486#discussion_r1458329948 From dholmes at openjdk.org Fri Jan 19 04:28:31 2024 From: dholmes at openjdk.org (David Holmes) Date: Fri, 19 Jan 2024 04:28:31 GMT Subject: RFR: 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 04:24:56 GMT, David Holmes wrote: >> I'm removing some instances of `ttyLocker`. Instead of locking, I first put all the output on a `stringStream`, and then print this stream all at once, which is atomic. >> >> Removing the `ttyLocker` is nice, because it means we have less interference with other locking mechanisms, such as the `extra_data_lock` cases I have to introduce with [JDK-8306767](https://bugs.openjdk.org/browse/JDK-8306767). >> >> @coleenp wished that I do this separately, so I filed this RFE here. > > src/hotspot/share/interpreter/bytecodeTracer.cpp line 195: > >> 193: s.set_interval(from, to); >> 194: >> 195: ttyLocker ttyl; // keep the following output coherent > > The previous method using ttyLocker states: > > // The ttyLocker also prevents races between two threads > // trying to use the single instance of BytecodePrinter. > > Is that also a concern in this method? 
FTR the code should not be assuming that `st` is `tty`! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17486#discussion_r1458330318 From epeter at openjdk.org Fri Jan 19 07:51:34 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 19 Jan 2024 07:51:34 GMT Subject: RFR: 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 04:22:23 GMT, David Holmes wrote: >> I'm removing some instances of `ttyLocker`. Instead of locking, I first put all the output on a `stringStream`, and then print this stream all at once, which is atomic. >> >> Removing the `ttyLocker` is nice, because it means we have less interference with other locking mechanisms, such as the `extra_data_lock` cases I have to introduce with [JDK-8306767](https://bugs.openjdk.org/browse/JDK-8306767). >> >> @coleenp wished that I do this separately, so I filed this RFE here. > > src/hotspot/share/code/nmethod.hpp line 624: > >> 622: // print output in opt build for disassembler library >> 623: void print_relocations() PRODUCT_RETURN; >> 624: void print_pcs() { print_pcs_on(tty); } > > This is a very common pattern, so I was wondering why you got rid of it? There is simply no use of `print_pcs` any more, I changed everything to `print_pcs_on`. So I thought I'd remove it, is that ok? Plus: `print_pcs` used to be the virtual method, and `print_pcs_on` only a local method, that was called by this override. But now I'd rather use `print_pcs_on` with a parameter outputStream, and so I made that one the virtual method. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17486#discussion_r1458512810 From epeter at openjdk.org Fri Jan 19 08:06:30 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 19 Jan 2024 08:06:30 GMT Subject: RFR: 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 04:25:45 GMT, David Holmes wrote: >> src/hotspot/share/interpreter/bytecodeTracer.cpp line 195: >> >>> 193: s.set_interval(from, to); >>> 194: >>> 195: ttyLocker ttyl; // keep the following output coherent >> >> The previous method using ttyLocker states: >> >> // The ttyLocker also prevents races between two threads >> // trying to use the single instance of BytecodePrinter. >> >> Is that also a concern in this method? > > FTR the code should not be assuming that `st` is `tty`! > Is that also a concern in this method? I don't think so, since `BytecodeTracer::print_method_codes` has its own local instance of `BytecodePrinter`, whereas `BytecodeTracer::trace_interpreter` uses the global instance `_interpreter_printer`. Using the `ttyLocker` for mutual exclusion on `_interpreter_printer` seems a bit ugly, and the comment seems to suggest as much. We can fix that in the future, if we want. > FTR the code should not be assuming that st is tty! Why are you saying that? I guess, yes, the code was already suspicious, since `st` was not guaranteed to be `tty`. But taking the `ttyLocker` was kinda ok whether it was tty or not. Anyway, it is better if it is gone now.
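The buffering pattern discussed in this thread (collect everything in a private stream, then emit it in one go instead of holding a lock while printing) can be sketched in standard C++ roughly as follows. This is only an illustration: `std::ostringstream` stands in for HotSpot's `stringStream`, and `print_pcs_on` and `print_report` here are hypothetical helpers, not the functions touched by the PR. Whether a single write is truly atomic depends on the underlying stream, which this sketch does not try to guarantee.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a print_*_on(outputStream*) style method:
// it writes to whatever stream it is given and never assumes a global tty.
void print_pcs_on(std::ostream& st, const int* pcs, int count) {
  for (int i = 0; i < count; ++i) {
    st << "pc " << pcs[i] << '\n';
  }
}

// Build the whole report in a local buffer, then hand it to the shared
// stream with a single write, so no lock is held while formatting.
void print_report(std::ostream& out, const int* pcs, int count) {
  std::ostringstream buf;            // private to this call, no sharing
  buf << "PcDescs:\n";
  print_pcs_on(buf, pcs, count);
  const std::string text = buf.str();
  out.write(text.data(), static_cast<std::streamsize>(text.size()));
}
```

Passing the destination stream as a parameter also sidesteps the concern raised above about code assuming that `st` is `tty`.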
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17486#discussion_r1458539949 From aboldtch at openjdk.org Fri Jan 19 08:40:29 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 19 Jan 2024 08:40:29 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v8] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> Message-ID: On Thu, 18 Jan 2024 22:27:43 GMT, Coleen Phillimore wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - top load adjustments >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Fix type >> - Move inflated check in fast_locked >> - Move top load >> - 8319799: Recursive lightweight locking: x86 implementation >> - ... and 1 more: https://git.openjdk.org/jdk/compare/2adc7506...71c48af6 > > src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 120: > >> 118: // The owner may be anonymous and we removed the last obj entry in >> 119: // the lock-stack. This loses the information about the owner. >> 120: // Write the thread to the owner field so the runtime knows the owner. > > I'm confused by this comment. We get here if the monitor is inflated, so we didn't remove it from the lock stack. True. This comment was written when there was an explicit monitor check before the CAS that jumped to inflated. I am not sure if there is a situation where the owner is anonymous here now. 
It should be invariant that if a thread whose lock stack does not contain the oop performs an unlock/monitorexit, the monitor is inflated and the owner is not anonymous. At all places in the runtime when removing the oops from the lock stack the owner field is fixed. And in the emitted code the oop is pushed back to the lock stack in case of a failed unlock. It may be worth keeping this, and in the slow path after the CAS failed, check if it failed because of inflation, fix the owner field and jump back to the inflated fast path without transitioning to VM. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458579707 From aph at openjdk.org Fri Jan 19 08:45:28 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 19 Jan 2024 08:45:28 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: Message-ID: <-YBQsBIin04dUEAKg6A4v2LsNWDrkc1FAahWkyorEDQ=.6bb80567-3ffd-45e9-8c3d-505154c6b6b8@github.com> On Mon, 15 Jan 2024 16:25:08 GMT, Fredrik Bredberg wrote: > The Java option OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none". > > However, on some CPUs the default SpinPause instruction is changed to something better if the user hasn't used the OnSpinWaitInst option. For instance if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs. > > This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb". > > Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64.
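To make the four option values concrete, here is a minimal, hedged sketch of a bounded spin-wait loop that issues a selectable pause hint. It is not HotSpot's `SpinPause()`; the names `SpinInst`, `spin_pause` and `spin_until_set` are invented for illustration, the inline assembly is only emitted when compiling for AArch64 with GCC or Clang, and on other targets the hint is simply a no-op.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical mirror of the "none", "nop", "isb" and "yield" choices.
enum class SpinInst { None, Nop, Isb, Yield };

// Issue the selected hint once. Outside AArch64 (or with an unknown
// compiler) this does nothing, which corresponds to "none".
inline void spin_pause(SpinInst inst) {
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
  switch (inst) {
    case SpinInst::Nop:   __asm__ volatile("nop"   ::: "memory"); break;
    case SpinInst::Isb:   __asm__ volatile("isb"   ::: "memory"); break;
    case SpinInst::Yield: __asm__ volatile("yield" ::: "memory"); break;
    case SpinInst::None:  break;
  }
#else
  (void)inst;
#endif
}

// Spin for at most max_spins iterations waiting for flag to become true.
// Returns whether the flag was observed set.
inline bool spin_until_set(const std::atomic<bool>& flag, int max_spins,
                           SpinInst inst) {
  for (int i = 0; i < max_spins; ++i) {
    if (flag.load(std::memory_order_acquire)) {
      return true;
    }
    spin_pause(inst);
  }
  return flag.load(std::memory_order_acquire);
}
```

A loop like this, run under contention with each hint in turn, is the kind of experiment the thread is asking for; the sketch itself says nothing about which instruction performs best.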
So, if I may summarize: Some Arm software uses ISB as a spin pause, and some claim better performance in some cases, but we have no supporting data. At present, on Apple silicon, spin pause is a nop. Apple silicon is an in-house design, which speculates more than other AArch64 implementations, and has more to lose with an ISB. That doesn't mean that an ISB on Apple silicon is bad for the purpose, it's just that we don't know. I was hoping that we'd have an opportunity to do some experiments on contended spin locks to try some alternatives. I was also hoping that the PR to implement spin pause on some target would be a forcing function in that direction. YIELD, which is the instruction actually intended for this purpose, has been implemented by Arm as a nop, which is why we're looking for alternatives. WFET is another possibility. But "do nothing" is not a neutral position, even though we have no basis on which to make a decision. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1899987120 From aph at openjdk.org Fri Jan 19 09:01:28 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 19 Jan 2024 09:01:28 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 02:09:15 GMT, Jiangli Zhou wrote: > > > It seems that we may be converging on using hotspot namespace? > > > > > > Based on previous discussions I had been expecting to see a JEP on this after last US summer. I was surprised to see this PR pop up in this form. > > Ah, I see. Thanks for the clarification. I had an offline conversation with @iklam about the namespace and JEP topic during last August's JVM Language Summit. Based on the feedback from the conversation, it was not very clear if the namespace approach was broadly acceptable. Using a default namespace for everything could have bad effects on debugging and other maintenance tools. It's unlikely to be a low-cost option.
Perhaps it could be restricted to the static linking case, but that further complicates testing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1900008812 From aboldtch at openjdk.org Fri Jan 19 09:15:29 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 19 Jan 2024 09:15:29 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v8] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> Message-ID: On Thu, 18 Jan 2024 15:47:52 GMT, Coleen Phillimore wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - top load adjustments >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Fix type >> - Move inflated check in fast_locked >> - Move top load >> - 8319799: Recursive lightweight locking: x86 implementation >> - ... and 1 more: https://git.openjdk.org/jdk/compare/3042a805...71c48af6 > > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 968: > >> 966: >> 967: // Load the mark. >> 968: movptr(mark, Address(obj, oopDesc::mark_offset_in_bytes())); > > Can this potentially throw NPE? Like in the c1 case right? Maybe worth a comment if so. It doesn't look like the c2 code does anything special here like the c1 code does, except this goes to the signal handler, then throws the NPE when it continues. AFAIK C2 null checks are a part of the graph. 
I believe they can become explicit, elided or implicit deopt trap depending on control flow and profile data. obj should always be non-null here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458629652 From jsjolen at openjdk.org Fri Jan 19 09:41:36 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 19 Jan 2024 09:41:36 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v28] In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 06:50:49 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). >> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below; the patch improves the time when the system call is supported and does not hurt when it is not:
>>
>> | Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
>> |--------|--------------------------------------|------------------------------------|--------------------------------------|------------------------------------|
>> | 4.18   | 11.30                                | 11.30                              | 0.25                                 | 0.25                               |
>> | 5.13   | 0.22                                 | 0.22                               | 3.42                                 | 3.42                               |
>> | 6.1    | 0.27                                 | 0.33                               | 3.54                                 | 0.33                               |
>>
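As a rough illustration of the approach described above (try `madvise` with `MADV_POPULATE_WRITE`, and fall back to touching each page with an atomic add of 0 when the kernel does not support it), a Linux-only sketch might look like this. It is not the code from the patch: `populate_write_supported` and `pretouch_range` are hypothetical names, the `#define` fallback assumes the Linux uapi value 23 for toolchains whose headers predate the constant, and `__atomic_fetch_add` is the GCC/Clang builtin rather than HotSpot's `Atomic` API.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23  // assumed Linux uapi value; absent from older libc headers
#endif

// Probe once whether the running kernel accepts MADV_POPULATE_WRITE.
// A zero-length request succeeds only if the advice value is known.
bool populate_write_supported() {
  return ::madvise(nullptr, 0, MADV_POPULATE_WRITE) == 0;
}

// Pretouch [start, start + bytes). Returns true if the kernel populated the
// range via madvise, false if the per-page fallback was used instead.
// start is assumed to be page-aligned, as memory returned by mmap is.
bool pretouch_range(char* start, std::size_t bytes, bool use_populate_write) {
  if (use_populate_write &&
      ::madvise(start, bytes, MADV_POPULATE_WRITE) == 0) {
    return true;
  }
  const std::size_t page = static_cast<std::size_t>(::sysconf(_SC_PAGESIZE));
  for (std::size_t off = 0; off < bytes; off += page) {
    // Atomic add of 0: forces a write fault without changing the contents.
    __atomic_fetch_add(start + off, 0, __ATOMIC_RELAXED);
  }
  return false;
}
```

Taking the decision as a parameter keeps a user-provided setting separate from the availability probe, so the two can be combined by the caller rather than one overwriting the other.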
> > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Untabify src/hotspot/os/linux/os_linux.cpp line 4403: > 4401: // Check the availability of MADV_POPULATE_WRITE. > 4402: UseMadvPopulateWrite = (::madvise(0, 0, MADV_POPULATE_WRITE) == 0); > 4403: What happens if the user sets `UseMadvPopulateWrite` to false when starting the JVM? It should not be used then. test/hotspot/gtest/runtime/test_os_linux.cpp line 407: > 405: } > 406: } > 407: Is there a reason that you want to use pthreads directly instead of using `TestThreadGroup`? Draft:

```c++
char* heap = os::reserve_memory(1 * G, false, mtInternal);
size_t size = 1 * G;
auto pretouch = [&](Thread* c, int id) {
  os::pretouch_memory(heap, heap + byte, os::vm_page_size());
};
auto use_memory = [&](Thread* c, int id) {
  int* iptr = reinterpret_cast<int*>(heap);
  for (int i = 0; i < 1000 && (size_t)i < (byte / (sizeof(int))); i++)
    *iptr++ = i;
};
TestThreadGroup users_t(use_memory, 8);
TestThreadGroup pretouch_t(pretouch, 1);
users_t.doit();
pretouch_t.doit();
users_t.join();
pretouch_t.join();
```

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1458667877 PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1458665368 From aboldtch at openjdk.org Fri Jan 19 09:49:29 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 19 Jan 2024 09:49:29 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v8] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> Message-ID: On Thu, 18 Jan 2024 15:56:07 GMT, Coleen Phillimore wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains 11 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - top load adjustments >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Fix type >> - Move inflated check in fast_locked >> - Move top load >> - 8319799: Recursive lightweight locking: x86 implementation >> - ... and 1 more: https://git.openjdk.org/jdk/compare/39a27147...71c48af6 > > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 977: > >> 975: jcc(Assembler::notZero, inflated); >> 976: >> 977: // Check if lock-stack is full. > > Why doesn't this call MacroAssembler::lightweight_lock here? The main idea is to keep it separate, where the MacroAssembler caters to the interpreter's needs. The biggest difference is that the interpreter has one fewer register, so it does some more juggling. And on a fundamental level the interpreter must handle unstructured locking, while C2 does not. There is no practical difference to the lock/enter logic because of this difference, but there is for the unlock/exit logic. At some level the split should be that we have one implementation that handles the fixed amount of registers and unstructured locking scenario (interpreter). And one implementation which handles the case where we have a register allocator and assumes structured locking (C1, C2). The native wrapper is somewhere in-between, is structured, but maybe has to use the first implementation due to register pressure.
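The difference between structured and unstructured exit can be illustrated with a toy per-thread lock stack. This is a simplified model written for this discussion, not HotSpot's `LockStack` and not the emitted assembly: `ToyLockStack` and its members are hypothetical, the capacity of 8 is arbitrary, and the mark word and monitor inflation are reduced to return values the caller would have to act on.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of a per-thread lock stack (illustrative names only).
struct ToyLockStack {
  static constexpr std::size_t kCapacity = 8;  // arbitrary for the sketch
  const void* slots[kCapacity] = {};
  std::size_t top = 0;

  bool is_full() const { return top == kCapacity; }

  // First acquisition of an object by this thread: push it.
  bool push(const void* obj) {
    if (is_full()) return false;
    slots[top++] = obj;
    return true;
  }

  // Recursive enter: succeeds only if obj is already the top entry,
  // so the mark word never has to be looked at.
  bool try_recursive_enter(const void* obj) {
    if (top == 0 || is_full() || slots[top - 1] != obj) return false;
    slots[top++] = obj;
    return true;
  }

  enum class Exit { Recursive, LastEntry, NotOnTop };

  // Structured exit, as assumed for compiled code: obj must be on top.
  //   Recursive: popped one entry, the thread still holds the lock.
  //   LastEntry: popped the final entry, the caller must release the lock.
  //   NotOnTop:  nothing popped; under structured locking this means the
  //              lock is not a lightweight lock held via this stack.
  Exit structured_exit(const void* obj) {
    if (top == 0 || slots[top - 1] != obj) return Exit::NotOnTop;
    --top;
    if (top > 0 && slots[top - 1] == obj) return Exit::Recursive;
    return Exit::LastEntry;
  }
};
```

An unstructured exit, which the interpreter has to tolerate, would additionally need to search below the top entry; the sketch leaves that out on purpose, since that extra work is the part the structured path is said to avoid.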
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458680074 From aboldtch at openjdk.org Fri Jan 19 09:53:30 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 19 Jan 2024 09:53:30 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v8] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> Message-ID: On Thu, 18 Jan 2024 20:55:56 GMT, Coleen Phillimore wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - top load adjustments >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Fix type >> - Move inflated check in fast_locked >> - Move top load >> - 8319799: Recursive lightweight locking: x86 implementation >> - ... and 1 more: https://git.openjdk.org/jdk/compare/a6c71235...71c48af6 > > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1013: > >> 1011: // Recursive. >> 1012: increment(Address(tagged_monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(recursions))); >> 1013: } > > This is sort of like the code above in fast_lock() but it's much clearer separated here. So I think this is good. > > This function passes in thread so works for 32 bits as well? Correct, they were merged here as the juggling of registers and the access to thread was the only difference between the implementations. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458685680 From aboldtch at openjdk.org Fri Jan 19 10:00:31 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 19 Jan 2024 10:00:31 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v9] In-Reply-To: References: Message-ID: > Implements the x86 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The x86 C2 port also has some extra oddities. > > The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. > > The fast path is written to avoid going through conditional branches. 
This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. > > The contended unlock was also moved to the code stub. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 14 additional commits since the last revision: - Remove C2HandleAnonOMOwnerStub definitions on x86. - Add MFENCE comment - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - top load adjustments - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Fix type - ... 
and 4 more: https://git.openjdk.org/jdk/compare/1c7d7234...2c709241 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16607/files - new: https://git.openjdk.org/jdk/pull/16607/files/71c48af6..2c709241 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=07-08 Stats: 11400 lines in 241 files changed: 7434 ins; 2960 del; 1006 mod Patch: https://git.openjdk.org/jdk/pull/16607.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16607/head:pull/16607 PR: https://git.openjdk.org/jdk/pull/16607 From aboldtch at openjdk.org Fri Jan 19 10:00:36 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 19 Jan 2024 10:00:36 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v9] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> Message-ID: On Thu, 18 Jan 2024 21:58:36 GMT, Coleen Phillimore wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 14 additional commits since the last revision: >> >> - Remove C2HandleAnonOMOwnerStub definitions on x86. >> - Add MFENCE comment >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - top load adjustments >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Fix type >> - ... 
and 4 more: https://git.openjdk.org/jdk/compare/1c7d7234...2c709241 > > src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 104: > >> 102: // continuation is the slow_path. >> 103: __ jmp(continuation()); >> 104: } > > It seems like two stubs small stubs might be better since they don't really share very much (?) rather than one with two entry points and two exit points. That sounds like a good idea. Will see how it ends up. > src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 130: > >> 128: __ movptr(Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)), NULL_WORD); >> 129: >> 130: // Fence. > > // Instead of MFENCE we use a dummy locked add of 0 to the top-of-stack. > Can you add this comment? Fixed. > src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 154: > >> 152: >> 153: #ifdef _LP64 >> 154: int C2HandleAnonOMOwnerStub::max_size() const { > > Do we need the C2HandleAnonOMOwnerStub anymore? Correct, not for x86_32/x86_64. Removed. > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1120: > >> 1118: xorptr(reg_rax, reg_rax); >> 1119: orptr(reg_rax, Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(recursions))); >> 1120: jcc(Assembler::notZero, check_successor); > > I don't know why the LP64/!LP64 paths are different. Do we not decrement recursions on 32 bit, and why wouldn't we? The idea was not to change the inflated unlocking in this PR. x86_32 does not handle recursions nor successor optimisation. I see no reason that they cannot be merged and just have 32bit use the 64bit logic. However the thinking was to keep that to a separate RFE. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458694728 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458693292 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458693056 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458691485 From aboldtch at openjdk.org Fri Jan 19 10:07:28 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 19 Jan 2024 10:07:28 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v9] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> Message-ID: On Fri, 19 Jan 2024 09:56:19 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 104: >> >>> 102: // continuation is the slow_path. >>> 103: __ jmp(continuation()); >>> 104: } >> >> It seems like two stubs small stubs might be better since they don't really share very much (?) rather than one with two entry points and two exit points. > > That sounds like a good idea. Will see how it ends up. It is a little bit awkward that they both have to restore the held monitor count. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1458708429 From tschatzl at openjdk.org Fri Jan 19 10:30:28 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 19 Jan 2024 10:30:28 GMT Subject: RFR: 8323685: PrintSystemDictionaryAtExit has mutex rank assert In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 22:55:30 GMT, Coleen Phillimore wrote: > Use variant 2 change from PR https://github.com/openjdk/jdk/pull/16062 that allows the ConcurrentHashTable to specify lock ranking for the resize lock at construction. SystemDictionary printing takes out both CHT resize lock and the ttyLocker. > > Tested with tier1-7, special stress test in tier7 to verify JDK-8317440 is still fixed. Seems to be just like that suggestion in #16062. 
Feel free to ignore the other comment, at least I try to avoid extra files if not necessary. test/hotspot/jtreg/runtime/PrintingTests/SampleClass.java line 29: > 27: System.out.println("Hello from the sample class"); > 28: } > 29: } My recommendation to decrease the amount of noise would be to put this class into the other test file, and use `SampleClass.class.getname()` to use in `ProcessTools`. Avoids an extra file and an extra copyright header... ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17471#pullrequestreview-1832096672 PR Review Comment: https://git.openjdk.org/jdk/pull/17471#discussion_r1458744224 From jwaters at openjdk.org Fri Jan 19 12:11:33 2024 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 19 Jan 2024 12:11:33 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 13:23:45 GMT, Julian Waters wrote: >> Compile the JDK as C++17, enabling the use of all C++17 language features > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Require clang 13 in toolchain.m4 Should I split the compiler upgrades into a different change and integrate that first? 
Going off the conversation in this thread it would seem like the compiler upgrade would benefit us a lot more than just having C++17 (The noreturn attribute is one big motivating factor for instance) and it might help if the compiler upgrades were not delayed by the discussion of when to jump to C++17 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1900300365 From rkennke at openjdk.org Fri Jan 19 13:02:35 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 19 Jan 2024 13:02:35 GMT Subject: Integrated: 8322383: G1: Only preserve marks on objects that are actually moved In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 16:20:07 GMT, Roman Kennke wrote: > The G1 full-GC preserves marks during marking, for all live objects in compaction region. However, not all live objects do actually move. In particular, the start of a compaction chain may have a sediment of all-live objects which would not move, and thus don't need to have their marks preserved. > The problem can easily be solved by preserving marks during forwarding. That also seems a more natural place to do that. > > Testing: > - [x] hotspot_gc > - [x] tier1 > - [ ] tier2 This pull request has now been integrated. Changeset: 16be3888 Author: Roman Kennke URL: https://git.openjdk.org/jdk/commit/16be38887f878b508e22d491542765bf7e518f94 Stats: 41 lines in 7 files changed: 18 ins; 13 del; 10 mod 8322383: G1: Only preserve marks on objects that are actually moved Reviewed-by: ayang, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/17159 From pchilanomate at openjdk.org Fri Jan 19 14:00:27 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 19 Jan 2024 14:00:27 GMT Subject: RFR: 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static [v2] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 13:09:39 GMT, Serguei Spitsyn wrote: >> The notification method `VirtualThread.notifyJvmtiDisableSuspend` should be static. 
>> The method disables/enables suspend of the current virtual thread, a no-op if the current thread is a platform thread. It is confusing for this to be an instance method, it should be static to make it clearer that it doesn't change the target thread. >> The notification method `VirtualThread.notifyJvmtiHideFrames` also has to be static as it does not use/need the virtual thread `this` argument. >> One detail to underline is the intrinsic implementation needs to use the argument #0 instead of #1. >> >> Testing: >> - The mach5 tiers 1-6 show no regressions > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge > - 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static Looks good to me. I'm confused about the description saying this method is a no-op if the current thread is a platform thread. ------------- Marked as reviewed by pchilanomate (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17298#pullrequestreview-1832725029 From coleenp at openjdk.org Fri Jan 19 14:03:30 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Jan 2024 14:03:30 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 00:14:58 GMT, Jiangli Zhou wrote: > Please review this PR with a simple solution for resolving duplicate `Thread` symbol issue. In https://github.com/openjdk/jdk/pull/14808 comments, there was an alternative suggestion to redefine the symbol at build time, such as using`-DThread=HotSpotThread`. That would not address issues when symbol were references as string literals. https://github.com/openjdk/jdk/pull/14808 also discussed using namespace for hotspot code, which can have multiple benefits/motivations. 
We could explore further using namespace with more consensus on that approach. > > Contributed by Chuck Rasbold and @jianglizhou. You could support one build by adding something like -DSUPPORTS_STATIC_LINK for both .so and .a builds for Google, then use that to protect the renaming. I don't know how bad "namespace hotspot" would be for debugging. At least for some of the common names. I suppose breakpoints would have to be specified in gdb as break at hotspot::Thread::is_owning_thread or something like that, and with a using namespace hotspot, it wouldn't be visible looking at the source code in that form. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1900475932 From ihse at openjdk.org Fri Jan 19 14:07:31 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 19 Jan 2024 14:07:31 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 13:23:45 GMT, Julian Waters wrote: >> Compile the JDK as C++17, enabling the use of all C++17 language features > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Require clang 13 in toolchain.m4 Well, the only additional thing this PR does except raise the compiler version is to change the `--std` flag. It is a bit unclear what that means. For the JDK libraries, there are already code present that relies on C++17. For hotspot, what C++ constructions to use is strictly limited by the code standard document. As long as it does not mention any C++17 constructs, it does not really matter what the `--std` flag says. But, otoh, to be able to say something about C++17, we need first have proper support from all compilers. So I'd say just chill a bit, give folks some time to respond. 
My understanding of the situation is as follows: * Raising clang to 13.0 is uncontroversial * Raising xlc to 17.1.1.4 seems acceptable by the folks using it (I hope I got that right) * Raising gcc to 10.0 met some resistance. We could stop at gcc 9.0 for this PR (which is enough for C++17), and then continue discussing going to gcc 10.0 in a separate PR, or we can wait a bit more to see if @shipilev feels compelled by the arguments given in the discussion to accept going to 10. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1900481800 From coleenp at openjdk.org Fri Jan 19 14:20:43 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Jan 2024 14:20:43 GMT Subject: RFR: 8323685: PrintSystemDictionaryAtExit has mutex rank assert [v2] In-Reply-To: References: Message-ID: > Use variant 2 change from PR https://github.com/openjdk/jdk/pull/16062 that allows the ConcurrentHashTable to specify lock ranking for the resize lock at construction. SystemDictionary printing takes out both CHT resize lock and the ttyLocker. > > Tested with tier1-7, special stress test in tier7 to verify JDK-8317440 is still fixed. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fold SampleClass into printing test. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/17471/files - new: https://git.openjdk.org/jdk/pull/17471/files/e08dfa30..0adf54c0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17471&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17471&range=00-01 Stats: 37 lines in 2 files changed: 6 ins; 30 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17471.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17471/head:pull/17471 PR: https://git.openjdk.org/jdk/pull/17471 From coleenp at openjdk.org Fri Jan 19 14:20:45 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Jan 2024 14:20:45 GMT Subject: RFR: 8323685: PrintSystemDictionaryAtExit has mutex rank assert [v2] In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 10:26:22 GMT, Thomas Schatzl wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fold SampleClass into printing test. > > test/hotspot/jtreg/runtime/PrintingTests/SampleClass.java line 29: > >> 27: System.out.println("Hello from the sample class"); >> 28: } >> 29: } > > My recommendation to decrease the amount of noise would be to put this class into the other test file, and use `SampleClass.class.getname()` to use in `ProcessTools`. > Avoids an extra file and an extra copyright header... Thanks for the suggestion, it's much better. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17471#discussion_r1459075548 From ayang at openjdk.org Fri Jan 19 14:28:27 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 19 Jan 2024 14:28:27 GMT Subject: RFR: 8323685: PrintSystemDictionaryAtExit has mutex rank assert [v2] In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 14:20:43 GMT, Coleen Phillimore wrote: >> Use variant 2 change from PR https://github.com/openjdk/jdk/pull/16062 that allows the ConcurrentHashTable to specify lock ranking for the resize lock at construction. 
SystemDictionary printing takes out both CHT resize lock and the ttyLocker. >> >> Tested with tier1-7, special stress test in tier7 to verify JDK-8317440 is still fixed. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fold SampleClass into printing test. Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17471#pullrequestreview-1832830780 From alanb at openjdk.org Fri Jan 19 14:51:26 2024 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 19 Jan 2024 14:51:26 GMT Subject: RFR: 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static [v2] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 13:09:39 GMT, Serguei Spitsyn wrote: >> The notification method `VirtualThread.notifyJvmtiDisableSuspend` should be static. >> The method disables/enables suspend of the current virtual thread, a no-op if the current thread is a platform thread. It is confusing for this to be an instance method, it should be static to make it clearer that it doesn't change the target thread. >> The notification method `VirtualThread.notifyJvmtiHideFrames` also has to be static as it does not use/need the virtual thread `this` argument. >> One detail to underline is the intrinsic implementation needs to use the argument #0 instead of #1. >> >> Testing: >> - The mach5 tiers 1-6 show no regressions > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge > - 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static Marked as reviewed by alanb (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/17298#pullrequestreview-1832909249 From eosterlund at openjdk.org Fri Jan 19 14:59:49 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 19 Jan 2024 14:59:49 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints Message-ID: ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. I have tested the changes from tier1-7, and run through full aurora performance tests.
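As a rough illustration of the idea described above (data fixed at resolution time, only the call destination changing afterwards), here is a small self-contained sketch. Everything in it is hypothetical: `CallSiteDataModel`, `InlineCacheModel`, and the two target functions are made-up stand-ins, not the CompiledICData struct or any HotSpot API.

```c++
#include <atomic>
#include <cassert>

// Hypothetical call target type for this sketch.
using Entry = int (*)(int);

// Stand-in for per-call-site data: filled in once, never rewritten.
struct CallSiteDataModel {
  const int speculated_id;
};

class InlineCacheModel {
  const CallSiteDataModel _data;    // immutable after construction
  std::atomic<Entry> _destination;  // the only part that ever changes

 public:
  InlineCacheModel(int speculated_id, Entry initial)
      : _data{speculated_id}, _destination(initial) {
    assert(initial != nullptr);
  }

  // A state transition is a single atomic store of the new destination,
  // so there is no data/destination pair to update together.
  void set_destination(Entry e) {
    _destination.store(e, std::memory_order_release);
  }

  int call(int arg) const {
    return _destination.load(std::memory_order_acquire)(arg);
  }

  int speculated_id() const { return _data.speculated_id; }
};

// Two made-up destinations the call site can point at.
inline int resolve_stub_model(int x) { return -x; }
inline int monomorphic_target_model(int x) { return x + 1; }
```

Because the data part is `const`, switching from `resolve_stub_model` to `monomorphic_target_model` touches only one word, which is the property that makes a separate stub for atomic two-field updates unnecessary in this toy model.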
------------- Commit messages: - 8322630: Remove ICStubs and related safepoints Changes: https://git.openjdk.org/jdk/pull/17495/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17495&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322630 Stats: 4077 lines in 137 files changed: 455 ins; 3099 del; 523 mod Patch: https://git.openjdk.org/jdk/pull/17495.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17495/head:pull/17495 PR: https://git.openjdk.org/jdk/pull/17495 From eosterlund at openjdk.org Fri Jan 19 14:59:49 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 19 Jan 2024 14:59:49 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 06:25:20 GMT, Erik Österlund wrote: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests. Thanks to the OpenJDK port maintainers for picking this up!
All added to contributors, hopefully. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1899847248 From avoitylov at openjdk.org Fri Jan 19 14:59:50 2024 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Fri, 19 Jan 2024 14:59:50 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 06:36:16 GMT, Erik Österlund wrote: >> ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. >> >> The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. >> >> With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. >> >> I have tested the changes from tier1-7, and run through full aurora performance tests. > > Thanks to the OpenJDK port maintainers for picking this up! All added to contributors, hopefully. ARM32 passed relevant tests after the update. Thanks @fisk !
------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1900217933 From dnsimon at openjdk.org Fri Jan 19 14:59:52 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 19 Jan 2024 14:59:52 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 06:25:20 GMT, Erik Österlund wrote: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests. src/hotspot/share/runtime/vmStructs.cpp line 214: > 212: volatile_nonstatic_field(ArrayKlass, _higher_dimension, ObjArrayKlass*) \ > 213: volatile_nonstatic_field(ArrayKlass, _lower_dimension, ArrayKlass*) \ > 214: volatile_nonstatic_field(CompiledICData, _speculated_method, Method*) \ Please duplicate these `CompiledICData` declarations in `vmStructs_jvmci.cpp` so that they can be used by Graal to ascertain whether ICStubs are in use (Graal is still supporting multi JDK versions).
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1458661257 From stuefe at openjdk.org Fri Jan 19 15:31:29 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 19 Jan 2024 15:31:29 GMT Subject: RFR: 8323685: PrintSystemDictionaryAtExit has mutex rank assert [v2] In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 14:20:43 GMT, Coleen Phillimore wrote: >> Use variant 2 change from PR https://github.com/openjdk/jdk/pull/16062 that allows the ConcurrentHashTable to specify lock ranking for the resize lock at construction. SystemDictionary printing takes out both CHT resize lock and the ttyLocker. >> >> Tested with tier1-7, special stress test in tier7 to verify JDK-8317440 is still fixed. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fold SampleClass into printing test. Could there be a reasonable default rank? ------------- PR Review: https://git.openjdk.org/jdk/pull/17471#pullrequestreview-1833036253 From coleenp at openjdk.org Fri Jan 19 15:36:30 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Jan 2024 15:36:30 GMT Subject: RFR: 8323685: PrintSystemDictionaryAtExit has mutex rank assert [v2] In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 14:20:43 GMT, Coleen Phillimore wrote: >> Use variant 2 change from PR https://github.com/openjdk/jdk/pull/16062 that allows the ConcurrentHashTable to specify lock ranking for the resize lock at construction. SystemDictionary printing takes out both CHT resize lock and the ttyLocker. >> >> Tested with tier1-7, special stress test in tier7 to verify JDK-8317440 is still fixed. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fold SampleClass into printing test. The default rank is nosafepoint-2, from before the G1 fix. G1 just needed a lower ranking. 
src/hotspot/share/utilities/concurrentHashTable.hpp line 410: > 408: size_t grow_hint = DEFAULT_GROW_HINT, > 409: bool enable_statistics = DEFAULT_ENABLE_STATISTICS, > 410: Mutex::Rank rank = Mutex::nosafepoint-2, This is the default rank. It seems reasonable to not stop for safepoints with the resizing lock (since it's the lock taken while in a safepoint - do_safepoint_scan). This was the default before it was lowered to service-1. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17471#issuecomment-1900635547 PR Review Comment: https://git.openjdk.org/jdk/pull/17471#discussion_r1459210898 From stuefe at openjdk.org Fri Jan 19 15:56:29 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 19 Jan 2024 15:56:29 GMT Subject: RFR: 8323685: PrintSystemDictionaryAtExit has mutex rank assert [v2] In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 15:33:44 GMT, Coleen Phillimore wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fold SampleClass into printing test. > > src/hotspot/share/utilities/concurrentHashTable.hpp line 410: > >> 408: size_t grow_hint = DEFAULT_GROW_HINT, >> 409: bool enable_statistics = DEFAULT_ENABLE_STATISTICS, >> 410: Mutex::Rank rank = Mutex::nosafepoint-2, > > This is the default rank. It seems reasonable to not stop for safepoints with the resizing lock (since it's the lock taken while in a safepoint - do_safepoint_scan). This was the default before it was lowered to service-1. Ah. Sorry for missing it. I was thrown off by the explicit constructor below and the different order of arguments. 
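For readers skimming the thread, the pattern under discussion (a constructor parameter for the resize-lock rank that defaults to the old value, so only callers needing a different rank pass one) can be sketched as below. The names and numeric rank values are invented for illustration and do not correspond to HotSpot's `Mutex::Rank` values.

```c++
#include <cassert>

// Hypothetical ranks; the numbers are illustrative only.
enum RankModel : int { kServiceModel = 10, kNoSafepointModel = 20 };

class HashTableModel {
  int _resize_lock_rank;

 public:
  // Defaulted last parameter: existing callers keep the default rank,
  // a caller that must nest under other locks passes a lower one.
  explicit HashTableModel(bool enable_statistics = false,
                          int resize_lock_rank = kNoSafepointModel - 2)
      : _resize_lock_rank(resize_lock_rank) {
    (void)enable_statistics;  // unused in this sketch
    assert(_resize_lock_rank > 0);
  }

  int resize_lock_rank() const { return _resize_lock_rank; }
};
```

A table constructed with no arguments gets the default rank, while `HashTableModel(false, kServiceModel - 1)` models a user that asks for a lower rank at construction.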
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17471#discussion_r1459240942 From coleenp at openjdk.org Fri Jan 19 16:13:33 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Jan 2024 16:13:33 GMT Subject: RFR: 8323685: PrintSystemDictionaryAtExit has mutex rank assert [v2] In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 15:53:31 GMT, Thomas Stuefe wrote: >> src/hotspot/share/utilities/concurrentHashTable.hpp line 410: >> >>> 408: size_t grow_hint = DEFAULT_GROW_HINT, >>> 409: bool enable_statistics = DEFAULT_ENABLE_STATISTICS, >>> 410: Mutex::Rank rank = Mutex::nosafepoint-2, >> >> This is the default rank. It seems reasonable to not stop for safepoints with the resizing lock (since it's the lock taken while in a safepoint - do_safepoint_scan). This was the default before it was lowered to service-1. > > Ah. Sorry for missing it. I was thrown off by the explicit constructor below and the different order of arguments. It took me a while to find it from Thomas's original patch too. Thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17471#discussion_r1459266735 From coleenp at openjdk.org Fri Jan 19 16:56:42 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Jan 2024 16:56:42 GMT Subject: Integrated: 8323685: PrintSystemDictionaryAtExit has mutex rank assert In-Reply-To: References: Message-ID: <5V8VLunKw66_Nwn5Tbbq55ovVEN9MRvjZxOkiqX_O8I=.8e876a0c-c506-4c81-91ef-a6a968bbbd85@github.com> On Wed, 17 Jan 2024 22:55:30 GMT, Coleen Phillimore wrote: > Use variant 2 change from PR https://github.com/openjdk/jdk/pull/16062 that allows the ConcurrentHashTable to specify lock ranking for the resize lock at construction. SystemDictionary printing takes out both CHT resize lock and the ttyLocker. > > Tested with tier1-7, special stress test in tier7 to verify JDK-8317440 is still fixed. This pull request has now been integrated.
Changeset: 2865afe7 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/2865afe759fd5362abd0947fd4c1f5c8d3519ca3 Stats: 134 lines in 7 files changed: 76 ins; 45 del; 13 mod 8323685: PrintSystemDictionaryAtExit has mutex rank assert Co-authored-by: Thomas Schatzl Reviewed-by: tschatzl, ayang ------------- PR: https://git.openjdk.org/jdk/pull/17471 From coleenp at openjdk.org Fri Jan 19 16:56:40 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Jan 2024 16:56:40 GMT Subject: RFR: 8323685: PrintSystemDictionaryAtExit has mutex rank assert [v2] In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 14:20:43 GMT, Coleen Phillimore wrote: >> Use variant 2 change from PR https://github.com/openjdk/jdk/pull/16062 that allows the ConcurrentHashTable to specify lock ranking for the resize lock at construction. SystemDictionary printing takes out both CHT resize lock and the ttyLocker. >> >> Tested with tier1-7, special stress test in tier7 to verify JDK-8317440 is still fixed. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fold SampleClass into printing test. Thanks Thomas and Albert for reviews, Thomas Stuefe for your comments. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17471#issuecomment-1900758459 From eastigeevich at openjdk.org Fri Jan 19 16:59:28 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Fri, 19 Jan 2024 16:59:28 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates [v3] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 09:17:35 GMT, Aleksey Shipilev wrote: >> We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. 
In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. >> >> There are a few interesting conversions along the way: >> 1. `intptr_t` -> `uint32_t` (this method) >> 2. `uint32_t` -> `int32_t` (argument conversion for `movl`) >> 3. `int32_t` -> `uint32_t` (in `emit_int32`) >> >> I believe these are safe after `is_uimm32` check, but please check (sic) me on this. >> >> Note that x86_64 matcher already does similar thing for immediates: >> >> >> // Long Immediate 32-bit unsigned >> operand immUL32() >> %{ >> predicate(n->get_long() == (unsigned int) (n->get_long())); >> match(ConL); >> ... >> %} >> >> instruct loadConUL32(rRegL dst, immUL32 src) >> %{ >> ... >> format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} >> ins_encode %{ >> __ movl($dst$$Register, $src$$constant); >> %} >> %} >> >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> >> Code sizes for `Hello World`, `-Xcomp`: >> >> >> # Before >> tier1 nmethod code size : 426208 bytes >> tier2 nmethod code size : 462880 bytes >> tier3 nmethod code size : 889992 bytes >> tier4 nmethod code size : 1244448 bytes >> >> # After >> tier1 nmethod code size : 425768 bytes (-0.1%) >> tier2 nmethod code size : 462400 bytes (-0.1%) >> tier3 nmethod code size : 882072 bytes (-0.8%) >> tier4 nmethod code size : 1236448 bytes (-0.6%) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Revert "Just do checked_cast" > > This reverts commit 3f94218b46b6b0492ffcc24404b7bb5546b3318a. lgtm ------------- Marked as reviewed by eastigeevich (Committer). 
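As a rough illustration of the immediate-range reasoning in the description above, the sketch below picks an instruction form from the value of a 64-bit immediate and shows that the three integer conversions listed there keep the low 32 bits intact. The names (`MovForm`, `fits_uimm32`, `fits_simm32`, `choose_mov_form`, `round_trip`) are hypothetical and are not the HotSpot assembler API; the selection order is an assumption based on the statement that the `movl` encoding is the shorter one.

```cpp
#include <cstdint>

// Which encoding a hypothetical assembler would choose for "load constant".
enum class MovForm {
  Movl32ZeroExtend,  // mov r32, imm32: upper 32 bits are zeroed by the CPU
  Movq32SignExtend,  // mov r64, simm32: immediate is sign-extended to 64 bits
  Movq64             // mov r64, imm64: full 64-bit immediate
};

constexpr bool fits_uimm32(int64_t v) {
  return v >= 0 && v <= static_cast<int64_t>(UINT32_MAX);
}

constexpr bool fits_simm32(int64_t v) {
  return v >= INT32_MIN && v <= INT32_MAX;
}

// Prefer the zero-extending 32-bit form whenever the value allows it.
constexpr MovForm choose_mov_form(int64_t imm) {
  return fits_uimm32(imm) ? MovForm::Movl32ZeroExtend
       : fits_simm32(imm) ? MovForm::Movq32SignExtend
                          : MovForm::Movq64;
}

// The conversion chain from the description: intptr_t -> uint32_t -> int32_t
// -> uint32_t. The middle step is implementation-defined before C++20 for
// values above INT32_MAX, but mainstream compilers keep the bit pattern.
constexpr uint32_t round_trip(int64_t imm) {
  return static_cast<uint32_t>(static_cast<int32_t>(static_cast<uint32_t>(imm)));
}

// 0x80000000 does not fit in a signed 32-bit immediate, but it does fit unsigned.
static_assert(!fits_simm32(0x80000000LL) && fits_uimm32(0x80000000LL),
              "the case the change is meant to cover");
```

Negative values such as -1 still need the sign-extending form, because zero-extension would produce 0x00000000FFFFFFFF rather than all ones.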
PR Review: https://git.openjdk.org/jdk/pull/17343#pullrequestreview-1833261094 From sspitsyn at openjdk.org Fri Jan 19 18:43:38 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 19 Jan 2024 18:43:38 GMT Subject: RFR: 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static [v2] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 13:09:39 GMT, Serguei Spitsyn wrote: >> The notification method `VirtualThread.notifyJvmtiDisableSuspend` should be static. >> The method disables/enables suspend of the current virtual thread, a no-op if the current thread is a platform thread. It is confusing for this to be an instance method, it should be static to make it clearer that it doesn't change the target thread. >> The notification method `VirtualThread.notifyJvmtiHideFrames` also has to be static as it does not use/need the virtual thread `this` argument. >> One detail to underline is the intrinsic implementation needs to use the argument #0 instead of #1. >> >> Testing: >> - The mach5 tiers 1-6 show no regressions > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge > - 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static Patricio and Alan, thank you a lot for review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17298#issuecomment-1900919472 From sspitsyn at openjdk.org Fri Jan 19 18:43:39 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 19 Jan 2024 18:43:39 GMT Subject: Integrated: 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static In-Reply-To: References: Message-ID: On Mon, 8 Jan 2024 07:55:45 GMT, Serguei Spitsyn wrote: > The notification method `VirtualThread.notifyJvmtiDisableSuspend` should be static. 
> The method disables/enables suspend of the current virtual thread, a no-op if the current thread is a platform thread. It is confusing for this to be an instance method, it should be static to make it clearer that it doesn't change the target thread. > The notification method `VirtualThread.notifyJvmtiHideFrames` also has to be static as it does not use/need the virtual thread `this` argument. > One detail to underline is the intrinsic implementation needs to use the argument #0 instead of #1. > > Testing: > - The mach5 tiers 1-6 show no regressions This pull request has now been integrated. Changeset: 8700de66 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/8700de66e45b526958c7a2923d43abe2a736d1d2 Stats: 14 lines in 5 files changed: 0 ins; 0 del; 14 mod 8322744: VirtualThread.notifyJvmtiDisableSuspend should be static Reviewed-by: pchilanomate, alanb ------------- PR: https://git.openjdk.org/jdk/pull/17298 From jiangli at openjdk.org Fri Jan 19 20:15:27 2024 From: jiangli at openjdk.org (Jiangli Zhou) Date: Fri, 19 Jan 2024 20:15:27 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 08:58:29 GMT, Andrew Haley wrote: > > > > It seems that we may be converging on using hotspot namespace? > > > > > > > > > Based on previous discussions I had been expecting to see a JEP on this after last US summer. I was surprised to see this PR pop up in this form. > > > > > > Ah, I see. Thanks for the clarification. I had an offline conversation with @iklam about the namespace and JEP topic during last August's JVM Language Summit. Based on the feedback from the conversation, it was not very clear if the namespace approach was broadly acceptable. > > Using a default namespace for everything could have bad effects on debugging and other maintenance tools. It's unlikely to be a low-cost option.
Perhaps it could be restricted to the static linking case, but that further complicates testing. Thanks. Agreed to both points. It seems to add too much complexity if the namespace usage is restricted to the static linking case only. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1901043889 From jiangli at openjdk.org Fri Jan 19 20:24:26 2024 From: jiangli at openjdk.org (Jiangli Zhou) Date: Fri, 19 Jan 2024 20:24:26 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 14:00:58 GMT, Coleen Phillimore wrote: > You could support one build by adding something like -DSUPPORTS_STATIC_LINK for both .so and .a builds for Google, then use that to protect the renaming. Thanks, @coleenp. I think that could work for all different cases. I'll reflect that in this PR. For the longer term we probably still want to find a cleaner solution when the static support becomes more popular. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1901057917 From coleenp at openjdk.org Fri Jan 19 21:42:51 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Jan 2024 21:42:51 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v11] In-Reply-To: References: Message-ID: <5RSaSdm9QKkooJ7aglk2T6mAAAxPD66-PFe3A8FjNG4=.65640760-0984-49d5-9184-e1eac5a920bd@github.com> On Mon, 15 Jan 2024 07:55:32 GMT, Axel Boldt-Christmas wrote: >> Implements the runtime part of JDK-8319796. >> The different CPU implementations are/will be created as dependent pull requests. >> >> This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enters. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two topmost entries on the lock stack to determine what to do in a monitor exit.
>> >> A high level overview: >> * Locking is still performed on the mark word >> * Unlocked (0b01) <=> Locked (0b00) >> * Monitor enter on Obj with mark word Unlocked (0b01) is the same >> * Transition Obj's mark word Unlocked (0b01) => Locked (0b00) >> * Push Obj onto the lock stack >> * Success >> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack >> * If top entry is Obj >> * Push Obj on the lock stack >> * Success >> * If top entry is not Obj >> * Inflate and call ObjectMonitor::enter >> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack >> * If just the top entry is Obj >> * Transition Obj's mark word Locked (0b00) => Unlocked (0b01) >> * Pop the entry >> * Success >> * If both entries are Obj >> * Pop the top entry >> * Success >> * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit >> * If the monitor has been inflated for object Obj which is owned by the current thread >> * All corresponding entries for Obj is removed from the lock stack >> * The monitor recursions is set to the number of removed entries - 1 >> * The owner is changed from anonymous to the thread >> * The regular ObjectMonitor::action is called. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 32 commits: > > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319797 > - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 > - Avoid copy from and to the same location > - Fix typo > - Update unstructured unlock comment > - Fix bad indent after merge > - ... 
and 22 more: https://git.openjdk.org/jdk/compare/922f8e44...a4e372aa I only had a couple of minor comments but this makes sense. The code is clear about what it's doing. Thanks for writing the gtest. src/hotspot/share/runtime/lockStack.inline.hpp line 99: > 97: return false; > 98: } > 99: Maybe a more prominent comment about how you can only recursively enter a lock if the top is the same object? ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16606#pullrequestreview-1833767241 PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1459819912 From coleenp at openjdk.org Fri Jan 19 21:42:52 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Jan 2024 21:42:52 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v11] In-Reply-To: <17VTTf6CmPVd-QeuKsCukTHBOqdkkr2erX83G_azDmg=.76519524-1ec9-45f4-8abb-f6848d09aa6a@github.com> References: <17VTTf6CmPVd-QeuKsCukTHBOqdkkr2erX83G_azDmg=.76519524-1ec9-45f4-8abb-f6848d09aa6a@github.com> Message-ID: On Mon, 13 Nov 2023 09:52:46 GMT, Roman Kennke wrote: >> All lightweight enters must check if the lock stack is full. Both push and try_recursive_enter have that as a pre condition. All code paths emitted C2, emitted shared code and the runtime does the is full check first. >> >> The reason that quick_enter does this without checking the mark word (for monitor) is that we go into the runtime from the emitted code if the lock stack is full. So we want to enter the runtime to inflate and make room, to not get into scenario where we have to go into the runtime on every monitor enter because we are locking on new objects in a loop with a full lock stack. 
> > FWIW, when I did the original LW-locking implementation, and when the lock-stack was not yet fixed-size, I experimented with an optimisation for this problem: instead of doing the check for every monitorenter, I let C2 analyse the maximum lock-stack depth that the method is going to require, and do the check (and growing of the lock-stack) at method-entry. However, I haven't seen a case where this was beneficial, and there have been several problems with the approach as well (maybe that's why it wasn't beneficial). But maybe it is worth revisiting at some point? OTOH, with recursive locking we need to load and check the top-offset anyway, which makes the extra cost to check for overflow even smaller. Isn't checking that the lock stack is full a prerequisite for trying to recursively enter the lock? ie, we have to check first. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1459802032 From coleenp at openjdk.org Fri Jan 19 21:42:51 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Jan 2024 21:42:51 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v11] In-Reply-To: References: Message-ID: On Tue, 14 Nov 2023 07:46:56 GMT, David Holmes wrote: >> There is probably more nuance here w.r.t. `offsetof` than I know. >> My belief was that reason we did not use `offsetof` is because we use it on non standard layout types, for which is invalid. But the lock stack is a standard layout. >> >> However, reading some of issues surrounding `offsetof` (mainly poor compiler support and becoming conditionally supported in C++17) there might be more reasons to avoid it. If that is the case this property would have to be asserted at runtime instead. >> >> Maybe @kimbarrett has some more insight. > > To be clear I was querying the use of `std::is_standard_layout` here. https://en.cppreference.com/w/cpp/language/classes#Standard-layout_class TIL. Maybe this should be in our allowed list of features? 
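For context on the `std::is_standard_layout` and `offsetof` question, here is a small sketch of checking the property at compile time before relying on a field offset. `SketchLockStack`, `CAPACITY` and `kBaseOffset` are hypothetical and greatly simplified; this is not the layout of the real lock stack.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical, simplified stand-in for a fixed-capacity lock stack.
struct SketchLockStack {
  static constexpr int CAPACITY = 8;  // static members do not affect layout
  uint32_t _top;                      // offset of the next free slot
  void*    _base[CAPACITY];           // the entries themselves
};

// offsetof is only unconditionally supported for standard-layout types, so
// assert the property at compile time rather than assuming it.
static_assert(std::is_standard_layout<SketchLockStack>::value,
              "offsetof on this type relies on standard layout");

// With that established, the offset can be used as a compile-time constant,
// for example by code that emits loads relative to the start of the object.
constexpr std::size_t kBaseOffset = offsetof(SketchLockStack, _base);
static_assert(kBaseOffset >= sizeof(uint32_t), "_base follows _top");
```

If a virtual function or a non-standard-layout member were added later, the first `static_assert` would fail at compile time instead of leaving a conditionally-supported `offsetof` in place.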
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1459799972 From coleenp at openjdk.org Fri Jan 19 21:42:53 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Jan 2024 21:42:53 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v6] In-Reply-To: References: <9qMIC_BQk5i5MmbQLovTmNsla_qMxlgCCZhyK8eHHSc=.c7d958c4-deaa-4264-a3f4-1907240d26d2@github.com> <25xmq9JFWlz_2L3NXWK8ghR5N2UlPE-uELjIZoyDvyg=.40c3316f-2580-4b2b-ad65-ec649e6a1c0a@github.com> Message-ID: On Thu, 23 Nov 2023 07:38:49 GMT, Axel Boldt-Christmas wrote: >> Update: I didn't see an explicit test case for overflowing the lock stack. >> Do you plan to add one? > > Sure. They are generally tricky, you usually need to tell the compiler not to do stuff like eliding locks, and inline methods. I will attempt to write tests both for -Xint and the compiler that tests both recursive and normal full lock stack. (The recursive tests requires the platform to implement recursive lightweight for it to effectively test anything) I wrote a little test case just now. Maybe deciding which object should cause inflation can be tuned with some benchmarking results if we find some degenerate case? Maybe the top lock should be inflated since that might be the one that causes the stack to overflow? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1459807878 From coleenp at openjdk.org Fri Jan 19 21:42:54 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Jan 2024 21:42:54 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v6] In-Reply-To: References: <9qMIC_BQk5i5MmbQLovTmNsla_qMxlgCCZhyK8eHHSc=.c7d958c4-deaa-4264-a3f4-1907240d26d2@github.com> <25xmq9JFWlz_2L3NXWK8ghR5N2UlPE-uELjIZoyDvyg=.40c3316f-2580-4b2b-ad65-ec649e6a1c0a@github.com> Message-ID: On Fri, 19 Jan 2024 21:30:35 GMT, Coleen Phillimore wrote: >> Sure. 
They are generally tricky, you usually need to tell the compiler not to do stuff like eliding locks, and inline methods. I will attempt to write tests both for -Xint and the compiler that tests both recursive and normal full lock stack. (The recursive tests requires the platform to implement recursive lightweight for it to effectively test anything) > > I wrote a little test case just now. Maybe deciding which object should cause inflation can be tuned with some benchmarking results if we find some degenerate case? Maybe the top lock should be inflated since that might be the one that causes the stack to overflow? In any case, maybe there should be a logging statement here for this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1459808523 From kvn at openjdk.org Fri Jan 19 22:09:45 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 19 Jan 2024 22:09:45 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization Message-ID: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> Added missing store-store barrier when we re-materialize scalar replaced object during deoptimization. I also removed redundant `#if COMPILER2_OR_JVMCI` guards which were leftover from [JDK-8312579](https://bugs.openjdk.org/browse/JDK-8312579) changes. It added Vector API support to Graal and changed `#ifdef COMPILER2` to these `#if`. But this code is already under these `ifs`. Tested tier1-3, scope, stress. No new regression test. I think it is "almost" impossible to hit this issue because there is a lot of VM's runtime code between the code which rematerialize scalar-replaced objects during deoptimization and a code in Interpreter which is executed after deoptimization and which may execute a store instruction that makes these objects accessible by other threads. 
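The ordering concern described above can be sketched in portable C++ as follows. This is only an analogy: standard C++ has no pure store-store fence, so the sketch uses a release fence, which is at least as strong, and the names `Point`, `g_published`, `publish` and `read_sum` are hypothetical rather than anything in deoptimization.cpp.

```cpp
#include <atomic>
#include <cassert>

// Stand-in for a re-materialized object with two fields.
struct Point { int x; int y; };

// Stand-in for a location through which other threads can find the object.
std::atomic<Point*> g_published{nullptr};

// Initialize the object, then publish it. The fence keeps the initializing
// stores from being reordered after the publishing store.
void publish(Point* p, int x, int y) {
  assert(p != nullptr);
  p->x = x;                                             // initializing stores
  p->y = y;
  std::atomic_thread_fence(std::memory_order_release);  // barrier
  g_published.store(p, std::memory_order_relaxed);      // publishing store
}

// A reader that sees the pointer is then guaranteed to see the fields too.
// Returns -1 if nothing has been published yet.
int read_sum() {
  Point* p = g_published.load(std::memory_order_acquire);
  return p != nullptr ? p->x + p->y : -1;
}
```

Without the fence, a weakly ordered CPU could make the pointer visible before the field values, so another thread might observe a partially initialized object.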
------------- Commit messages: - 8324050: Issue store-store barrier after re-materializing objects during deoptimization Changes: https://git.openjdk.org/jdk/pull/17503/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17503&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324050 Stats: 11 lines in 1 file changed: 4 ins; 6 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17503.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17503/head:pull/17503 PR: https://git.openjdk.org/jdk/pull/17503 From dlong at openjdk.org Fri Jan 19 23:05:39 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 19 Jan 2024 23:05:39 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization In-Reply-To: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> Message-ID: On Fri, 19 Jan 2024 22:00:15 GMT, Vladimir Kozlov wrote: > Added missing store-store barrier when we re-materialize scalar replaced object during deoptimization. > > I also removed redundant `#if COMPILER2_OR_JVMCI` guards which were leftover from [JDK-8312579](https://bugs.openjdk.org/browse/JDK-8312579) changes. It added Vector API support to Graal and changed `#ifdef COMPILER2` to these `#if`. But this code is already under these `ifs`. > > Tested tier1-3, scope, stress. > > No new regression test. I think it is "almost" impossible to hit this issue because there is a lot of VM's runtime code between the code which rematerialize scalar-replaced objects during deoptimization and a code in Interpreter which is executed after deoptimization and which may execute a store instruction that makes these objects accessible by other threads. 
src/hotspot/share/runtime/deoptimization.cpp line 1603: > 1601: // We need barrier so that stores that initialize these objects can't be reordered > 1602: // with subsequent stores that make these objects accessible by other threads. > 1603: OrderAccess::storestore(); This seems like the right place for normal deoptimization, but I'm worried that EscapeBarrier::deoptimize_objects() makes these objects visible to JVMTI without calling reassign_fields(). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17503#discussion_r1459954889 From kvn at openjdk.org Sat Jan 20 00:29:29 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 20 Jan 2024 00:29:29 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization In-Reply-To: References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> Message-ID: On Fri, 19 Jan 2024 23:02:31 GMT, Dean Long wrote: >> Added missing store-store barrier when we re-materialize scalar replaced object during deoptimization. >> >> I also removed redundant `#if COMPILER2_OR_JVMCI` guards which were leftover from [JDK-8312579](https://bugs.openjdk.org/browse/JDK-8312579) changes. It added Vector API support to Graal and changed `#ifdef COMPILER2` to these `#if`. But this code is already under these `ifs`. >> >> Tested tier1-3, scope, stress. >> >> No new regression test. I think it is "almost" impossible to hit this issue because there is a lot of VM's runtime code between the code which rematerialize scalar-replaced objects during deoptimization and a code in Interpreter which is executed after deoptimization and which may execute a store instruction that makes these objects accessible by other threads. 
> > src/hotspot/share/runtime/deoptimization.cpp line 1603: > >> 1601: // We need barrier so that stores that initialize these objects can't be reordered >> 1602: // with subsequent stores that make these objects accessible by other threads. >> 1603: OrderAccess::storestore(); > > This seems like the right place for normal deoptimization, but I'm worried that EscapeBarrier::deoptimize_objects() makes these objects visible to JVMTI without calling reassign_fields(). I see that `EscapeBarrier::deoptimize_objects_internal()` calls `Deoptimization::deoptimize_objects_internal()` which calls `rematerialize_objects()` which does reallocation and fields reassignment. It will execute this barrier. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17503#discussion_r1460077603 From dlong at openjdk.org Sat Jan 20 08:12:33 2024 From: dlong at openjdk.org (Dean Long) Date: Sat, 20 Jan 2024 08:12:33 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization In-Reply-To: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> Message-ID: On Fri, 19 Jan 2024 22:00:15 GMT, Vladimir Kozlov wrote: > Added missing store-store barrier when we re-materialize scalar replaced object during deoptimization. > > I also removed redundant `#if COMPILER2_OR_JVMCI` guards which were leftover from [JDK-8312579](https://bugs.openjdk.org/browse/JDK-8312579) changes. It added Vector API support to Graal and changed `#ifdef COMPILER2` to these `#if`. But this code is already under these `ifs`. > > Tested tier1-3, scope, stress. > > No new regression test. 
I think it is "almost" impossible to hit this issue because there is a lot of VM's runtime code between the code which rematerialize scalar-replaced objects during deoptimization and a code in Interpreter which is executed after deoptimization and which may execute a store instruction that makes these objects accessible by other threads. Marked as reviewed by dlong (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17503#pullrequestreview-1834642502 From dlong at openjdk.org Sat Jan 20 08:12:35 2024 From: dlong at openjdk.org (Dean Long) Date: Sat, 20 Jan 2024 08:12:35 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization In-Reply-To: References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> Message-ID: On Sat, 20 Jan 2024 00:26:17 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/runtime/deoptimization.cpp line 1603: >> >>> 1601: // We need barrier so that stores that initialize these objects can't be reordered >>> 1602: // with subsequent stores that make these objects accessible by other threads. >>> 1603: OrderAccess::storestore(); >> >> This seems like the right place for normal deoptimization, but I'm worried that EscapeBarrier::deoptimize_objects() makes these objects visible to JVMTI without calling reassign_fields(). > > I see that `EscapeBarrier::deoptimize_objects_internal()` calls `Deoptimization::deoptimize_objects_internal()` which calls `rematerialize_objects()` which does reallocation and fields reassignment. It will execute this barrier. Thanks, I missed that. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17503#discussion_r1460290984 From aph at openjdk.org Sat Jan 20 18:02:26 2024 From: aph at openjdk.org (Andrew Haley) Date: Sat, 20 Jan 2024 18:02:26 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: <_v-SmPJKuKfADRbcVs9bxD-oVlXlMY6bi1lsapciOhQ=.ee0d368c-0d55-4c43-99fc-565577714792@github.com> On Fri, 19 Jan 2024 20:21:21 GMT, Jiangli Zhou wrote: > > You could support one build by adding something like -DSUPPORTS_STATIC_LINK for both .so and .a builds for Google, then use that to protect the renaming. > > Thanks, @coleenp. I think that could work for all different cases. I'll reflect that in this PR. > > For longer term we probably still want to find a cleaner solution when the static support becomes more popular. I think you should be able to use ld and objcopy to merge the .o files and hide all of the symbols you don't want to export. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1902207127 From kbarrett at openjdk.org Sat Jan 20 18:10:35 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 20 Jan 2024 18:10:35 GMT Subject: RFR: 8324240: Remove unused GrowableArrayView::EMPTY Message-ID: Please review this trivial change to remove an unused variable. The variable was introduced by [JDK-8254231](https://bugs.openjdk.org/browse/JDK-8254231). All uses were removed by [JDK-8283689](https://bugs.openjdk.org/browse/JDK-8283689). 
Testing: mach5 tier1 ------------- Commit messages: - remove unused variable Changes: https://git.openjdk.org/jdk/pull/17507/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17507&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324240 Stats: 5 lines in 1 file changed: 0 ins; 5 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17507.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17507/head:pull/17507 PR: https://git.openjdk.org/jdk/pull/17507 From qamai at openjdk.org Sat Jan 20 18:42:26 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Sat, 20 Jan 2024 18:42:26 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization In-Reply-To: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> Message-ID: On Fri, 19 Jan 2024 22:00:15 GMT, Vladimir Kozlov wrote: > Added missing store-store barrier when we re-materialize scalar replaced object during deoptimization. > > I also removed redundant `#if COMPILER2_OR_JVMCI` guards which were leftover from [JDK-8312579](https://bugs.openjdk.org/browse/JDK-8312579) changes. It added Vector API support to Graal and changed `#ifdef COMPILER2` to these `#if`. But this code is already under these `ifs`. > > Tested tier1-3, scope, stress. > > No new regression test. I think it is "almost" impossible to hit this issue because there is a lot of VM's runtime code between the code which rematerialize scalar-replaced objects during deoptimization and a code in Interpreter which is executed after deoptimization and which may execute a store instruction that makes these objects accessible by other threads. This seems similar to [a recent discussion](https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071521.html). There, it is decided that a release barrier would be safer. Should we do it similarly here? 
Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17503#issuecomment-1902236204 From dcubed at openjdk.org Sat Jan 20 19:13:25 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Sat, 20 Jan 2024 19:13:25 GMT Subject: RFR: 8324240: Remove unused GrowableArrayView::EMPTY In-Reply-To: References: Message-ID: On Sat, 20 Jan 2024 18:05:42 GMT, Kim Barrett wrote: > Please review this trivial change to remove an unused variable. > > The variable was introduced by [JDK-8254231](https://bugs.openjdk.org/browse/JDK-8254231). All uses were removed by [JDK-8283689](https://bugs.openjdk.org/browse/JDK-8283689). > > Testing: mach5 tier1 Thumbs up. I agree this is a trivial fix. Thanks for documenting your Tier1 testing. ------------- Marked as reviewed by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17507#pullrequestreview-1834720296 From kbarrett at openjdk.org Sun Jan 21 02:31:34 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 21 Jan 2024 02:31:34 GMT Subject: Integrated: 8324240: Remove unused GrowableArrayView::EMPTY In-Reply-To: References: Message-ID: On Sat, 20 Jan 2024 18:05:42 GMT, Kim Barrett wrote: > Please review this trivial change to remove an unused variable. > > The variable was introduced by [JDK-8254231](https://bugs.openjdk.org/browse/JDK-8254231). All uses were removed by [JDK-8283689](https://bugs.openjdk.org/browse/JDK-8283689). > > Testing: mach5 tier1 This pull request has now been integrated. 
Changeset: a474b372 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/a474b37212da5edbd5868c9157aff90aae00ca50 Stats: 5 lines in 1 file changed: 0 ins; 5 del; 0 mod 8324240: Remove unused GrowableArrayView::EMPTY Reviewed-by: dcubed ------------- PR: https://git.openjdk.org/jdk/pull/17507 From kbarrett at openjdk.org Sun Jan 21 02:31:33 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 21 Jan 2024 02:31:33 GMT Subject: RFR: 8324240: Remove unused GrowableArrayView::EMPTY In-Reply-To: References: Message-ID: On Sat, 20 Jan 2024 19:11:07 GMT, Daniel D. Daugherty wrote: >> Please review this trivial change to remove an unused variable. >> >> The variable was introduced by [JDK-8254231](https://bugs.openjdk.org/browse/JDK-8254231). All uses were removed by [JDK-8283689](https://bugs.openjdk.org/browse/JDK-8283689). >> >> Testing: mach5 tier1 > > Thumbs up. I agree this is a trivial fix. > > Thanks for documenting your Tier1 testing. Thanks for review, @dcubed-ojdk . ------------- PR Comment: https://git.openjdk.org/jdk/pull/17507#issuecomment-1902481847 From jwaters at openjdk.org Sun Jan 21 07:19:18 2024 From: jwaters at openjdk.org (Julian Waters) Date: Sun, 21 Jan 2024 07:19:18 GMT Subject: RFR: 8316930: HotSpot should use noexcept instead of throw() [v4] In-Reply-To: References: Message-ID: > throw() has been deprecated since C++11 alongside dynamic exception specifications, we should replace all instances of it with noexcept to prepare HotSpot for later versions of C++ Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: - Merge branch 'openjdk:master' into noexcept - Typo in GensrcAdlc.gmk - Merge branch 'openjdk:master' into noexcept - Merge branch 'master' into noexcept - ic in compiledIC.hpp - Revert compiledIC.cpp - Revert compiledIC.hpp - Partially Revert parse.hpp - Merge branch 'master' into noexcept - Merge branch 'master' into noexcept - ... 
and 3 more: https://git.openjdk.org/jdk/compare/a474b372...0d2fe966 ------------- Changes: https://git.openjdk.org/jdk/pull/15910/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15910&range=03 Stats: 88 lines in 38 files changed: 0 ins; 0 del; 88 mod Patch: https://git.openjdk.org/jdk/pull/15910.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15910/head:pull/15910 PR: https://git.openjdk.org/jdk/pull/15910 From kbarrett at openjdk.org Sun Jan 21 07:35:38 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 21 Jan 2024 07:35:38 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() Message-ID: Please review this change to use OopHandle::is_empty() rather than comparing the result of OopHandle::ptr_raw() with nullptr. While equivalent, the former is the intended API for such checks. ptr_raw should only be used directly where it is actually needed. Testing: mach5 tier1. ------------- Commit messages: - prefer is_empty Changes: https://git.openjdk.org/jdk/pull/17510/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17510&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324242 Stats: 8 lines in 3 files changed: 0 ins; 1 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/17510.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17510/head:pull/17510 PR: https://git.openjdk.org/jdk/pull/17510 From jwaters at openjdk.org Sun Jan 21 12:18:40 2024 From: jwaters at openjdk.org (Julian Waters) Date: Sun, 21 Jan 2024 12:18:40 GMT Subject: RFR: 8309044: Replace NULL with nullptr, final sweep of hotspot code [v2] In-Reply-To: References: <3FoMnGeBp8DqkpVb6YGXKxdPsgGz6ej-jrf2U2stVfU=.56a11e19-38dd-420a-a07d-3b025120f194@github.com> Message-ID: On Tue, 30 May 2023 19:15:38 GMT, Johan Sjölen wrote: >> A final sweep of Hotspot to remove all re-added NULLs. With only 110 changes I'd appreciate if this was considered trivial.
> > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - Align > - Suggestions I'm an entire year late, but if poisoning NULL is desired, what about #pragma GCC poison? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14198#issuecomment-1902610611 From jwaters at openjdk.org Sun Jan 21 12:21:46 2024 From: jwaters at openjdk.org (Julian Waters) Date: Sun, 21 Jan 2024 12:21:46 GMT Subject: RFR: 8309044: Replace NULL with nullptr, final sweep of hotspot code [v2] In-Reply-To: References: <3FoMnGeBp8DqkpVb6YGXKxdPsgGz6ej-jrf2U2stVfU=.56a11e19-38dd-420a-a07d-3b025120f194@github.com> Message-ID: On Tue, 30 May 2023 19:15:38 GMT, Johan Sjölen wrote: >> A final sweep of Hotspot to remove all re-added NULLs. With only 110 changes I'd appreciate if this was considered trivial. > > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - Align > - Suggestions For Visual C++, that would be #pragma deprecated("NULL") To quote Microsoft: "You can deprecate macro names. Place the macro name in quotes or else macro expansion will occur." I have no idea how to achieve this with the xlc compiler ------------- PR Comment: https://git.openjdk.org/jdk/pull/14198#issuecomment-1902611263 From duke at openjdk.org Mon Jan 22 02:03:41 2024 From: duke at openjdk.org (kuaiwei) Date: Mon, 22 Jan 2024 02:03:41 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier Message-ID: Details are at https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071921.html. Using a combined dmb.ish for release barrier will introduce a heavy storeload barrier. Use "dmb.ishst+dmb.ishld" pair instead, we can gain performance improvement on N1 and N2 architecture. The benchmark is test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java Run with ParallelGC to minimize impact of gc barrier.
make test TEST="micro:org.openjdk.bench.vm.compiler.FinalFieldInitialize" MICRO="VM_OPTIONS=-XX:+UseParallelGC" ... FinalFieldInitialize.testAllocWithFinal thrpt 9 1411.601 ± 6.546 ops/s Without the patch FinalFieldInitialize.testAllocWithFinal thrpt 9 1214.575 ± 14.217 ops/s ------------- Commit messages: - 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier Changes: https://git.openjdk.org/jdk/pull/17511/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17511&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324186 Stats: 102 lines in 3 files changed: 92 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/17511.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17511/head:pull/17511 PR: https://git.openjdk.org/jdk/pull/17511 From duke at openjdk.org Mon Jan 22 03:07:35 2024 From: duke at openjdk.org (Liming Liu) Date: Mon, 22 Jan 2024 03:07:35 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v28] In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 09:38:56 GMT, Johan Sjölen wrote: >> Liming Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Untabify > > src/hotspot/os/linux/os_linux.cpp line 4403: > >> 4401: // Check the availability of MADV_POPULATE_WRITE. >> 4402: UseMadvPopulateWrite = (::madvise(0, 0, MADV_POPULATE_WRITE) == 0); >> 4403: > > What happens if the user sets `UseMadvPopulateWrite` to false when starting the JVM? It should not be used then. It is before Arguments::parse, so I would use FLAG_SET_DEFAULT here.
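The ordering concern in this exchange can be illustrated with a minimal sketch. It assumes a toy flag model: `Flag`, `Origin`, `set_default`, `set_from_command_line`, and `startup` are hypothetical names made up for illustration and are not HotSpot's actual `FLAG_SET_DEFAULT` machinery. The idea shown is only that a probe which runs before argument parsing should record its result as a default, so that an explicit user setting parsed later still wins.

```cpp
// Hypothetical stand-in for a VM flag that remembers who set it last.
enum class Origin { Default, CommandLine };

struct Flag {
  bool value = false;
  Origin origin = Origin::Default;

  // Models a probe-time update: changes the value but keeps the origin
  // as Default, so a later explicit user setting can still override it.
  void set_default(bool v) {
    value = v;
    origin = Origin::Default;
  }

  // Models argument parsing: an explicit user choice always wins.
  void set_from_command_line(bool v) {
    value = v;
    origin = Origin::CommandLine;
  }
};

// Runs the probe first and argument parsing second, mirroring the order
// described in the thread. `user_value` is null when the user passed nothing.
inline Flag startup(bool probe_supported, const bool* user_value) {
  Flag f;
  f.set_default(probe_supported);
  if (user_value != nullptr) {
    f.set_from_command_line(*user_value);
  }
  return f;
}
```

With this model, `startup(true, nullptr)` leaves the flag enabled with a default origin, while passing a pointer to `false` disables it even though the probe succeeded.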
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1461306428 From fyang at openjdk.org Mon Jan 22 04:02:28 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 22 Jan 2024 04:02:28 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 01:58:32 GMT, kuaiwei wrote: > Details are at https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071921.html. > Using a combined dmb.ish for release barrier will introduce a heavy storeload barrier. Use "dmb.ishst+dmb.ishld" pair instead, we can gain performance improvement on N1 and N2 architecture. The benchmark is test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java > Run with ParallelGC to minimize impact of gc barrier. > > make test TEST="micro:org.openjdk.bench.vm.compiler.FinalFieldInitialize" MICRO="VM_OPTIONS=-XX:+UseParallelGC" > ... > FinalFieldInitialize.testAllocWithFinal thrpt 9 1411.601 ± 6.546 ops/s > > Without the patch > > FinalFieldInitialize.testAllocWithFinal thrpt 9 1214.575 ± 14.217 ops/s I wonder if this was tested on other vendors' hardware? I witnessed some negative impact at least on HiSilicon TSV110 running the same JMH. So I guess it might be safer to go as a vendor-specific change. Before: Benchmark Mode Cnt Score Error Units FinalFieldInitialize.testAllocWithFinal thrpt 9 840.267 ± 69.505 ops/s After: Benchmark Mode Cnt Score Error Units FinalFieldInitialize.testAllocWithFinal thrpt 9 732.791 ±
47.198 ops/s ------------- PR Review: https://git.openjdk.org/jdk/pull/17511#pullrequestreview-1835605044 From mdoerr at openjdk.org Mon Jan 22 05:04:28 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 22 Jan 2024 05:04:28 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 06:25:20 GMT, Erik Österlund wrote: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests. Test results are good on all SAP supported platforms (CPU: x86_64, aarch64, PPC64; OS: linux, Windows, AIX). Performance also looks good.
------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1903259665 From duke at openjdk.org Mon Jan 22 06:39:52 2024 From: duke at openjdk.org (Liming Liu) Date: Mon, 22 Jan 2024 06:39:52 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v29] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported:
>
> Kernel   -XX:-TransparentHugePages   -XX:+TransparentHugePages
>          Unpatched   Patched         Unpatched   Patched
> 4.18     11.30       11.30           0.25        0.25
> 5.13     0.22        0.22            3.42        3.42
> 6.1      0.27        0.33            3.54        0.33
>
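The mechanism described in this message can be sketched roughly as follows. This is a minimal illustration and not the code from the pull request: `populate_write_supported` and `pretouch` are hypothetical helper names, the fallback value 23 for `MADV_POPULATE_WRITE` is the Linux UAPI constant assumed here for older headers, and the per-page fallback uses the GCC/Clang `__atomic_fetch_add` builtin to stand in for the "atomic-add-0" touch named in the subject line.

```cpp
#include <sys/mman.h>
#include <cstddef>

// Older libc headers may not define the constant; 23 is assumed here to be
// the Linux UAPI value for MADV_POPULATE_WRITE.
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

// Probe in the style quoted in the review thread: a zero-length request is
// expected to succeed only when the kernel recognizes the advice value.
inline bool populate_write_supported() {
  return ::madvise(nullptr, 0, MADV_POPULATE_WRITE) == 0;
}

// Hypothetical helper: prefault [start, start + len) for writing.
// Returns true if the madvise path was used, false if it fell back to
// touching one word per page.
inline bool pretouch(void* start, size_t len, size_t page_size) {
  if (::madvise(start, len, MADV_POPULATE_WRITE) == 0) {
    return true;
  }
  char* base = static_cast<char*>(start);
  for (size_t off = 0; off < len; off += page_size) {
    // Atomic add of 0: forces a write fault on the page without changing
    // its contents.
    __atomic_fetch_add(reinterpret_cast<int*>(base + off), 0, __ATOMIC_RELAXED);
  }
  return false;
}
```

Either path leaves the memory contents unchanged; which path runs depends on the kernel, so callers should not rely on the return value being the same across machines.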
Liming Liu has updated the pull request incrementally with two additional commits since the last revision: - Use TestThreadGroup - Set it as default before parsing ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/af5c5dc5..55946581 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=28 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=27-28 Stats: 51 lines in 2 files changed: 1 ins; 31 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From jiangli at openjdk.org Mon Jan 22 06:47:26 2024 From: jiangli at openjdk.org (Jiangli Zhou) Date: Mon, 22 Jan 2024 06:47:26 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: <_v-SmPJKuKfADRbcVs9bxD-oVlXlMY6bi1lsapciOhQ=.ee0d368c-0d55-4c43-99fc-565577714792@github.com> References: <_v-SmPJKuKfADRbcVs9bxD-oVlXlMY6bi1lsapciOhQ=.ee0d368c-0d55-4c43-99fc-565577714792@github.com> Message-ID: On Sat, 20 Jan 2024 18:00:15 GMT, Andrew Haley wrote: > > > You could support one build by adding something like -DSUPPORTS_STATIC_LINK for both .so and .a builds for Google, then use that to protect the renaming. > > > > > > Thanks, @coleenp. I think that could work for all different cases. I'll reflect that in this PR. > > For longer term we probably still want to find a cleaner solution when the static support becomes more popular. > > I think you should be able to use ld and objcopy to merge the .o files and hide all of the symbols you don't want to export. We also discussed about `objcopy` in https://github.com/openjdk/jdk/pull/14808#issuecomment-1631597197 and https://github.com/openjdk/jdk/pull/14808#issuecomment-1631611220. My main concern was the portability of `objcopy` approach. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1903359606 From jbhateja at openjdk.org Mon Jan 22 07:11:30 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 22 Jan 2024 07:11:30 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 23:06:32 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark Score Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x >> StringIndexOf.success 9.186 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x >> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Merge branch 'openjdk:master' into indexof > - Merge branch 'openjdk:master' into indexof > - Addressing review comments. 
> - Fix for JDK-8321599 > - Support UU IndexOf > - Only use optimization when EnableX86ECoreOpts is true > - Fix whitespace > - Merge branch 'openjdk:master' into indexof > - Comments; added exhaustive-ish test > - Subtracting 0x10 twice. > - ... and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2 src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 505: > 503: __ cmpb(Address(rbx, r15, Address::times_1, -0xa), rax); > 504: __ jne(L_top_loop_1); > 505: __ jmp(L_0x406019); Instead of having special handling for each tail size (3 - 31 bytes), can we directly use 32 bytes VMASKMOVPS with appropriate mask for different tail sizes and only residual part (0 - 3 bytes) can fall over to scalar tail. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1461424231 From jbhateja at openjdk.org Mon Jan 22 07:11:31 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 22 Jan 2024 07:11:31 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7] In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 07:05:56 GMT, Jatin Bhateja wrote: >> Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: >> >> - Merge branch 'openjdk:master' into indexof >> - Merge branch 'openjdk:master' into indexof >> - Addressing review comments. >> - Fix for JDK-8321599 >> - Support UU IndexOf >> - Only use optimization when EnableX86ECoreOpts is true >> - Fix whitespace >> - Merge branch 'openjdk:master' into indexof >> - Comments; added exhaustive-ish test >> - Subtracting 0x10 twice. >> - ... 
and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2 > > src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 505: > >> 503: __ cmpb(Address(rbx, r15, Address::times_1, -0xa), rax); >> 504: __ jne(L_top_loop_1); >> 505: __ jmp(L_0x406019); > > Instead of having special handling for each tail size (3 - 31 bytes), can we directly use 32 bytes VMASKMOVPS with appropriate mask for different tail sizes and only residual part (0 - 3 bytes) can fall over to scalar tail. Basically tail size can be rounded to nearest multiple of doubleword. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1461425962 From stuefe at openjdk.org Mon Jan 22 07:14:25 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 22 Jan 2024 07:14:25 GMT Subject: RFR: JDK-8322475: Extend printing for System.map In-Reply-To: References: Message-ID: <_eB1CnceF9tAr5UNL85Nm29RMAauKricxnbrPb9r9Q0=.38a3970b-bdb5-4298-9361-9d184f99105d@github.com> On Tue, 19 Dec 2023 15:48:58 GMT, Thomas Stuefe wrote: > This is an expansion on the new `System.map` command introduced with JDK-8318636. > > We now print valuable information per memory region, such as: > > - the actual resident set size > - the actual number of huge pages > - the actual used page size > - the THP state of the region (was advised, is eligible, uses THP, ...) > - whether the region is shared > - whether the region had been committed (backed by swap) > - whether the region has been swapped out. 
> > Example output: > > > from to size rss hugetlb pgsz prot notes vm info/file > 0x00000000c0000000 - 0x00000000ffe00000 1071644672 0 4194304 2M rw-p huge JAVAHEAP /anon_hugepage > 0x00000000ffe00000 - 0x0000000100000000 2097152 0 0 2M rw-p huge JAVAHEAP /anon_hugepage > 0x0000558016b67000 - 0x0000558016b68000 4096 4096 0 4K r--p /shared/projects/openjdk/jdk-jdk/output-fastdebug/images/jdk/bin/java > 0x0000558016b68000 - 0x0000558016b69000 4096 4096 0 4K r-xp /shared/projects/openjdk/jdk-jdk/output-fastdebug/images/jdk/bin/java > 0x00007f3a749f2000 - 0x00007f3a74c62000 2555904 2555904 0 4K rwxp CODE(CodeHeap 'profiled nmethods') > 0x00007f3a74c62000 - 0x00007f3a7be51000 119468032 0 0 4K ---p nores CODE(CodeHeap 'profiled nmethods') > 0x00007f3a7be51000 - 0x00007f3a7c1c1000 3604480 3604480 0 4K rwxp CODE(CodeHeap 'profiled nmethods') > 0x00007f3a7c1c1000 - 0x00007f3a7c592000 4001792 0 0 4K ---p nores CODE(CodeHeap 'non-nmethods') > 0x00007f3a7c592000 - 0x00007f3a7c802000 2555904 2555904 0 4K rwxp CODE(CodeHeap 'non-profiled nmethods') > 0x00007f3a7c802000 - 0x00007f3a839f200... Keep open, bot ------------- PR Comment: https://git.openjdk.org/jdk/pull/17158#issuecomment-1903386148 From stuefe at openjdk.org Mon Jan 22 07:24:28 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 22 Jan 2024 07:24:28 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v3] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 08:13:55 GMT, Thomas Stuefe wrote: >> We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes >> a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. >> >> --- >> >> Motivation: >> >> The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. 
>> >> One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. >> >> Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. >> >> Letting the JVM handle this Limit has many advantages: >> >> - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. >> >> - Re-using the normal error reporting mechanism is powerful since: >> - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. >> - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. >> - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. >> >> ---- >> >> Usage: >> >> Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. >> `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. >> >> If given as percent, JVM will also react to container limit updates. >> >> Example: we run the JVM inside a container as the sole payload process. 
Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: >> >> `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` >> >> ---- >> >> Patch: >> >> Im... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Add specific percentage switch not yet ------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-1903396425 From rehn at openjdk.org Mon Jan 22 07:40:28 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 22 Jan 2024 07:40:28 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 06:25:20 GMT, Erik Österlund wrote: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests.
I believe this was the last major piece in task we started over 5y ago of removing runtime safepoint and latencies. Or as some might say, we finally have a runtime good enough to run ZGC ;) (and Shenandoah). Thank you @fisk for completing this milestone! Risc-v passes my testing. (vf2 (t1) + qemu (t1-2), ran twice, once on v3 branch and once this pr) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1903414662 From yyang at openjdk.org Mon Jan 22 08:28:33 2024 From: yyang at openjdk.org (Yi Yang) Date: Mon, 22 Jan 2024 08:28:33 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v3] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 08:13:55 GMT, Thomas Stuefe wrote: >> We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes >> a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. >> >> --- >> >> Motivation: >> >> The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. >> >> One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. >> >> Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. >> >> Letting the JVM handle this Limit has many advantages: >> >> - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. 
I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. >> >> - Re-using the normal error reporting mechanism is powerful since: >> - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. >> - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. >> - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. >> >> ---- >> >> Usage: >> >> Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. >> `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. >> >> If given as percent, JVM will also react to container limit updates. >> >> Example: we run the JVM inside a container as the sole payload process. Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: >> >> `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` >> >> ---- >> >> Patch: >> >> Im... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Add specific percentage switch Great work ? I've seen too many stories where JVM/JNI memory leaks occur and are almost tracelessly terminated by Linux's OOM killer. 
Having the VM recognize and log this termination process can leave us with many clues for troubleshooting ------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-1903478539 From shade at openjdk.org Mon Jan 22 08:33:27 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 22 Jan 2024 08:33:27 GMT Subject: RFR: 8323717: Introduce test keyword for tests that need external dependencies In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 10:48:23 GMT, Aleksey Shipilev wrote: > Some jtreg tests require resolvable external dependencies. This resolution is delegated to JIB, which is not used in vanilla OpenJDK testing. It would be convenient to add a keyword that marks tests that require these external dependencies, so that we could exclude those tests from runs. This would allow us to: a) run all tests in hotspot:tier4, which now excludes `applications/` specifically; b) make all tests runs (#17422) cleaner on many environments. > > I provisionally call this flag `external-dep`, but I am open for other suggestions. > > Note that some tests that pull `@Artifact`-s provide special paths that do limited testing anyway. However, there are tests which cannot run without external dependencies at all. These include at least `applications/jcstress` and `applications/scimark` tests. > > Ironically, I cannot run the jcstress test generator because the dependencies are lacking here. I regenerated those test using a self-built jcstress 0.16 bundle. > > Additional testing: > - [x] `make test TEST=applications/` fails > - [x] `JTREG_KEYWORDS=!external-dep make test TEST=applications/` passes, skipping most of the tests Any takers? Maybe the audience should include core-libs too. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17421#issuecomment-1903486053 From kbarrett at openjdk.org Mon Jan 22 09:02:41 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 22 Jan 2024 09:02:41 GMT Subject: RFR: 8309044: Replace NULL with nullptr, final sweep of hotspot code [v2] In-Reply-To: References: <3FoMnGeBp8DqkpVb6YGXKxdPsgGz6ej-jrf2U2stVfU=.56a11e19-38dd-420a-a07d-3b025120f194@github.com> Message-ID: On Tue, 30 May 2023 19:15:38 GMT, Johan Sjölen wrote: >> A final sweep of Hotspot to remove all re-added NULLs. With only 110 changes I'd appreciate if this was considered trivial. > > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - Align > - Suggestions I don't think `#pragma GCC poison` works for us. It would complain about a system or library header that uses NULL and is included after the pragma. MSVC's deprecation pragma might work for this, at least for shared and Windows-specific code. I couldn't find a way to use it for FORBID_C_FUNCTION, but the problems I encountered for that don't seem applicable in this case. However, there are still a lot of NULL's left. All of the per-cpu .ad files, and the jvmtiXXX.xsl files contain NULL's that will appear in the associated generated code. Also, NULL usage in gtests doesn't seem to have been addressed yet.
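For readers unfamiliar with the pragma under discussion, here is a small sketch of how `#pragma GCC poison` behaves, as far as the GCC preprocessor documentation describes it. The names `legacy_answer`, `ANSWER_VIA_MACRO`, and `answer` are hypothetical and chosen only for illustration; the sketch poisons an ordinary function name rather than `NULL`, and it assumes GCC or Clang.

```cpp
// Hypothetical legacy function that new code should no longer call directly.
inline int legacy_answer() { return 42; }

// Defined BEFORE the poison: per the GCC documentation, a macro defined
// earlier may still expand to the poisoned identifier without an error.
#define ANSWER_VIA_MACRO() legacy_answer()

#pragma GCC poison legacy_answer

inline int answer() {
  // Writing the poisoned name directly at this point would be a hard error,
  // and the same applies to any header included after the pragma that
  // spells the name out itself.
  return ANSWER_VIA_MACRO();  // accepted: the macro predates the poison
}
```

This matches the concern raised in the message above: the poison applies to every direct use that appears after the pragma, including uses inside headers included later, which is difficult to rule out for an identifier as common as `NULL`.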
But it does look like there's been a bit of backsliding: https://bugs.openjdk.org/browse/JDK-8324286 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14198#issuecomment-1903531104 From jwaters at openjdk.org Mon Jan 22 09:28:46 2024 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 22 Jan 2024 09:28:46 GMT Subject: RFR: 8309044: Replace NULL with nullptr, final sweep of hotspot code [v2] In-Reply-To: References: <3FoMnGeBp8DqkpVb6YGXKxdPsgGz6ej-jrf2U2stVfU=.56a11e19-38dd-420a-a07d-3b025120f194@github.com> Message-ID: <3ok9vLTR6ElUchHl9NGXoJgz_EGG9Fln3cE4BgPWGrw=.f1c53a24-9835-4e7a-a62b-6fffdfd1e05f@github.com> On Mon, 22 Jan 2024 08:59:30 GMT, Kim Barrett wrote: > I don't think `#pragma GCC poison` works for us. It would complain about a > system or library header that uses NULL and is included after the pragma. I see, that's a shame in that case > However, there are still a lot of NULL's left. All of the per-cpu .ad files, and the jvmtiXXX.xsl files contain NULL's that will appear in the associated generated code. Also, NULL usage in gtests doesn't seem to have been addressed yet. > > But it does look like there's been a bit of backsliding: https://bugs.openjdk.org/browse/JDK-8324286 That's a little worrying, maybe the Style Guide's NULL section could be reworded since now most usages of NULL are nullptr? The .ad files and .xsl files are a bit of a problem though > MSVC's deprecation pragma might work for this, at least for shared and > Windows-specific code. I couldn't find a way to use it for FORBID_C_FUNCTION, > but the problems I encountered for that don't seem applicable in this case. The deprecation pragma only works for macros and identifiers, it doesn't accept method signatures and would warn for every time a identifier is used, even in the method's declaration itself! 
Probably can't be used in FORBID_C_FUNCTION as mentioned above, but sounds good for a macro like NULL I've also been trying to implement FORBID_C_FUNCTION and ALLOW_C_FUNCTION portably, speaking of it, but it hasn't been going great so far :/ https://github.com/openjdk/jdk/pull/17387 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14198#issuecomment-1903579473 From jsjolen at openjdk.org Mon Jan 22 09:45:27 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 22 Jan 2024 09:45:27 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() In-Reply-To: References: Message-ID: On Sun, 21 Jan 2024 07:29:45 GMT, Kim Barrett wrote: > Please review this change to use OopHandle::is_empty() rather than comparing > the result of OopHandle::ptr_raw() with nullptr. While equivalent, the former > is the intended API for such checks. ptr_raw should only be used directly > where it is actually needed. > > Testing: mach5 tier1. LGTM and trivial. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17510#issuecomment-1903607437 From shade at openjdk.org Mon Jan 22 09:50:26 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 22 Jan 2024 09:50:26 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization In-Reply-To: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> Message-ID: <6P3xTF6Fklma9FNk0A3b8sqv6MGxmPEnoeCLZ_fAiwY=.84bce8bf-6a17-4950-bab7-cf0f3d0254f2@github.com> On Fri, 19 Jan 2024 22:00:15 GMT, Vladimir Kozlov wrote: > Added missing store-store barrier when we re-materialize scalar replaced object during deoptimization. > > I also removed redundant `#if COMPILER2_OR_JVMCI` guards which were leftover from [JDK-8312579](https://bugs.openjdk.org/browse/JDK-8312579) changes. 
It added Vector API support to Graal and changed `#ifdef COMPILER2` to these `#if`. But this code is already under these `ifs`. > > Tested tier1-3, scope, stress. > > No new regression test. I think it is "almost" impossible to hit this issue because there is a lot of VM runtime code between the code which rematerializes scalar-replaced objects during deoptimization and the code in the Interpreter which is executed after deoptimization and which may execute a store instruction that makes these objects accessible by other threads. It looks good, but let's not put unrelated changes together? I think the `COMPILER2_OR_JVMCI` should come in as a separate atomic change. This will, for example, allow cleanly backporting the `storestore` additions without looking back at whether the vector support enablement hunks make sense. ------------- PR Review: https://git.openjdk.org/jdk/pull/17503#pullrequestreview-1836050278 From ngasson at openjdk.org Mon Jan 22 10:03:29 2024 From: ngasson at openjdk.org (Nick Gasson) Date: Mon, 22 Jan 2024 10:03:29 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 01:58:32 GMT, kuaiwei wrote: > Details are at https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071921.html. > Using a combined dmb.ish for the release barrier introduces a heavy StoreLoad barrier. By using a "dmb.ishst+dmb.ishld" pair instead, we can gain a performance improvement on the N1 and N2 architectures. The benchmark is test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java > Run with ParallelGC to minimize the impact of GC barriers. > > make test TEST="micro:org.openjdk.bench.vm.compiler.FinalFieldInitialize" MICRO="VM_OPTIONS=-XX:+UseParallelGC" > ... > FinalFieldInitialize.testAllocWithFinal thrpt 9 1411.601 ± 6.546 ops/s > > Without the patch > > FinalFieldInitialize.testAllocWithFinal thrpt 9 1214.575 ±
14.217 ops/s Could we instead make the last store to a final field in a constructor an STLR and remove the release barrier? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17511#issuecomment-1903640215 From aph at openjdk.org Mon Jan 22 10:17:28 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 22 Jan 2024 10:17:28 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 01:58:32 GMT, kuaiwei wrote: > Details are at https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071921.html. > Using a combined dmb.ish for the release barrier introduces a heavy StoreLoad barrier. By using a "dmb.ishst+dmb.ishld" pair instead, we can gain a performance improvement on the N1 and N2 architectures. The benchmark is test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java > Run with ParallelGC to minimize the impact of GC barriers. > > make test TEST="micro:org.openjdk.bench.vm.compiler.FinalFieldInitialize" MICRO="VM_OPTIONS=-XX:+UseParallelGC" > ... > FinalFieldInitialize.testAllocWithFinal thrpt 9 1411.601 ± 6.546 ops/s > > Without the patch > > FinalFieldInitialize.testAllocWithFinal thrpt 9 1214.575 ± 14.217 ops/s Very nice. Thanks. src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 2069: > 2067: if (last != nullptr && nativeInstruction_at(last)->is_Membar() && prev == last) { > 2068: NativeMembar *bar = NativeMembar_at(prev); > 2069: // We need avoid promoting barrier to dmb.ish, Suggestion: // Don't promote DMB ST|DMB LD to DMB (a full barrier) because // doing so would introduce a StoreLoad which the caller did not // intend. I think that should be clear enough. ------------- Marked as reviewed by aph (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17511#pullrequestreview-1836113464 PR Review Comment: https://git.openjdk.org/jdk/pull/17511#discussion_r1461628025 From aph at openjdk.org Mon Jan 22 10:17:30 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 22 Jan 2024 10:17:30 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 10:00:25 GMT, Nick Gasson wrote: > Could we instead make the last store to a final field in a constructor an STLR and remove the release barrier? It's possible, but it would be more work. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17511#issuecomment-1903665260 From aph at openjdk.org Mon Jan 22 10:21:28 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 22 Jan 2024 10:21:28 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 03:59:33 GMT, Fei Yang wrote: > I wonder if this was tested on other vendors' hardware? I witnessed some negative impact at least on HiSilicon TSV110 running the same JMH. So I guess it might be safer to go as a vendor-specific change. That's interesting, but I guess it's only 14% on a microbenchmark which is intended to stress this as much as possible. Do we have any idea why this regresses? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17511#issuecomment-1903671810 From dholmes at openjdk.org Mon Jan 22 10:32:34 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 22 Jan 2024 10:32:34 GMT Subject: RFR: 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 07:46:13 GMT, Emanuel Peter wrote: >> src/hotspot/share/code/nmethod.hpp line 624: >> >>> 622: // print output in opt build for disassembler library >>> 623: void print_relocations() PRODUCT_RETURN; >>> 624: void print_pcs() { print_pcs_on(tty); } >> >> This is a very common pattern, so I was wondering why you got rid of it? > > There is simply no use of `print_pcs` any more, I changed everything to `print_pcs_on`. > So I thought I'd remove it, is that ok? > Plus: `print_pcs` used to be the virtual method, and `print_pcs_on` only a local method, that was called by this override. But now I'd rather use `print_pcs_on` with a parameter outputStream, and so I made that one the virtual method. It is kind of undoing a well established pattern/convention though. We could replace all `print_foo()` with `print_foo_on(tty)` and so render `print_foo()` unnecessary - but that defeats the whole point of the pattern. Can't comment on the virtual/non-virtual aspect: not sure what the right/best arrangement is for that. Anyway, not requesting any changes, just curious about the motivation. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17486#discussion_r1461655725 From dholmes at openjdk.org Mon Jan 22 10:36:34 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 22 Jan 2024 10:36:34 GMT Subject: RFR: 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 In-Reply-To: References: Message-ID: On Thu, 18 Jan 2024 15:17:27 GMT, Emanuel Peter wrote: > I'm removing some instances of `ttyLocker`. 
Instead of locking, I first put all the output on a `stringStream`, and then print this stream all at once, which is atomic. > > Removing the `ttyLocker` is nice, because it means we have less interference with other locking mechanisms, such as the `extra_data_lock` cases I have to introduce with [JDK-8306767](https://bugs.openjdk.org/browse/JDK-8306767). > > @coleenp wished that I do this separately, so I filed this RFE here. Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17486#pullrequestreview-1836152658 From dholmes at openjdk.org Mon Jan 22 10:36:35 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 22 Jan 2024 10:36:35 GMT Subject: RFR: 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 08:03:52 GMT, Emanuel Peter wrote: >> FTR the code should not be assuming that `st` is `tty`! > >> Is that also a concern in this method? > > I don't think so, since `BytecodeTracer::print_method_codes` has its own local instance of `BytecodePrinter`, whereas `BytecodeTracer::trace_interpreter` uses the global instance `_interpreter_printer`. > > Using the `ttyLocker` for mutual exclusion on `_interpreter_printer` seems a bit ugly, and the comment seems to suggest as much. We can fix that in the future, if we want. > >> FTR the code should not be assuming that st is tty! > > Why are you saying that? I guess, yes, the code was already suspicious, since `st` was not guaranteed to be `tty`. But taking the `ttyLocker` was kinda ok whether it was tty or not. Anyway, it is better if it is gone now. I was just pointing out that the code was "wrong" to assume `st` had to be the tty - though as you point out it is harmless even if not useful. Thanks for clarifying that we don't need the ttyLocker for mutual exclusion here. 
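The buffering approach described in the quoted PR text (collect the output first, then emit it in one go instead of holding a lock around many small prints) can be sketched roughly as below. This is only an illustration in standard C++: `std::ostringstream` stands in for HotSpot's `stringStream`, the `shared` stream stands in for `tty`, and `build_report` / `print_report` are hypothetical names that do not exist in the JDK sources. Whether a single write is really atomic depends on the underlying stream; the sketch simply assumes it.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a multi-line dump. All lines are collected in a
// private buffer first, so no other writer can get in between them.
std::string build_report(const std::vector<std::string>& lines) {
  std::ostringstream buffer;  // plays the role of a stringStream here
  for (const std::string& line : lines) {
    buffer << line << '\n';
  }
  return buffer.str();
}

// Emits the buffered report with a single write call on the shared stream,
// instead of one call per line under a lock. Atomicity of that single write
// is an assumption about the stream, not something this code enforces.
void print_report(std::ostream& shared, const std::vector<std::string>& lines) {
  const std::string report = build_report(lines);
  shared.write(report.data(), static_cast<std::streamsize>(report.size()));
}
```

The design point is that the only shared resource touched is the final stream, and it is touched once, so no printing lock has to be held while other locks are taken.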
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17486#discussion_r1461660805 From kbarrett at openjdk.org Mon Jan 22 10:44:38 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 22 Jan 2024 10:44:38 GMT Subject: RFR: 8324286: Fix backsliding on use of nullptr instead of NULL Message-ID: Please review this change that removes some new (since JDK-8299837) uses of NULL in HotSpot code. Most of the changes are in comments, replacing "NULL" with either "nullptr" (for code snippets) or "null" (for textual description), as was done for JDK-8299837. There are a small number of new uses of NULL in code, which are replaced with nullptr. Testing: mach5 tier1 ------------- Commit messages: - fix backsliding on NULL usage Changes: https://git.openjdk.org/jdk/pull/17516/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17516&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324286 Stats: 19 lines in 9 files changed: 0 ins; 0 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/17516.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17516/head:pull/17516 PR: https://git.openjdk.org/jdk/pull/17516 From epeter at openjdk.org Mon Jan 22 10:44:43 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 22 Jan 2024 10:44:43 GMT Subject: RFR: 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 10:33:42 GMT, David Holmes wrote: >> I'm removing some instances of `ttyLocker`. Instead of locking, I first put all the output on a `stringStream`, and then print this stream all at once, which is atomic. >> >> Removing the `ttyLocker` is nice, because it means we have less interference with other locking mechanisms, such as the `extra_data_lock` cases I have to introduce with [JDK-8306767](https://bugs.openjdk.org/browse/JDK-8306767). >> >> @coleenp wished that I do this separately, so I filed this RFE here. > > Marked as reviewed by dholmes (Reviewer). 
Thanks @dholmes-ora @iwanowww @vnkozlov for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17486#issuecomment-1903708335 From epeter at openjdk.org Mon Jan 22 10:44:44 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 22 Jan 2024 10:44:44 GMT Subject: Integrated: 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 In-Reply-To: References: Message-ID: On Thu, 18 Jan 2024 15:17:27 GMT, Emanuel Peter wrote: > I'm removing some instances of `ttyLocker`. Instead of locking, I first put all the output on a `stringStream`, and then print this stream all at once, which is atomic. > > Removing the `ttyLocker` is nice, because it means we have less interference with other locking mechanisms, such as the `extra_data_lock` cases I have to introduce with [JDK-8306767](https://bugs.openjdk.org/browse/JDK-8306767). > > @coleenp wished that I do this separately, so I filed this RFE here. This pull request has now been integrated. Changeset: c84af493 Author: Emanuel Peter URL: https://git.openjdk.org/jdk/commit/c84af4938647efbc2d6c94efef748446bf6d50b4 Stats: 73 lines in 11 files changed: 14 ins; 2 del; 57 mod 8324129: C2: Remove some ttyLocker usages in preparation for JDK-8306767 Reviewed-by: kvn, vlivanov, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/17486 From shade at openjdk.org Mon Jan 22 10:48:27 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 22 Jan 2024 10:48:27 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() In-Reply-To: References: Message-ID: <19qGKlnx8PjSXe_r4KO_Pr1-Ya1WuFvZghjjAUJFimo=.38862996-b5e5-4879-8703-fe44ce9f9cdf@github.com> On Sun, 21 Jan 2024 07:29:45 GMT, Kim Barrett wrote: > Please review this change to use OopHandle::is_empty() rather than comparing > the result of OopHandle::ptr_raw() with nullptr. While equivalent, the former > is the intended API for such checks. ptr_raw should only be used directly > where it is actually needed. 
> > Testing: mach5 tier1. Looks reasonable. I guess the use in `ClassLoaderData::remove_handle` is fine, because we want to assert it? Related, pre-existing: the use in `ClassLoaderData::print_on` is also odd. This reports the address of oophandle slot, not the classloader oop itself? Should probably be `.peek()`? out->print_cr(" - class loader " INTPTR_FORMAT, p2i(_class_loader.ptr_raw())); ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17510#pullrequestreview-1836174566 From jsjolen at openjdk.org Mon Jan 22 10:52:28 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 22 Jan 2024 10:52:28 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() In-Reply-To: References: Message-ID: On Sun, 21 Jan 2024 07:29:45 GMT, Kim Barrett wrote: > Please review this change to use OopHandle::is_empty() rather than comparing > the result of OopHandle::ptr_raw() with nullptr. While equivalent, the former > is the intended API for such checks. ptr_raw should only be used directly > where it is actually needed. > > Testing: mach5 tier1. Oh, I didn't approve it in my last comment. ------------- Marked as reviewed by jsjolen (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17510#pullrequestreview-1836183417 From jsjolen at openjdk.org Mon Jan 22 10:52:29 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 22 Jan 2024 10:52:29 GMT Subject: RFR: 8324286: Fix backsliding on use of nullptr instead of NULL In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 10:39:12 GMT, Kim Barrett wrote: > Please review this change that removes some new (since JDK-8299837) uses of > NULL in HotSpot code. Most of the changes are in comments, replacing "NULL" > with either "nullptr" (for code snippets) or "null" (for textual description), > as was done for JDK-8299837. There are a small number of new uses of NULL in > code, which are replaced with nullptr. 
> > Testing: mach5 tier1 LGTM and trivial ------------- Marked as reviewed by jsjolen (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17516#pullrequestreview-1836182519 From epeter at openjdk.org Mon Jan 22 10:52:59 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 22 Jan 2024 10:52:59 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v23] In-Reply-To: References: Message-ID: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Complications with ttyl** > There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. 
To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: - Merge branch 'master' into JDK-8306767 - add one nsv again - rm pause_no_safepoints - remove NoSafepointMutexLocker - improved comment for Roland - NoSafepointMutexLocker - Update src/hotspot/share/runtime/deoptimization.cpp rm empty line - change patch to deoptimization.cpp case brought up by Roland - fixed typo - refactor MethodData::bci_to_extra_data - remove redundant code - ... and 15 more: https://git.openjdk.org/jdk/compare/c84af493...5f471ff5 ------------- Changes: https://git.openjdk.org/jdk/pull/16840/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=22 Stats: 156 lines in 14 files changed: 110 ins; 16 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From epeter at openjdk.org Mon Jan 22 11:14:41 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 22 Jan 2024 11:14:41 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v24] In-Reply-To: References: Message-ID: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. 
> > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks and no-safepoint-verifiers at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Complications with ttyl** > There were a few places in printing code, where did `ttyLocker ttyl;`, and then in that scope we would access the extra data. Now that I introduced locking with `extra_data_lock`, this ran into asserts which check the lock ranks: `ttyl` has a very low rank, and `extra_data_lock` a rather high lock. Hence, we cannot lock `extra_data_lock` inside a `ttyl` scope. > > If we were to simply remove the `ttyl` locking, then the many print lines inside that scope might be interrupted and another thread can insert other printing in between. To avoid that, I now first buffer all lines in a `stringStream`, and then print that buffered stream to `tty` all at once, which means no other printing can be injected in between. > > **Testing** > Testing: tier1-3 and stress. 
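The "check that the lock is held at every access" idea from the description above can be sketched in standard C++ as follows. This is a rough model under stated assumptions, not the HotSpot code: `GuardedData`, `Locker`, and `check_locked` are hypothetical names, `std::mutex` stands in for the VM's `Mutex`, and the safepoint part (`NoSafepointVerifier`) has no equivalent here and is left out entirely.

```cpp
#include <atomic>
#include <mutex>
#include <stdexcept>
#include <thread>

// Hypothetical model of data that may only be touched while its lock is held.
class GuardedData {
 private:
  std::mutex lock_;
  // Id of the thread currently holding lock_, or a default id if nobody does.
  std::atomic<std::thread::id> owner_{std::thread::id()};
  int value_ = 0;

  // Analogous in spirit to a "check locked" helper: every accessor calls it.
  void check_locked() const {
    if (owner_.load() != std::this_thread::get_id()) {
      throw std::logic_error("data accessed without holding its lock");
    }
  }

 public:
  // RAII locker: takes the lock, then records the owning thread. The owner is
  // cleared in the destructor body, before the lock itself is released.
  class Locker {
   public:
    explicit Locker(GuardedData& data) : data_(data), guard_(data.lock_) {
      data_.owner_.store(std::this_thread::get_id());
    }
    ~Locker() { data_.owner_.store(std::thread::id()); }

   private:
    GuardedData& data_;
    std::lock_guard<std::mutex> guard_;
  };

  void set(int v) { check_locked(); value_ = v; }
  int get() const { check_locked(); return value_; }
};
```

Callers then take the lock at the call site (`GuardedData::Locker l(d);`) and the accessors only verify it, which mirrors the split described above between the checked access points and the locking done by their callers.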
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: cleanup unnecessary changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/5f471ff5..ff581b05 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=23 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=22-23 Stats: 4 lines in 4 files changed: 0 ins; 3 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From epeter at openjdk.org Mon Jan 22 11:19:31 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 22 Jan 2024 11:19:31 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v19] In-Reply-To: References: <8sr3y54p6pr5Fp4rq9DUIYUbmp92XJjxQqGcOY91kv8=.ea8c9500-476f-4fea-b4cb-debdaa2820c3@github.com> Message-ID: On Wed, 17 Jan 2024 16:16:19 GMT, Coleen Phillimore wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> improved comment for Roland > > I haven't yet reviewed all of this but this mechanism seems unnecessary and I'd like to understand why this would be added. @coleenp the other change is integrated and merged into this PR. @tkrodriguez @fisk @rwestrel would you mind (re-)reviewing? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16840#issuecomment-1903788801 From ayang at openjdk.org Mon Jan 22 11:38:47 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 22 Jan 2024 11:38:47 GMT Subject: RFR: 8324301: Obsolete MaxGCMinorPauseMillis Message-ID: <8_luDLuWf0--KpMbX-OD8RS0lnaiYYHbbyJ9FSCZJ2g=.ac949af5-4cac-4e8b-a44d-7a9eaa9803fe@github.com> Simply obsoleting a deprecated JVM flag. 
------------- Commit messages: - obsolete-flag Changes: https://git.openjdk.org/jdk/pull/17517/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17517&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324301 Stats: 19 lines in 6 files changed: 1 ins; 17 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17517.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17517/head:pull/17517 PR: https://git.openjdk.org/jdk/pull/17517 From coleenp at openjdk.org Mon Jan 22 13:28:25 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Jan 2024 13:28:25 GMT Subject: RFR: 8324286: Fix backsliding on use of nullptr instead of NULL In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 10:39:12 GMT, Kim Barrett wrote: > Please review this change that removes some new (since JDK-8299837) uses of > NULL in HotSpot code. Most of the changes are in comments, replacing "NULL" > with either "nullptr" (for code snippets) or "null" (for textual description), > as was done for JDK-8299837. There are a small number of new uses of NULL in > code, which are replaced with nullptr. > > Testing: mach5 tier1 Looks good, can be argued as trivial. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17516#pullrequestreview-1836466403 From jwaters at openjdk.org Mon Jan 22 13:32:27 2024 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 22 Jan 2024 13:32:27 GMT Subject: RFR: 8324286: Fix backsliding on use of nullptr instead of NULL In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 10:39:12 GMT, Kim Barrett wrote: > Please review this change that removes some new (since JDK-8299837) uses of > NULL in HotSpot code. Most of the changes are in comments, replacing "NULL" > with either "nullptr" (for code snippets) or "null" (for textual description), > as was done for JDK-8299837. There are a small number of new uses of NULL in > code, which are replaced with nullptr. 
> > Testing: mach5 tier1 Easy change to review, so I might as well. ------------- Marked as reviewed by jwaters (Committer). PR Review: https://git.openjdk.org/jdk/pull/17516#pullrequestreview-1836473838 From coleenp at openjdk.org Mon Jan 22 13:42:26 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Jan 2024 13:42:26 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() In-Reply-To: References: Message-ID: On Sun, 21 Jan 2024 07:29:45 GMT, Kim Barrett wrote: > Please review this change to use OopHandle::is_empty() rather than comparing > the result of OopHandle::ptr_raw() with nullptr. While equivalent, the former > is the intended API for such checks. ptr_raw should only be used directly > where it is actually needed. > > Testing: mach5 tier1. Can ptr_raw be made private with this change? ------------- PR Review: https://git.openjdk.org/jdk/pull/17510#pullrequestreview-1836494219 From stuefe at openjdk.org Mon Jan 22 13:43:28 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 22 Jan 2024 13:43:28 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v3] In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 08:25:58 GMT, Yi Yang wrote: > Great work. I've seen too many stories where JVM/JNI memory leaks occur or allocators are reluctant to free memory to the OS and are almost tracelessly terminated by Linux's OOM killer. Having the VM recognize and log this termination process can leave us with many clues for troubleshooting. Thank you. I will pick up work on this - and refine it - in March probably. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-1904027864 From coleenp at openjdk.org Mon Jan 22 13:46:27 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Jan 2024 13:46:27 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() In-Reply-To: References: Message-ID: On Sun, 21 Jan 2024 07:29:45 GMT, Kim Barrett wrote: > Please review this change to use OopHandle::is_empty() rather than comparing > the result of OopHandle::ptr_raw() with nullptr. While equivalent, the former > is the intended API for such checks. ptr_raw should only be used directly > where it is actually needed. > > Testing: mach5 tier1. Answer to my previous question. I guess not. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17510#pullrequestreview-1836503027 From coleenp at openjdk.org Mon Jan 22 13:51:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Jan 2024 13:51:34 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v8] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> Message-ID: <7PkGuAb9tWsKE3X7Y-nc6Mvw4uKXmYq2U2eKmios24M=.f2b5ca67-9844-42f6-a18a-61093c1b19a3@github.com> On Fri, 19 Jan 2024 09:46:57 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 977: >> >>> 975: jcc(Assembler::notZero, inflated); >>> 976: >>> 977: // Check if lock-stack is full. >> >> Why doesn't this call MacroAssembler::lightweight_lock here? > > The main idea is to keep it separate, where the MacroAssembler caters to the interpreter's needs. > > The biggest difference is that the interpreter has one fewer register, so it does some more juggling. > And on a fundamental level the interpreter must handle unstructured locking, while C2 does not. 
There is no practical difference to the lock/enter logic because of this difference, but there is for the unlock/exit logic. > > At some level the split should be that we have one implementation that handles the fixed number of registers and unstructured locking scenario (interpreter). And one implementation which handles the case where we have a register allocator and assumes structured locking (C1, C2). The native wrapper is somewhere in-between, is structured, but maybe has to use the first implementation due to register pressure. Ok, that makes sense and is a good idea for a future enhancement. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1459944578 From coleenp at openjdk.org Mon Jan 22 13:51:29 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Jan 2024 13:51:29 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v9] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> Message-ID: On Fri, 19 Jan 2024 10:04:51 GMT, Axel Boldt-Christmas wrote: >> That sounds like a good idea. Will see how it ends up. > > It is a little bit awkward that they both have to restore the held monitor count. Ok, I've looked at the control flow more, which is not simple. Both have the slow path continuation() exit, and only the exiting with a successor has the unlocked exit. Can you rename continuation() to slow_path(), because the name "continuation" doesn't help? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1459852218 From coleenp at openjdk.org Mon Jan 22 13:51:32 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Jan 2024 13:51:32 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v9] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> Message-ID: <8XWKuuqgQT9KsbJjVKSANKayHN1Ns7FEYcQisG1taF0=.4a574efd-0cd2-497e-938f-8ea621515bf6@github.com> On Fri, 19 Jan 2024 08:38:07 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 120: >> >>> 118: // The owner may be anonymous and we removed the last obj entry in >>> 119: // the lock-stack. This loses the information about the owner. >>> 120: // Write the thread to the owner field so the runtime knows the owner. >> >> I'm confused by this comment. We get here if the monitor is inflated, so we didn't remove it from the lock stack. > > True. This comment was written when there was an explicit monitor check before the CAS that jumped to inflated. I am not sure if there is a situation where the owner is anonymous here now. > > It should be invariant that if a thread's lock stack does not contain the oop, performs an unlock/monitorexit, the monitor is inflated and the owner is not anonymous. > > At all places in the runtime when removing the oops from the lock stack the owner field is fixed. > > And in the emitted code the oop is pushed back to the lock stack in case of a failed unlock. > > It may be worth keeping this, and in the slow path after the CAS failed, check if it failed because of inflation, fix the owner field and jump back to the inflated fast path without transitioning to VM. I'm still confused by this. I added this, because iiuc this is an invariant here? 
+#ifdef ASSERT + Label skip; + __ testb(Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)), (int32_t) ObjectMonitor::ANONYMOUS_OWNER); + __ jccb(Assembler::equal, skip); + __ stop("owner is anonymous"); + __ bind(skip); +#endif + I don't suggest adding this but am still trying to understand this comment. >> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1120: >> >>> 1118: xorptr(reg_rax, reg_rax); >>> 1119: orptr(reg_rax, Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(recursions))); >>> 1120: jcc(Assembler::notZero, check_successor); >> >> I don't know why the LP64/!LP64 paths are different. Do we not decrement recursions on 32 bit, and why wouldn't we? > > The idea was not to change the inflated unlocking in this PR. x86_32 does not handle recursions nor successor optimisation. I see no reason that they cannot be merged and just have 32bit use the 64bit logic. However the thinking was to keep that to a separate RFE. Ok, we can have a follow up RFE to consolidate this, so we don't have the conditional. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1461886993 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1459940998 From ngasson at openjdk.org Mon Jan 22 14:01:27 2024 From: ngasson at openjdk.org (Nick Gasson) Date: Mon, 22 Jan 2024 14:01:27 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 10:18:27 GMT, Andrew Haley wrote: > I wonder if this was tested on other vendors' hardware? I witnessed some negative impact at least on HiSilicon TSV110 running the same JMH. So I guess it might be safer to go as a vendor-specific change. I tried a number of different machines and saw regressions only on Kunpeng-920 (same CPU?) and A57 which is quite niche at this point. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17511#issuecomment-1904060889 From kbarrett at openjdk.org Mon Jan 22 14:58:26 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 22 Jan 2024 14:58:26 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 13:39:39 GMT, Coleen Phillimore wrote: > Can ptr_raw be made private with this change? Not yet. That's the general direction I was trying for, but I'm not sure it's achievable. At least not without some serious reworking of some OopHandle uses. But I found some little cleanups like this while I was exploring. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17510#issuecomment-1904177364 From duke at openjdk.org Mon Jan 22 15:04:33 2024 From: duke at openjdk.org (kuaiwei) Date: Mon, 22 Jan 2024 15:04:33 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 10:14:28 GMT, Andrew Haley wrote: > > Could we instead make the last store to a final field in a constructor an STLR and remove the release barrier? > > It's possible, but it would be more work. I tried it before and failed. We can not bind the final field store with stlr, because the store which publish the new obj to heap can float above and cause trouble. It may work if bind the stlr with the publish store. But we may add multiple stlr. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17511#issuecomment-1904188653 From duke at openjdk.org Mon Jan 22 15:12:26 2024 From: duke at openjdk.org (kuaiwei) Date: Mon, 22 Jan 2024 15:12:26 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 13:58:35 GMT, Nick Gasson wrote: > > I wonder if this was tested on other vendors' hardware? I witnessed some negative impact at least on HiSilicon TSV110 running the same JMH. 
So I guess it might be safer to go as a vendor-specific change. > > I tried a number of different machines and saw regressions only on Kunpeng-920 (same CPU?) and A57 which is quite niche at this point. Thanks for testing on other architectures. We may need a new architecture-dependent flag for it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17511#issuecomment-1904205751 From aboldtch at openjdk.org Mon Jan 22 15:42:27 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 22 Jan 2024 15:42:27 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v9] In-Reply-To: <8XWKuuqgQT9KsbJjVKSANKayHN1Ns7FEYcQisG1taF0=.4a574efd-0cd2-497e-938f-8ea621515bf6@github.com> References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> <8XWKuuqgQT9KsbJjVKSANKayHN1Ns7FEYcQisG1taF0=.4a574efd-0cd2-497e-938f-8ea621515bf6@github.com> Message-ID: On Mon, 22 Jan 2024 13:48:12 GMT, Coleen Phillimore wrote: >> True. This comment was written when there was an explicit monitor check before the CAS that jumped to inflated. I am not sure if there is a situation where the owner is anonymous here now. >> >> It should be invariant that if a thread's lock stack does not contain the oop and it performs an unlock/monitorexit, the monitor is inflated and the owner is not anonymous. >> >> At all places in the runtime when removing the oops from the lock stack the owner field is fixed. >> >> And in the emitted code the oop is pushed back to the lock stack in case of a failed unlock. >> >> It may be worth keeping this, and in the slow path after the CAS failed, check if it failed because of inflation, fix the owner field and jump back to the inflated fast path without transitioning to VM. > > I'm still confused by this. I added this, because iiuc this is an invariant here?
>
>
> +#ifdef ASSERT
> +  Label skip;
> +  __ testb(Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)), (int32_t) ObjectMonitor::ANONYMOUS_OWNER);
> +  __ jccb(Assembler::equal, skip);
> +  __ stop("owner is anonymous");
> +  __ bind(skip);
> +#endif
> +
>
>
> I don't suggest adding this but am still trying to understand this comment.

I ran through tier1-tier7 with the even stronger invariant that the owner is the thread. The current comment (in the code) is outdated.

I plan to push a fixed version, just wanted to figure out what makes the most sense. Given that the monitor check is elided now, the fixing the owner field should be removed. Alternatively let the slow path check for monitor after the CAS failed, and jump back to the inflated case. In this case the comment and the fixing of the owner field would be important.

It is not immediately obvious to me that the alternative is worth it. Because we removed hashcode causing inflation the only point where this code would avoid going into the runtime by checking for inflated after a CAS fail is if another thread is in entering on the object monitor in parallel and we can do a direct handoff. Otherwise it would be on the entry / cxq queue and going into the runtime is required. The pragmatic solution would be to remove the store and add the assert for this PR. And create an RFE to evaluate this.

Ran through with this code.

```c++
#ifdef ASSERT
  Label correct_owner;
  __ movptr(_t, Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)));
  __ cmpptr(_t, _thread);
  __ jccb(Assembler::equal, correct_owner);
  __ testptr(_t, Address(_t)); // Crash on bad address
  __ stop("Bad owner");
  __ bind (correct_owner);
#endif
```

I will rebase and push the changes tomorrow.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1462044892 From coleenp at openjdk.org Mon Jan 22 15:59:29 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Jan 2024 15:59:29 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v9] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> <8XWKuuqgQT9KsbJjVKSANKayHN1Ns7FEYcQisG1taF0=.4a574efd-0cd2-497e-938f-8ea621515bf6@github.com> Message-ID: On Mon, 22 Jan 2024 15:39:24 GMT, Axel Boldt-Christmas wrote: >> I'm still confused by this. I added this, because iiuc this is an invariant here? >> >> >> +#ifdef ASSERT >> + Label skip; >> + __ testb(Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)), (int32_t) ObjectMonitor::ANONYMOUS_OWNER); >> + __ jccb(Assembler::equal, skip); >> + __ stop("owner is anonymous"); >> + __ bind(skip); >> +#endif >> + >> >> >> I don't suggest adding this but am still trying to understand this comment. > > I ran through tier1-tier7 with the even stronger invariant that the owner is the thread. > The current comment (in the code) is outdated. > > I plan to push a fixed version, just wanted to figure out what makes the most sense. > Given that the monitor check is elided now, the fixing the owner field should be removed. > Alternatively let the slow path check for monitor after the CAS failed, and jump back to the > inflated case. In this case the comment and the fixing of the owner field would be important. > > It is not immediately obvious to me that the alternative is worth it. Because we removed hashcode causing inflation the only point where this code would avoid going into the runtime by checking for inflated after a CAS fail is if another thread is in entering on the object monitor in parallel and we can do a direct handoff. Otherwise it would be on the entry / cxq queue and going into the runtime is required. 
> The pragmatic solution would be to remove the store and add the assert for this PR. And create an RFE to evaluate this.
>
> Ran through with this code.
> ```c++
> #ifdef ASSERT
>   Label correct_owner;
>   __ movptr(_t, Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)));
>   __ cmpptr(_t, _thread);
>   __ jccb(Assembler::equal, correct_owner);
>   __ testptr(_t, Address(_t)); // Crash on bad address
>   __ stop("Bad owner");
>   __ bind (correct_owner);
> #endif
> ```
>
> I will rebase and push the changes tomorrow.

I like how with your patch all the knowledge about the anonymous owner is in synchronizer.cpp, which is why this comment and code would be nicer to be absent. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1462069167 From eastigeevich at openjdk.org Mon Jan 22 16:13:31 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 22 Jan 2024 16:13:31 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v4] In-Reply-To: <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> Message-ID: On Thu, 18 Jan 2024 17:08:29 GMT, Boris Ulasevich wrote: >> These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. >> >> Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed.
The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > apply suggestions src/hotspot/share/code/codeCache.cpp line 186: > 184: } else { > 185: log_debug(codecache)("CodeCache minimum size fail for %s %lld vs %lld", > 186: codeheap, (long long) size, (long long) required_size); log_debug(codecache)("Code heap (%s) size " SIZE_FORMAT " below required minimal size " SIZE_FORMAT, code_heap, size, required_size); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1462087702 From aturbanov at openjdk.org Mon Jan 22 16:23:32 2024 From: aturbanov at openjdk.org (Andrey Turbanov) Date: Mon, 22 Jan 2024 16:23:32 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 01:58:32 GMT, kuaiwei wrote: > Details are at https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071921.html. > Using a combined dmb.ish for release barrier will introduce a heavy storeload barrier. By using a "dmb.ishst+dmb.ishld" pair instead, we can gain performance improvement on N1 and N2 architecture. The benchmark is test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java > Run with ParallelGC to minimize impact of gc barrier. > > make test TEST="micro:org.openjdk.bench.vm.compiler.FinalFieldInitialize" MICRO="VM_OPTIONS=-XX:+UseParallelGC" > ... > FinalFieldInitialize.testAllocWithFinal thrpt 9 1411.601 ± 6.546 ops/s > > Without the patch > > FinalFieldInitialize.testAllocWithFinal thrpt 9 1214.575 ±
14.217 ops/s test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java line 70: > 68: public TObj() { > 69: i = 10; > 70: l = 100l; Suggestion: l = 100L; test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java line 82: > 80: public TObjWithFinal() { > 81: i = 10; > 82: l = 100l; Suggestion: l = 100L; ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17511#discussion_r1462100254 PR Review Comment: https://git.openjdk.org/jdk/pull/17511#discussion_r1462100420 From eastigeevich at openjdk.org Mon Jan 22 16:23:32 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 22 Jan 2024 16:23:32 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v4] In-Reply-To: <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> Message-ID: <7Hc2-ShW7e6KEQq2kXn7MdWF-6tr2-QIFKqSRxtcMzM=.af16d51a-49fa-40be-9eb0-de558470d915@github.com> On Thu, 18 Jan 2024 17:08:29 GMT, Boris Ulasevich wrote: >> These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. >> >> Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: > > apply suggestions src/hotspot/share/code/codeCache.cpp line 196: > 194: struct CodeHeapInfo { > 195: size_t size; > 196: bool set; IMO, having `CodeBlobType type` member would be useful. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1462100905 From eastigeevich at openjdk.org Mon Jan 22 19:29:28 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 22 Jan 2024 19:29:28 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v4] In-Reply-To: <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> Message-ID: On Thu, 18 Jan 2024 17:08:29 GMT, Boris Ulasevich wrote: >> These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. >> >> Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: > > apply suggestions

src/hotspot/share/code/codeCache.cpp line 260:

> 258:     non_profiled.set = true;
> 259:     non_profiled.enabled = false;
> 260:   }

It can be rewritten to the shorter version:

// If either heap1 or heap2 is not available, its size is added to the size of the available heap.
static void reuse_unavailable_heap_size(CodeHeapInfo *heap1, CodeHeapInfo *heap2) {
  assert(CodeCache::heap_available(heap1->type) || CodeCache::heap_available(heap2->type), "At least one code heap must be available");
  if (!CodeCache::heap_available(heap1->type)) {
    swap(heap1, heap2);
  } else if (CodeCache::heap_available(heap2->type)) {
    // Both code heaps are available. Nothing needs to be done.
    return;
  }
  heap1->size += heap2->size;
  heap2->size = 0;
  heap2->set = true;
  heap2->enabled = false;
}

Now we can replace those two IFs with:

// For compatibility we add the size of an unavailable code heap to the size of the available heap.
reuse_unavailable_heap_size(&non_profiled, &profiled);

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1462310190 From eastigeevich at openjdk.org Mon Jan 22 19:48:30 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 22 Jan 2024 19:48:30 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v4] In-Reply-To: <7Hc2-ShW7e6KEQq2kXn7MdWF-6tr2-QIFKqSRxtcMzM=.af16d51a-49fa-40be-9eb0-de558470d915@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> <7Hc2-ShW7e6KEQq2kXn7MdWF-6tr2-QIFKqSRxtcMzM=.af16d51a-49fa-40be-9eb0-de558470d915@github.com> Message-ID: On Mon, 22 Jan 2024 16:21:02 GMT, Evgeny Astigeevich wrote: >> Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed.
The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> apply suggestions > > src/hotspot/share/code/codeCache.cpp line 196: > >> 194: struct CodeHeapInfo { >> 195: size_t size; >> 196: bool set; > > IMO, having `CodeBlobType type` member would be useful. IMO, having `size_t min_size` would be useful. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1462336805 From eastigeevich at openjdk.org Mon Jan 22 19:56:30 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 22 Jan 2024 19:56:30 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v4] In-Reply-To: <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> Message-ID: On Thu, 18 Jan 2024 17:08:29 GMT, Boris Ulasevich wrote: >> These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. >> >> Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: > > apply suggestions src/hotspot/share/code/codeCache.cpp line 264: > 262: size_t compiler_buffer_size = 0; > 263: COMPILER1_PRESENT(compiler_buffer_size += CompilationPolicy::c1_count() * Compiler::code_buffer_size()); > 264: COMPILER2_PRESENT(compiler_buffer_size += CompilationPolicy::c2_count() * C2Compiler::initial_code_buffer_size()); We can move the code inside the following IF. We can move ` size_t non_nmethod_min_size = min_cache_size` before the IF and adjust it in the IF. This could be simplified with `CodeHeapInfo::min_size`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1462346756 From eastigeevich at openjdk.org Mon Jan 22 20:00:29 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 22 Jan 2024 20:00:29 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v4] In-Reply-To: <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> Message-ID: On Thu, 18 Jan 2024 17:08:29 GMT, Boris Ulasevich wrote: >> These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. >> >> Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. 
The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > apply suggestions src/hotspot/share/code/codeCache.cpp line 272: > 270: if (!profiled.set && !non_profiled.set) { > 271: non_profiled.size = profiled.size = (cache_size > non_nmethod.size + 2 * min_size) ? > 272: (cache_size - non_nmethod.size) / 2 : min_size; The code calculating default sizes is not simple. Are you sure your code produces the same results? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1462351067 From kvn at openjdk.org Mon Jan 22 21:09:27 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 22 Jan 2024 21:09:27 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization In-Reply-To: References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> Message-ID: <2ZFeRJC65HueklUUcFw4Oo9Amagbsz9AAFiOsFs2eGk=.88cc0d04-b978-40ca-a9e7-c3f45f7b191e@github.com> On Sat, 20 Jan 2024 18:39:25 GMT, Quan Anh Mai wrote: > This seems similar to [a recent discussion](https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071521.html). There, it is decided that a release barrier would be safer. Should we do it similarly here? Thanks. I think, if decided, it should be done in separate RFE uniformly in all places: Interpreter, C1, C2 and here in deoptimization code. 
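
For readers less familiar with the distinction being discussed, the publication pattern can be sketched in portable C++. This is only an illustration under stated assumptions, not HotSpot code: `Box`, `g_published`, `publish` and `try_consume_sum` are hypothetical names. Standard C++ has no pure store-store fence, so the sketch uses a release fence, which is the stronger of the two options mentioned above. The reader uses an acquire load so that the example is correct under the C++ memory model.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for an object whose fields are initialized
// before a reference to it becomes visible to other threads.
struct Box {
    int  a;
    long b;
};

// Hypothetical shared slot through which the object is published.
std::atomic<Box*> g_published{nullptr};

// Initialize the fields, then publish the pointer.
void publish(Box* box) {
    box->a = 10;
    box->b = 100L;
    // Release fence: the field stores above cannot be reordered
    // after the publishing store below.
    std::atomic_thread_fence(std::memory_order_release);
    g_published.store(box, std::memory_order_relaxed);
}

// Returns a + b if an object has been published, or -1 otherwise.
long try_consume_sum() {
    Box* p = g_published.load(std::memory_order_acquire);
    if (p == nullptr) {
        return -1;  // nothing published yet
    }
    return p->a + p->b;
}
```

A reader thread that polls `try_consume_sum()` either sees -1 or the sum of the fully initialized fields. Without the fence, a weakly ordered CPU could make the pointer visible before the field stores.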
------------- PR Comment: https://git.openjdk.org/jdk/pull/17503#issuecomment-1904816574 From kvn at openjdk.org Mon Jan 22 21:17:41 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 22 Jan 2024 21:17:41 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization [v2] In-Reply-To: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> Message-ID: > Added missing store-store barrier when we re-materialize scalar replaced object during deoptimization. > > I also removed redundant `#if COMPILER2_OR_JVMCI` guards which were leftover from [JDK-8312579](https://bugs.openjdk.org/browse/JDK-8312579) changes. It added Vector API support to Graal and changed `#ifdef COMPILER2` to these `#if`. But this code is already under these `ifs`. > > Tested tier1-3, scope, stress. > > No new regression test. I think it is "almost" impossible to hit this issue because there is a lot of VM's runtime code between the code which rematerialize scalar-replaced objects during deoptimization and a code in Interpreter which is executed after deoptimization and which may execute a store instruction that makes these objects accessible by other threads. 
Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Restore COMPILER2_OR_JVMCI changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17503/files - new: https://git.openjdk.org/jdk/pull/17503/files/49d44048..392135ed Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17503&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17503&range=00-01 Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17503.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17503/head:pull/17503 PR: https://git.openjdk.org/jdk/pull/17503 From kvn at openjdk.org Mon Jan 22 21:17:43 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 22 Jan 2024 21:17:43 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization [v2] In-Reply-To: <6P3xTF6Fklma9FNk0A3b8sqv6MGxmPEnoeCLZ_fAiwY=.84bce8bf-6a17-4950-bab7-cf0f3d0254f2@github.com> References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> <6P3xTF6Fklma9FNk0A3b8sqv6MGxmPEnoeCLZ_fAiwY=.84bce8bf-6a17-4950-bab7-cf0f3d0254f2@github.com> Message-ID: On Mon, 22 Jan 2024 09:47:25 GMT, Aleksey Shipilev wrote: > It looks good, but let's not put unrelated changes together? I think the `COMPILER2_OR_JVMCI` should come in as a separate atomic change. This will, for example, allow to cleanly backport `storestore` additions without looking back whether the vector support enablement hunks make sense. Thank you, Aleksey Right, backports. I removed `COMPILER2_OR_JVMCI` changes. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17503#issuecomment-1904825930 From shade at openjdk.org Mon Jan 22 21:17:42 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 22 Jan 2024 21:17:42 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization [v2] In-Reply-To: References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> Message-ID: <7abL_I4fUq351ipDpf08el3781EZXfKaWtEoaksnnGw=.bbc4b1d5-4ab5-4f29-ad3d-37b5ff7cbf29@github.com> On Mon, 22 Jan 2024 21:14:17 GMT, Vladimir Kozlov wrote: >> Added missing store-store barrier when we re-materialize scalar replaced object during deoptimization. >> >> I also removed redundant `#if COMPILER2_OR_JVMCI` guards which were leftover from [JDK-8312579](https://bugs.openjdk.org/browse/JDK-8312579) changes. It added Vector API support to Graal and changed `#ifdef COMPILER2` to these `#if`. But this code is already under these `ifs`. >> >> Tested tier1-3, scope, stress. >> >> No new regression test. I think it is "almost" impossible to hit this issue because there is a lot of VM's runtime code between the code which rematerialize scalar-replaced objects during deoptimization and a code in Interpreter which is executed after deoptimization and which may execute a store instruction that makes these objects accessible by other threads. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Restore COMPILER2_OR_JVMCI changes Marked as reviewed by shade (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/17503#pullrequestreview-1837385824 From kvn at openjdk.org Mon Jan 22 21:51:26 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 22 Jan 2024 21:51:26 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization [v2] In-Reply-To: References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> Message-ID: On Mon, 22 Jan 2024 21:17:41 GMT, Vladimir Kozlov wrote: >> Added missing store-store barrier when we re-materialize scalar replaced object during deoptimization. >> >> I also removed redundant `#if COMPILER2_OR_JVMCI` guards which were leftover from [JDK-8312579](https://bugs.openjdk.org/browse/JDK-8312579) changes. It added Vector API support to Graal and changed `#ifdef COMPILER2` to these `#if`. But this code is already under these `ifs`. >> >> Tested tier1-3, scope, stress. >> >> No new regression test. I think it is "almost" impossible to hit this issue because there is a lot of VM's runtime code between the code which rematerialize scalar-replaced objects during deoptimization and a code in Interpreter which is executed after deoptimization and which may execute a store instruction that makes these objects accessible by other threads. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Restore COMPILER2_OR_JVMCI changes Thank you, Dean, Aleksey and Quan for reviews and comments. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17503#issuecomment-1904879882 From kvn at openjdk.org Mon Jan 22 22:53:30 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 22 Jan 2024 22:53:30 GMT Subject: Integrated: 8324050: Issue store-store barrier after re-materializing objects during deoptimization In-Reply-To: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> Message-ID: On Fri, 19 Jan 2024 22:00:15 GMT, Vladimir Kozlov wrote: > Added missing store-store barrier when we re-materialize scalar replaced object during deoptimization. > > I also removed redundant `#if COMPILER2_OR_JVMCI` guards which were leftover from [JDK-8312579](https://bugs.openjdk.org/browse/JDK-8312579) changes. It added Vector API support to Graal and changed `#ifdef COMPILER2` to these `#if`. But this code is already under these `ifs`. > > Tested tier1-3, scope, stress. > > No new regression test. I think it is "almost" impossible to hit this issue because there is a lot of VM's runtime code between the code which rematerialize scalar-replaced objects during deoptimization and a code in Interpreter which is executed after deoptimization and which may execute a store instruction that makes these objects accessible by other threads. This pull request has now been integrated. 
Changeset: 52523d33 Author: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/52523d33dde797bf03b15a05bb227b19b22c06be Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod 8324050: Issue store-store barrier after re-materializing objects during deoptimization Reviewed-by: dlong, shade ------------- PR: https://git.openjdk.org/jdk/pull/17503 From dholmes at openjdk.org Tue Jan 23 01:35:27 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 23 Jan 2024 01:35:27 GMT Subject: RFR: 8323717: Introduce test keyword for tests that need external dependencies In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 10:48:23 GMT, Aleksey Shipilev wrote: > Some jtreg tests require resolvable external dependencies. This resolution is delegated to JIB, which is not used in vanilla OpenJDK testing. It would be convenient to add a keyword that marks tests that require these external dependencies, so that we could exclude those tests from runs. This would allow us to: a) run all tests in hotspot:tier4, which now excludes `applications/` specifically; b) make all tests runs (#17422) cleaner on many environments. > > I provisionally call this flag `external-dep`, but I am open for other suggestions. > > Note that some tests that pull `@Artifact`-s provide special paths that do limited testing anyway. However, there are tests which cannot run without external dependencies at all. These include at least `applications/jcstress` and `applications/scimark` tests. > > Ironically, I cannot run the jcstress test generator because the dependencies are lacking here. I regenerated those test using a self-built jcstress 0.16 bundle. > > Additional testing: > - [x] `make test TEST=applications/` fails > - [x] `JTREG_KEYWORDS=!external-dep make test TEST=applications/` passes, skipping most of the tests Seems quite reasonable. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17421#pullrequestreview-1837713758 From fyang at openjdk.org Tue Jan 23 02:31:28 2024 From: fyang at openjdk.org (Fei Yang) Date: Tue, 23 Jan 2024 02:31:28 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 15:09:38 GMT, kuaiwei wrote: > > I wonder if this was tested on other vendors' hardware? I witnessed some negative impact at least on HiSilicon TSV110 running the same JMH. So I guess it might be safer to go as a vendor-specific change. > > I tried a number of different machines and saw regressions only on Kunpeng-920 (same CPU?) and A57 which is quite niche at this point. @nick-arm : Thanks for trying it out. Yeah, TSV110 is the core micro-arch name for Kunpeng-920. @theRealAph : I don't have access to the details of TSV110 any more, I guess it's not easy to figure out what's going on :-( ------------- PR Comment: https://git.openjdk.org/jdk/pull/17511#issuecomment-1905181538 From sroy at openjdk.org Tue Jan 23 05:34:32 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 23 Jan 2024 05:34:32 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v7] In-Reply-To: <2sMyJ8mZ6EIULC67tK1IcI4uNnJMvpCzw1BKEDUaIms=.c90f1101-236e-4a80-869c-feca6abd3dc3@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <_WrW-iHHdU-IgC7Z1b6oe_Qh0dkC6P3KJAdl7J2S1Do=.712dd065-6207-4632-a82f-8e12ad023cd5@github.com> <2sMyJ8mZ6EIULC67tK1IcI4uNnJMvpCzw1BKEDUaIms=.c90f1101-236e-4a80-869c-feca6abd3dc3@github.com> Message-ID: On Thu, 21 Dec 2023 10:01:04 GMT, Thomas Stuefe wrote: >>> > > What happens if we accidentally attempt to load a "real" static library, which is also named *.a? Would dlopen() then crash? What would happen? >>> >>> > I don't think the problem is with *.a . They would load as the default behaviour of the dlopen. 
It is only when the dlopen fails for *.so , we give another chance to check for .a file with the same name. >>> >>> No, what I meant, and what must be clarified before going forward with this solution, is the following: >>> >>> * is _every_ `*.a` object on AIX loadable with `dlopen`, and will the result be the same as when loading a `*.so` object >>> * or, if we present arbitrary `*.a` files to dlopen, is there a chance for dlopen to crash or misbehave. >>> >>> Reason is that I was under the impression that *.a libraries are static libraries and cannot be loaded dynamically. This is what you now try to do. >>> If we cannot safely answer this question, I would opt for a more narrow solution by hard-wiring known alternative names. So, do the second *.a attempt only for your `ibm_16_am.a` which you know works. That could also be done in a reasonably maintainable manner. >>> >> In AIX, both static and dynamic libraries have the *.a extension. And AIX also supports *.so files. Basically, shared objects in AIX have both *.a and *.so extensions. Hence we need to implement this logic. >> If we try loading a static archive specifically, how dlopen would behave is something @JoKern65 can probably answer? >> >> >>> > > Does this really have to be handled in the OpenJDK? What does J9 on AIX do? Could this be done in a simpler way outside OpenJDK, e.g. by providing an *.so variant of the library in question? Where does this library come from? >>> >>> > I am not sure how J9 handles this. I would have to consult. >>> >>> J9 is Open Source, can't you just look? :) >> >> I did try comparing the file structures, and I do not see a similar file structure over there. >> I am unable to find the jvmTiAgent code and also the os_aix file. So I am not sure which functions over there provide the same functionality. Do you have any suggestion on how I can check and correlate? >>> >>> > However as per current observation, this issue does not show up on Semuru.
This issue is only happening on Adoptium. The team that releases these files has always released *.a files which work fine for Semuru. >>> >>> I don't know what Semuru is. What is the context, is that a different VM? Also OpenJDK? J9 derived? >> >> >> Semuru is J9 derived. > >> > > > What happens if we accidentally attempt to load a "real" static library, which is also named *.a? Would dlopen() then crash? What would happen? >> > >> > >> > > I don't think the problem is with *.a . They would load as the default behaviour of the dlopen. It is only when the dlopen fails for *.so , we give another chance to check for .a file with the same name. >> > >> > >> > No, what I meant, and what must be clarified before going forward with this solution, is the following: >> > >> > * is _every_ `*.a` object on AIX loadable with `dlopen`, and will the result be the same as when loading a `*.so` object >> > * or, if we present arbitrary `*.a` files to dlopen, is there a chance for dlopen to crash or misbehave. >> > >> > Reason is that I was under the impression that *.a libraries are static libraries and cannot be loaded dynamically. This is what you now try to do. >> > If we cannot safely answer this question, I would opt for a more narrow solution by hard-wiring known alternative names. So, do the second *.a attempt only for your `ibm_16_am.a` which you know works. That could also be done in a reasonably maintainable manner. >> >> In AIX, both static and dynamic libraries have the *.a extension. And AIX also supports *.so files. Basically, shared objects in AIX have both *.a and *.so extensions. Hence we need to implement this logic. If we try loading a static archive specifically, how dlopen would behave is something @JoKern65 can probably answer? > > Rather, this is a question you have to ask your colleagues at IBM that develop the AIX libc. > > Since AIX libc is not open source, we cannot look for ourselves, nor can Joachim (he works at SAP).
> >> >> > > > Does this really have to be handled in the OpenJDK? What does J9 on AIX do? Could this be done in a simpler way outside OpenJDK, e.g. by providing an *.so variant of the library in question? Where does this library come from? >> > >> > >> > > I am not sure how J9 handles this. I would have to consult. >> > >> > >> > J9 is Open Source, can't you just look? :) >> >> I did try comparing the file structures, and I do not see a similar file structure over there. I am unable to find the jvmTiAgent code and also the os_aix file. So I am not sure which functions over there provide the same functionality. Do you have any suggestion on how I can check and correlate? > > Someone must implement LoadLibrary. Try looking for places where dlopen() is called. > >> >> > > However as per current observation, this issue does ... Hi @tstuefe, are any further clarifications required for this change? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1905320564 From sroy at openjdk.org Tue Jan 23 05:34:33 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 23 Jan 2024 05:34:33 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v7] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Tue, 16 Jan 2024 16:12:16 GMT, Martin Doerr wrote: > > I have tried to build jextract (https://github.com/openjdk/jextract/tree/jdk22) with LLVM (https://github.com/llvm/llvm-project/releases/download/llvmorg-16.0.4/clang+llvm-16.0.4-powerpc64-ibm-aix-7.2.tar.xz). I noticed that LLVM mainly consists of .a files. So, I think we need to support that for FFI compatibility with other libraries and open source projects. > > Seems like this change is not sufficient for that. `clang` is compiled to `libclang.a` on AIX, but `libclang.so` on Linux.
I'm getting "System error: Exec format error" when trying to load `libclang.a` via `System.loadLibrary(libName);`. So the question remains: Are .a files really supposed to be dynamically loadable on AIX? If so, where is that documented? Yes, we have the case where .a files are dynamically loadable as well on AIX. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1905321150 From qamai at openjdk.org Tue Jan 23 08:18:41 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 08:18:41 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler Message-ID: Hi, This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is inspired by `std::is_constant_evaluated` in C++. Please kindly give your opinion as well as your reviews, thanks very much.
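The usage pattern described here can be sketched roughly as follows. This is only an illustration under stated assumptions: `ConstantProbe` and `Session` are hypothetical names, not the classes touched by the patch, and the probe's Java body simply returns `false`, which is what a non-intrinsified fallback would be expected to do. Only a JIT intrinsic could answer `true`, at which point the guarded branch folds away for sessions known to be global.

```java
// Hypothetical stand-in for the proposed probe; NOT a real JDK API.
final class ConstantProbe {
    private ConstantProbe() {}

    // Fallback body (assumption): with no intrinsic, nothing is known to be
    // a compile-time constant, so callers always take the general path.
    static boolean isConstantExpression(boolean value) {
        return false;
    }
}

// Hypothetical, simplified session; the real MemorySessionImpl is more involved.
final class Session {
    static final Session GLOBAL = new Session(true);

    private final boolean global;
    private volatile boolean closed;

    Session(boolean global) {
        this.global = global;
    }

    void close() {
        if (!global) {          // a global session is never closed
            closed = true;
        }
    }

    void checkValidState() {
        // Fast path: taken only if the compiler has already proven that
        // 'global' is the constant true. Otherwise this test is false and
        // costs non-global sessions no extra runtime branch once compiled.
        if (ConstantProbe.isConstantExpression(global) && global) {
            return;
        }
        if (closed) {
            throw new IllegalStateException("Already closed");
        }
    }
}
```

Both paths give the same observable result; the probe only decides whether the lifetime check can be skipped when the answer is already known at compile time.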
------------- Commit messages: - bug number - add isConstantExpression Changes: https://git.openjdk.org/jdk/pull/17527/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324433 Stats: 162 lines in 6 files changed: 158 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/17527.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17527/head:pull/17527 PR: https://git.openjdk.org/jdk/pull/17527 From shade at openjdk.org Tue Jan 23 08:18:41 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 08:18:41 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 08:10:54 GMT, Quan Anh Mai wrote: > Hi, > > This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is inspired by `std::is_constant_evaluated` in C++. > > Please kindly give your opinion as well as your reviews, thanks very much. Nice. I had a similar thing stashed in my todo queue. Note that there is already `isCompileConstant` that does similar thing: https://github.com/openjdk/jdk/blob/5a74c2a67ebcb47e51732f03c4be694bdf920469/src/hotspot/share/opto/library_call.cpp#L8189-L8193 -- maybe we should just expose that more widely. I would suggest we just do the private `java.lang.{Integer,...}.isCompileConstant` methods and bind them to that intrinsic. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1905504206 From kbarrett at openjdk.org Tue Jan 23 08:49:42 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 23 Jan 2024 08:49:42 GMT Subject: RFR: 8324286: Fix backsliding on use of nullptr instead of NULL [v2] In-Reply-To: References: Message-ID: > Please review this change that removes some new (since JDK-8299837) uses of > NULL in HotSpot code. Most of the changes are in comments, replacing "NULL" > with either "nullptr" (for code snippets) or "null" (for textual description), > as was done for JDK-8299837. There are a small number of new uses of NULL in > code, which are replaced with nullptr. > > Testing: mach5 tier1 Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge branch 'master' into cleanup-null - fix backsliding on NULL usage ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17516/files - new: https://git.openjdk.org/jdk/pull/17516/files/0ce29eed..bd53e8b6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17516&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17516&range=00-01 Stats: 488 lines in 37 files changed: 298 ins; 38 del; 152 mod Patch: https://git.openjdk.org/jdk/pull/17516.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17516/head:pull/17516 PR: https://git.openjdk.org/jdk/pull/17516 From kbarrett at openjdk.org Tue Jan 23 08:49:43 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 23 Jan 2024 08:49:43 GMT Subject: RFR: 8324286: Fix backsliding on use of nullptr instead of NULL [v2] In-Reply-To: References: Message-ID: <3dYoAsN5iYhnL1TvPya4iTXnk_UD0AfShDXV29LGVS0=.038f8fb8-a2f1-4f7c-aaaa-4c640f397dd1@github.com> On Mon, 22 Jan 2024 10:49:43 GMT, Johan Sjölen wrote: >> Kim Barrett has updated the pull
request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: >> >> - Merge branch 'master' into cleanup-null >> - fix backsliding on NULL usage > > LGTM and trivial Thanks for reviews @jdksjolen , @coleenp , and @TheShermanTanker . ------------- PR Comment: https://git.openjdk.org/jdk/pull/17516#issuecomment-1905573378 From kbarrett at openjdk.org Tue Jan 23 08:52:35 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 23 Jan 2024 08:52:35 GMT Subject: Integrated: 8324286: Fix backsliding on use of nullptr instead of NULL In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 10:39:12 GMT, Kim Barrett wrote: > Please review this change that removes some new (since JDK-8299837) uses of > NULL in HotSpot code. Most of the changes are in comments, replacing "NULL" > with either "nullptr" (for code snippets) or "null" (for textual description), > as was done for JDK-8299837. There are a small number of new uses of NULL in > code, which are replaced with nullptr. > > Testing: mach5 tier1 This pull request has now been integrated. 
Changeset: bcb340da Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/bcb340da091e3287da8d2ecfcd017ebcc6613cae Stats: 19 lines in 9 files changed: 0 ins; 0 del; 19 mod 8324286: Fix backsliding on use of nullptr instead of NULL Reviewed-by: jsjolen, coleenp, jwaters ------------- PR: https://git.openjdk.org/jdk/pull/17516 From qamai at openjdk.org Tue Jan 23 09:20:46 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 09:20:46 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v2] In-Reply-To: References: Message-ID: > Hi, > > This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is inspired by `std::is_constant_evaluated` in C++. > > Please kindly give your opinion as well as your reviews, thanks very much. 
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: use inline_isCompileConstant ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17527/files - new: https://git.openjdk.org/jdk/pull/17527/files/4d0fc3dd..9dd95393 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=00-01 Stats: 13 lines in 3 files changed: 0 ins; 9 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/17527.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17527/head:pull/17527 PR: https://git.openjdk.org/jdk/pull/17527 From qamai at openjdk.org Tue Jan 23 09:30:26 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 09:30:26 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 08:16:07 GMT, Aleksey Shipilev wrote: >> Hi, >> >> This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is inspired by `std::is_constant_evaluated` in C++. >> >> Please kindly give your opinion as well as your reviews, thanks very much. > > Nice. I had a similar thing stashed in my todo queue. Note that there is already `isCompileConstant` that does similar thing: https://github.com/openjdk/jdk/blob/5a74c2a67ebcb47e51732f03c4be694bdf920469/src/hotspot/share/opto/library_call.cpp#L8189-L8193 -- maybe we should just expose that more widely. I would suggest we just do the private `java.lang.{Integer,...}.isCompileConstant` methods and bind them to that intrinsic. 
@shipilev Thanks a lot for your suggestions. Yes I can just use `inline_isCompileConstant` instead. Regarding the place of the method, I'm not really sure as putting in `java.lang.Long` seems out-of-place for an internal mechanism that is obviously not only used in `java.lang`, which will force a new entry in `JavaLangAccess`. Finally, I think accepting a `long` would be enough (maybe `double`, too?) since `int`, `boolean` etc can be converted losslessly to `long`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1905641141 From qamai at openjdk.org Tue Jan 23 09:34:27 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 09:34:27 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 08:16:07 GMT, Aleksey Shipilev wrote: > I would suggest we just do the private `java.lang.{Integer,...}.isCompileConstant` methods and bind them to that intrinsic. Maybe I am ignorant but doesn't the definition of an intrinsics contain the signature of the method as well? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1905648723 From shade at openjdk.org Tue Jan 23 09:38:24 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 09:38:24 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 08:16:07 GMT, Aleksey Shipilev wrote: >> Hi, >> >> This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. 
This is inspired by `std::is_constant_evaluated` in C++. >> >> Please kindly give your opinion as well as your reviews, thanks very much. > > Nice. I had a similar thing stashed in my todo queue. Note that there is already `isCompileConstant` that does similar thing: https://github.com/openjdk/jdk/blob/5a74c2a67ebcb47e51732f03c4be694bdf920469/src/hotspot/share/opto/library_call.cpp#L8189-L8193 -- maybe we should just expose that more widely. I would suggest we just do the private `java.lang.{Integer,...}.isCompileConstant` methods and bind them to that intrinsic. > @shipilev Thanks a lot for your suggestions. Yes I can just use `inline_isCompileConstant` instead. > > Regarding the place of the method, I'm not really sure as putting in `java.lang.Long` seems out-of-place for an internal mechanism that is obviously not only used in `java.lang`, which will force a new entry in `JavaLangAccess`. Ah yes, if you need to use it across module boundaries, putting the private/protected method would require `JavaLangAccess`, which is burdensome. I am just icky about introducing a whole new internal class for this. Is there anything in current `jdk.internal.vm.*` that fits it? Maybe `misc.Unsafe` or `misc.VM`? > Finally, I think accepting a `long` would be enough (maybe `double`, too?) since `int`, `boolean` etc can be converted losslessly to `long`. Right, that would work for primitives, since we could probably rely on conversion for constants to be folded. But I also see the value for asking `isCompileConstant(Object)`, which is not easily convertible. So I would just do the overloads for all primitives and `Object`. The C2 intrinsic would not care about the `arg(0)` type, it would reply `isCon` on those constants just the same. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1905655407 From shade at openjdk.org Tue Jan 23 09:45:27 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 09:45:27 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler In-Reply-To: References: Message-ID: <9xpFtQX-wVtXnENNNiQZZFJqa7cy-n7_yS6uU3UjsQ8=.55c6a9b6-0e89-45f7-bfb9-11b13f9ef605@github.com> On Tue, 23 Jan 2024 09:31:51 GMT, Quan Anh Mai wrote: > Maybe I am ignorant but doesn't the definition of an intrinsics contain the signature of the method as well? The definitions in `vmIntrinsics`, sure, they require full signature for `@IntrinsicCandidate` methods. It would yield some unfortunate duplication. But after that, we can map on the same `inline_isCompileConstant` intrinsic that just asks `arg(0)->is_Con()`, and it would not care about the type of the constant. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1905667006 From aph at openjdk.org Tue Jan 23 09:50:28 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 23 Jan 2024 09:50:28 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 02:28:48 GMT, Fei Yang wrote: > > I wonder if this was tested on other vendors' hardware? I witnessed some negative impact at least on HiSilicon TSV110 running the same JMH. So I guess it might be safer to go as a vendor-specific change. > > I tried a number of different machines and saw regressions only on Kunpeng-920 (same CPU?) and A57 which is quite niche at this point. Right, so it's probably a low-end, mostly-in-order thing. That makes sense because we're trading a weaker barrier for more instructions, and perhaps some cores implement barriers in a crude one-size-fits-all way. 
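The "weaker barrier for more instructions" trade-off can be pictured at the Java level with the `VarHandle` fence methods. This is a sketch only: `Mailbox` is a hypothetical class, and nothing here claims which AArch64 instructions a given JIT emits for these calls. The point is just that a release ordering (earlier loads and stores before later stores) can be requested either as one fence or as the combination of a store-store fence and an acquire-style fence, the latter pair being at least as strong.

```java
import java.lang.invoke.VarHandle;

// Hypothetical single-slot mailbox used only to show the two fence shapes.
final class Mailbox {
    private int payload;
    private int ready;

    // One fence: orders earlier loads and stores before later stores.
    void publishWithReleaseFence(int value) {
        payload = value;
        VarHandle.releaseFence();
        ready = 1;
    }

    // Two fences: storeStoreFence orders earlier stores before later stores,
    // acquireFence orders earlier loads before later loads and stores.
    // Together they cover everything releaseFence promises (plus load-load).
    void publishWithSplitFences(int value) {
        payload = value;
        VarHandle.storeStoreFence();
        VarHandle.acquireFence();
        ready = 1;
    }

    // Reader side: returns the payload if published, otherwise -1.
    int tryRead() {
        int r = ready;
        VarHandle.acquireFence();
        return r == 1 ? payload : -1;
    }
}
```

Whether the split form is faster or slower is exactly the hardware-dependent question raised in this thread, so it would need measuring on each core rather than assuming.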
------------- PR Comment: https://git.openjdk.org/jdk/pull/17511#issuecomment-1905674792 From shade at openjdk.org Tue Jan 23 10:15:28 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 10:15:28 GMT Subject: RFR: 8323717: Introduce test keyword for tests that need external dependencies In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 01:32:45 GMT, David Holmes wrote: > Seems quite reasonable. Thanks! I shall wait for more reviewers, in case someone has an issue with `external-dep` as the flag name. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17421#issuecomment-1905719123 From ayang at openjdk.org Tue Jan 23 10:26:32 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 23 Jan 2024 10:26:32 GMT Subject: RFR: 8324512: Serial: Remove Generation::Name Message-ID: <0TYkcTugwZN9ibPR85aJZyR47EWlIIBu6I1Z_LIjDcE=.8da28aec-687f-4b9b-97b4-a82b2f9311bd@github.com> Trivial removing redundant code. ------------- Commit messages: - s1-enum Changes: https://git.openjdk.org/jdk/pull/17530/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17530&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324512 Stats: 66 lines in 9 files changed: 0 ins; 64 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17530.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17530/head:pull/17530 PR: https://git.openjdk.org/jdk/pull/17530 From mli at openjdk.org Tue Jan 23 10:52:30 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 23 Jan 2024 10:52:30 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v4] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 09:14:05 GMT, Hamlin Li wrote: >> Hi, >> Can you review this patch to implement SHA-1 intrinsic for riscv? >> Thanks! 
>> >> >> ## Test >> >> ### Functionality >> >> tests under `test/hotspot/jtreg/compiler/intrinsics/sha` >> tests found via `find test/jdk -iname "*SHA1*.java"` >> >> ### Performance >> >> tested on `T-HEAD Light Lichee Pi 4A` >> >> JMH_PARAMS="-f 1 -wi 10 -i 20" // for every loop of jmh test >> >> benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`, more specifically `TESTS="MessageDigests.digest MessageDigests.getAndDigest MessageDigestBench.digest"` >> >> >> // After >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 20 1845.446 ± 27.052 ns/op >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 20 181455.350 ± 532.258 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 20 2447.674 ± 10.239 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 20 182896.083 ± 1242.774 ns/op >> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 20 11599227.792 ± 121442.390 ns/op >> // Before >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 20 2352.475 ± 11.198 ns/op >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 20 188495.684 ± 1467.942 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 20 2437.347 ± 6.398 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 20 196086.570 ± 1140.998 ns/op >> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 20 12362160.119 ± 38788.109 ns/op >> >> >> **getAndDigest when size == 64** >> The data is not stable for test getAndDigest when size == 64, which I think is introduced by j.s.MessageDigest.getInstance itself, which we don't touch in this patch. >> Check more details at [1](ht...
> > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: > > - remove tp/gp > - refine code ![image](https://github.com/openjdk/jdk/assets/10797965/b48b5ff7-8049-4857-afa2-74a7b3120d23) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17130#issuecomment-1905780458 From kbarrett at openjdk.org Tue Jan 23 10:57:36 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 23 Jan 2024 10:57:36 GMT Subject: RFR: 8324492: Remove Atomic support for OopHandle Message-ID: Please review this change to the lazy initialization of the MemoryManager object and the associated MemoryPool objects. They previously used an atomic access to the respective OopHandle member holding the associated Java object as the is-initialized sentinel, testing whether the handle was empty or had an associated OopStorage entry. When empty, initialization was performed using a lock to prevent races. Now they use a separate atomic is-initialized flag as the sentinel. As a result, the support for atomic access to an OopHandle's underlying handle (via a translator) is no longer needed and is removed. While there, I moved the allocation of the associated OopStorage entries out from under the Management_lock. Testing: mach5 tier1 A couple of notes for reviewers. Once initialized with a Java object recorded in the associated OopHandle, the OopHandle and the value recorded therein is never changed. The old is-initialized check makes use of OopHandle::resolve returning null if either the handle is empty (has no OopStorage entry yet) or the OopStorage entry contains null. The latter never happens in this case.
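The general shape of "separate is-initialized flag plus lock" can be sketched in Java as below. This is an analogy under stated assumptions, not the HotSpot C++ code: `LazyHolder` and its members are hypothetical names, a `volatile boolean` stands in for the atomic flag, and a plain field stands in for the value held by the OopHandle.

```java
import java.util.function.Supplier;

// Hypothetical holder illustrating lazy initialization guarded by a
// separate flag, rather than by inspecting the stored value itself.
final class LazyHolder<T> {
    private final Object lock = new Object();
    private final Supplier<T> factory;

    // The sentinel: written last, with release semantics (volatile write).
    private volatile boolean initialized;

    // Written once under the lock, before the flag is set; never changed after.
    private T value;

    LazyHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        if (!initialized) {                 // fast path: volatile (acquire) read
            synchronized (lock) {           // slow path: the lock prevents races
                if (!initialized) {
                    value = factory.get();
                    initialized = true;     // publish only after value is stored
                }
            }
        }
        return value;
    }
}
```

Because the flag is the only thing read without the lock, the stored value needs no atomic access of its own, which mirrors why the atomic OopHandle support becomes unnecessary.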
------------- Commit messages: - remove unused OopHandle translator - MemoryPool doesn't use atomics on OopHandle - MemoryManager doesn't use atomics on OopHandle Changes: https://git.openjdk.org/jdk/pull/17533/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17533&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324492 Stats: 101 lines in 5 files changed: 24 ins; 14 del; 63 mod Patch: https://git.openjdk.org/jdk/pull/17533.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17533/head:pull/17533 PR: https://git.openjdk.org/jdk/pull/17533 From qamai at openjdk.org Tue Jan 23 11:18:43 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 11:18:43 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v3] In-Reply-To: References: Message-ID: > Hi, > > This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is inspired by `std::is_constant_evaluated` in C++. > > Please kindly give your opinion as well as your reviews, thanks very much. 
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: add more overloads ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17527/files - new: https://git.openjdk.org/jdk/pull/17527/files/9dd95393..31403d6f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=01-02 Stats: 346 lines in 6 files changed: 333 ins; 1 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/17527.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17527/head:pull/17527 PR: https://git.openjdk.org/jdk/pull/17527 From kbarrett at openjdk.org Tue Jan 23 11:24:26 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 23 Jan 2024 11:24:26 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() In-Reply-To: <19qGKlnx8PjSXe_r4KO_Pr1-Ya1WuFvZghjjAUJFimo=.38862996-b5e5-4879-8703-fe44ce9f9cdf@github.com> References: <19qGKlnx8PjSXe_r4KO_Pr1-Ya1WuFvZghjjAUJFimo=.38862996-b5e5-4879-8703-fe44ce9f9cdf@github.com> Message-ID: On Mon, 22 Jan 2024 10:45:35 GMT, Aleksey Shipilev wrote: > Looks reasonable. > > I guess the use in `ClassLoaderData::remove_handle` is fine, because we want to assert it? I forgot about this one. I initially skipped it because I didn't understand what was happening here. I'd forgotten that the CLD uses OopHandle to wrap a pointer to the CLD handle area rather than an OopStorage entry. I'll push a fix after I've run tests. > Related, pre-existing: the use in `ClassLoaderData::print_on` is also odd. This reports the address of oophandle slot, not the classloader oop itself? Should probably be `.peek()`? > > ``` > out->print_cr(" - class loader " INTPTR_FORMAT, p2i(_class_loader.ptr_raw())); > ``` I think you are probably correct, but felt that was a little out of scope for what I was doing in this PR. 
It looks like it was changed from printing the (void* cast) loader oop to printing the (newly introduced) handle address by 8201556: "Disallow reading oops in ClassLoaderData if unloading". I don't know whether that was intentional, but guessing not. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17510#issuecomment-1905834809 From qamai at openjdk.org Tue Jan 23 11:26:26 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 11:26:26 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v3] In-Reply-To: References: Message-ID: <3pUibPS3iTDkTve0coFDSCFt61RLryW5Hc7Jve6Cfk8=.9bb8d2f8-59fd-4958-8d04-b8b13f17b7b3@github.com> On Tue, 23 Jan 2024 11:18:43 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is inspired by `std::is_constant_evaluated` in C++. >> >> Please kindly give your opinion as well as your reviews, thanks very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > add more overloads I get your idea. I have added overloads for all types. They will all invoke `inline_isCompileConstant`. Given that there are now 7 methods I think a separate class is more justified. Another issue is the duplication of `isConstantExpression(Object)`, but I think a separate issue to deduplicate it would be easier. Thanks a lot.
------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1905836929 From kbarrett at openjdk.org Tue Jan 23 11:32:26 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 23 Jan 2024 11:32:26 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() In-Reply-To: References: <19qGKlnx8PjSXe_r4KO_Pr1-Ya1WuFvZghjjAUJFimo=.38862996-b5e5-4879-8703-fe44ce9f9cdf@github.com> Message-ID: <9orPGubHcUhcckiujZYjiPvJkXTmafuIo_anb9pLKwQ=.2df0c8ff-4b76-4c29-a9ff-0e544a5c3d7f@github.com> On Tue, 23 Jan 2024 11:22:12 GMT, Kim Barrett wrote: > > Related, pre-existing: the use in `ClassLoaderData::print_on` is also odd. This reports the address of oophandle slot, not the classloader oop itself? Should probably be `.peek()`? > > ``` > > out->print_cr(" - class loader " INTPTR_FORMAT, p2i(_class_loader.ptr_raw())); > > ``` > > I think you are probably correct, but felt that was a little out of scope for what I was doing in this PR. > > It looks like it was changed from printing the (void* cast) loader oop to printing the (newly introduced) handle address by 8201556: "Disallow reading oops in ClassLoaderData if unloading". I don't know whether that was intentional, but guessing not. 
https://bugs.openjdk.org/browse/JDK-8324514 ClassLoaderData::print_on should print address of class loader ------------- PR Comment: https://git.openjdk.org/jdk/pull/17510#issuecomment-1905845825 From shade at openjdk.org Tue Jan 23 11:45:28 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 11:45:28 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() In-Reply-To: <9orPGubHcUhcckiujZYjiPvJkXTmafuIo_anb9pLKwQ=.2df0c8ff-4b76-4c29-a9ff-0e544a5c3d7f@github.com> References: <19qGKlnx8PjSXe_r4KO_Pr1-Ya1WuFvZghjjAUJFimo=.38862996-b5e5-4879-8703-fe44ce9f9cdf@github.com> <9orPGubHcUhcckiujZYjiPvJkXTmafuIo_anb9pLKwQ=.2df0c8ff-4b76-4c29-a9ff-0e544a5c3d7f@github.com> Message-ID: On Tue, 23 Jan 2024 11:29:29 GMT, Kim Barrett wrote: > https://bugs.openjdk.org/browse/JDK-8324514 ClassLoaderData::print_on should print address of class loader Thanks. I'll handle that one, so you don't need to go too far into this rabbit hole :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17510#issuecomment-1905864593 From jlahoda at openjdk.org Tue Jan 23 11:45:28 2024 From: jlahoda at openjdk.org (Jan Lahoda) Date: Tue, 23 Jan 2024 11:45:28 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v3] In-Reply-To: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> References: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> Message-ID: On Tue, 16 Jan 2024 09:01:35 GMT, Aleksey Shipilev wrote: >> Since recent work to improve `tier4` performance, we actually test `tier{1,2,3,4}` often, which includes all the tests in current tree. It would be more convenient to just have the `all` test group/alias, so that we can do `make test TEST=all`. This also gives a parallelism / run time benefit, as we do not wait for tests in each tier to complete before moving to next tier. >> >> Sample run on out-of-the-box Linux x86_64 fastdebug is below. 
For some environments one also needs to supply a few keywords like `!printer` to skip tests that cannot complete without failure due to misconfiguration. I left the keywords as is to show how a failing run would look. There is also an existing shortcut in the build system that allows running this with `make test-all`.
>>
>> % make test TEST=all
>>
>> Test selection 'all', will run:
>> * jtreg:test/hotspot/jtreg:all
>> * jtreg:test/jdk:all
>> * jtreg:test/langtools:all
>> * jtreg:test/jaxp:all
>> * jtreg:test/lib-test:all
>>
>> (...about 6 hours later...)
>>
>> ==============================
>> Test summary
>> ==============================
>>    TEST                              TOTAL  PASS  FAIL ERROR
>> >> jtreg:test/hotspot/jtreg:all       6731  6702    29     0 <<
>> >> jtreg:test/jdk:all                 9962  9951    11     0 <<
>>    jtreg:test/langtools:all           4469  4469     0     0
>>    jtreg:test/jaxp:all                 513   513     0     0
>>    jtreg:test/lib-test:all              32    32     0     0
>> ==============================
>> TEST FAILURE
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>
>   Catch-all -> All tests

I have nothing against it, but I don't think I know enough about details of jtreg configuration to provide approval. OTOH, I personally don't see a strong need for it.
------------- PR Comment: https://git.openjdk.org/jdk/pull/17422#issuecomment-1905864727 From aph at openjdk.org Tue Jan 23 12:25:35 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 23 Jan 2024 12:25:35 GMT Subject: RFR: 8324050: Issue store-store barrier after re-materializing objects during deoptimization In-Reply-To: <2ZFeRJC65HueklUUcFw4Oo9Amagbsz9AAFiOsFs2eGk=.88cc0d04-b978-40ca-a9e7-c3f45f7b191e@github.com> References: <74qcszyXD2a9NItAm5BWRj052rthJt2hRf5Wiu0qjvU=.3f8e5aa9-07b5-4b8b-96db-a37d2ec55347@github.com> <2ZFeRJC65HueklUUcFw4Oo9Amagbsz9AAFiOsFs2eGk=.88cc0d04-b978-40ca-a9e7-c3f45f7b191e@github.com> Message-ID: <2smMZTgweJmNpgREGK3k-mX-nERxfwA2Fl06PM-pakw=.7a90ecf6-2e89-4913-88eb-5b68a5790a51@github.com> On Mon, 22 Jan 2024 21:06:17 GMT, Vladimir Kozlov wrote: > > This seems similar to [a recent discussion](https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071521.html). There, it is decided that a release barrier would be safer. Should we do it similarly here? Thanks. > > I think, if decided, it should be done in separate RFE uniformly in all places: Interpreter, C1, C2 and here in deoptimization code. That would be the wrong thing to do, because the risk that leads us (possibly? - discuss) to need release can only happen in an optimizing compiler. AFAIK... ------------- PR Comment: https://git.openjdk.org/jdk/pull/17503#issuecomment-1905949873 From stefank at openjdk.org Tue Jan 23 13:19:28 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 23 Jan 2024 13:19:28 GMT Subject: RFR: 8324512: Serial: Remove Generation::Name In-Reply-To: <0TYkcTugwZN9ibPR85aJZyR47EWlIIBu6I1Z_LIjDcE=.8da28aec-687f-4b9b-97b4-a82b2f9311bd@github.com> References: <0TYkcTugwZN9ibPR85aJZyR47EWlIIBu6I1Z_LIjDcE=.8da28aec-687f-4b9b-97b4-a82b2f9311bd@github.com> Message-ID: On Tue, 23 Jan 2024 10:20:46 GMT, Albert Mingkun Yang wrote: > Trivial removing redundant code. Marked as reviewed by stefank (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/17530#pullrequestreview-1838769192 From coleenp at openjdk.org Tue Jan 23 13:56:30 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 23 Jan 2024 13:56:30 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v9] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> <8XWKuuqgQT9KsbJjVKSANKayHN1Ns7FEYcQisG1taF0=.4a574efd-0cd2-497e-938f-8ea621515bf6@github.com> Message-ID: On Mon, 22 Jan 2024 15:56:45 GMT, Coleen Phillimore wrote: >> I ran through tier1-tier7 with the even stronger invariant that the owner is the thread. >> The current comment (in the code) is outdated. >> >> I plan to push a fixed version, just wanted to figure out what makes the most sense. >> Given that the monitor check is elided now, the fixing the owner field should be removed. >> Alternatively let the slow path check for monitor after the CAS failed, and jump back to the >> inflated case. In this case the comment and the fixing of the owner field would be important. >> >> It is not immediately obvious to me that the alternative is worth it. Because we removed hashcode causing inflation the only point where this code would avoid going into the runtime by checking for inflated after a CAS fail is if another thread is in entering on the object monitor in parallel and we can do a direct handoff. Otherwise it would be on the entry / cxq queue and going into the runtime is required. >> The pragmatic solution would be to remove the store and add the assert for this PR. And create an RFE to evaluate this. >> >> Ran through with this code. 
>> ```c++
>> #ifdef ASSERT
>>   Label correct_owner;
>>   __ movptr(_t, Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)));
>>   __ cmpptr(_t, _thread);
>>   __ jccb(Assembler::equal, correct_owner);
>>   __ testptr(_t, Address(_t)); // Crash on bad address
>>   __ stop("Bad owner");
>>   __ bind (correct_owner);
>> #endif
>> ```
>>
>> I will rebase and push the changes tomorrow.
>
> I like how with your patch all the knowledge about the anonymous owner is in synchronizer.cpp, which is why this comment and code would be nicer to be absent.

Ok now I see. We've popped off the only entry of the lock stack (not recursive), but if some other thread has inflated the monitor due to contention, we'll fail this CAS.

    // Try to unlock. Transition lock bits 0b00 => 0b01
    movptr(reg_rax, mark);
    andptr(reg_rax, ~(int32_t)markWord::lock_mask);
    orptr(mark, markWord::unlocked_value);
    lock(); cmpxchgptr(mark, Address(obj, oopDesc::mark_offset_in_bytes()));
    jcc(Assembler::notEqual, push_and_slow_path);
    jmp(unlocked);

The alternative you're suggesting is to instead of pushing back on the lock-stack, go to check_successor (which will be non-null due to the contention) and the handoff will be faster that way. Then we have to restore the owner thread. Is the lock definitely never inflated due to hash code in this case now?

Let's explain this in another RFE and we can see if benchmark results support this optimization. Pushing back on the lock stack and going slow path is simpler (less assembly code) and matches the interpreter/c1 case. But let's see after this is checked in. Your assert with some adjustment to the comment looks good. I was trying to remove the concept of anonymous owner from the assembly code because of another change I was working on. Thanks.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1463316791

From aboldtch at openjdk.org Tue Jan 23 14:00:44 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Jan 2024 14:00:44 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v12] In-Reply-To: References: Message-ID:

> Implements the runtime part of JDK-8319796.
> The different CPU implementations are/will be created as dependent pull requests.
>
> This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enter. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two top most entries on the lock stack to determine what to do in a monitor exit.
>
> A high level overview:
> * Locking is still performed on the mark word
>   * Unlocked (0b01) <=> Locked (0b00)
> * Monitor enter on Obj with mark word Unlocked (0b01) is the same
>   * Transition Obj's mark word Unlocked (0b01) => Locked (0b00)
>   * Push Obj onto the lock stack
>   * Success
> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack
>   * If top entry is Obj
>     * Push Obj on the lock stack
>     * Success
>   * If top entry is not Obj
>     * Inflate and call ObjectMonitor::enter
> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack
>   * If just the top entry is Obj
>     * Transition Obj's mark word Locked (0b00) => Unlocked (0b01)
>     * Pop the entry
>     * Success
>   * If both entries are Obj
>     * Pop the top entry
>     * Success
>   * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit
> * If the monitor has been inflated for object Obj which is owned by the current thread
>   * All corresponding entries for Obj are removed from the lock stack
>   * The monitor recursions is set to the number of removed entries - 1
>   * The owner is changed from anonymous to the thread
>   * The regular ObjectMonitor::action is called.

Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 44 commits:

 - Added comment about the rational behind full lock stack inflation. May need rewording
 - Add logging when lock stack capacity is exceeded.
 - Remove inaccurate comment
 - Correct nomenclature balanced vs structured.
 - Avoid else after return
 - Improve is_recursive, optimize the common balanced runtime is_recursive check
 - Add is_recursive documentation
 - Add try_recursive_exit documentation
 - Add try_recursive_enter documentation
 - Add try_recursive_enter precondition documentation
 - ... and 34 more: https://git.openjdk.org/jdk/compare/bcaad515...1f229ebe

------------- Changes: https://git.openjdk.org/jdk/pull/16606/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16606&range=11 Stats: 846 lines in 13 files changed: 803 ins; 7 del; 36 mod Patch: https://git.openjdk.org/jdk/pull/16606.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16606/head:pull/16606 PR: https://git.openjdk.org/jdk/pull/16606

From aboldtch at openjdk.org Tue Jan 23 14:41:26 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Jan 2024 14:41:26 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v9] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> <8XWKuuqgQT9KsbJjVKSANKayHN1Ns7FEYcQisG1taF0=.4a574efd-0cd2-497e-938f-8ea621515bf6@github.com> Message-ID: On Tue, 23 Jan 2024 13:53:26 GMT, Coleen Phillimore wrote:

> Is the lock definitely never inflated due to hash code in this case now?

It never inflates a locked object. The hash code race that may re-inflate a monitor is with deflation.
Where it tries to install a hash code on an inflated monitor, but the monitor gets deflated between the two reads of the markWord, the thread that tries to install the hash code will create a new ObjectMonitor. [JDK-8323724](https://bugs.openjdk.org/browse/JDK-8323724) RFE is created for that issue. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1463380891 From rriggs at openjdk.org Tue Jan 23 14:45:25 2024 From: rriggs at openjdk.org (Roger Riggs) Date: Tue, 23 Jan 2024 14:45:25 GMT Subject: RFR: 8323717: Introduce test keyword for tests that need external dependencies In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 10:48:23 GMT, Aleksey Shipilev wrote: > Some jtreg tests require resolvable external dependencies. This resolution is delegated to JIB, which is not used in vanilla OpenJDK testing. It would be convenient to add a keyword that marks tests that require these external dependencies, so that we could exclude those tests from runs. This would allow us to: a) run all tests in hotspot:tier4, which now excludes `applications/` specifically; b) make all tests runs (#17422) cleaner on many environments. > > I provisionally call this flag `external-dep`, but I am open for other suggestions. > > Note that some tests that pull `@Artifact`-s provide special paths that do limited testing anyway. However, there are tests which cannot run without external dependencies at all. These include at least `applications/jcstress` and `applications/scimark` tests. > > Ironically, I cannot run the jcstress test generator because the dependencies are lacking here. I regenerated those test using a self-built jcstress 0.16 bundle. 
> > Additional testing: > - [x] `make test TEST=applications/` fails > - [x] `JTREG_KEYWORDS=!external-dep make test TEST=applications/` passes, skipping most of the tests Is there any place to document the new keyword or its usage; it does not seem very discoverable just existing in the TEST.ROOT and some tests. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17421#issuecomment-1906198126 From aboldtch at openjdk.org Tue Jan 23 14:49:57 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Jan 2024 14:49:57 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v10] In-Reply-To: References: Message-ID: > Implements the x86 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The x86 C2 port also has some extra oddities. 
> > The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. > > The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. > > The contended unlock was also moved to the code stub. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 16 additional commits since the last revision: - Remove outdated anonymous owner fix in stub - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Remove C2HandleAnonOMOwnerStub definitions on x86. - Add MFENCE comment - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - top load adjustments - ... 
and 6 more: https://git.openjdk.org/jdk/compare/0915778d...518a2434 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16607/files - new: https://git.openjdk.org/jdk/pull/16607/files/2c709241..518a2434 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=08-09 Stats: 12649 lines in 402 files changed: 5716 ins; 5379 del; 1554 mod Patch: https://git.openjdk.org/jdk/pull/16607.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16607/head:pull/16607 PR: https://git.openjdk.org/jdk/pull/16607 From aboldtch at openjdk.org Tue Jan 23 14:49:58 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Jan 2024 14:49:58 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v10] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> <8XWKuuqgQT9KsbJjVKSANKayHN1Ns7FEYcQisG1taF0=.4a574efd-0cd2-497e-938f-8ea621515bf6@github.com> Message-ID: <68Ii_p0jXNAzQVU0N_ex9AReJtAanM66t6OcuebRRXo=.35b346c5-ea36-4247-804b-6b606a6fbab5@github.com> On Tue, 23 Jan 2024 14:39:05 GMT, Axel Boldt-Christmas wrote: >> Ok now I see. We've popped off the only entry of the lock stack (not recursive), but if some other thread has inflated the monitor due to contention, we'll fail this CAS. >> >> >> // Try to unlock. Transition lock bits 0b00 => 0b01 >> movptr(reg_rax, mark); >> andptr(reg_rax, ~(int32_t)markWord::lock_mask); >> orptr(mark, markWord::unlocked_value); >> lock(); cmpxchgptr(mark, Address(obj, oopDesc::mark_offset_in_bytes())); >> jcc(Assembler::notEqual, push_and_slow_path); >> jmp(unlocked); >> >> >> The alternative you're suggesting is to instead of pushing back on the lock-stack, go to check_successor (which will be non-null due to the contention) and the handoff will be faster that way. Then we have to restore the owner thread. 
Is the lock definitely never inflated due to hash code in this case now? >> >> Let's explain this in another RFE and we can see if benchmark results support this optimization. Pushing back on the lock stack and going slow path is simpler (less assembly code) and matches the interpreter/c1 case. But let's see after this is checked in. Your assert with some adjustment to the comment looks good. I was trying to remove the concept of anonymous owner from the assembly code because of another change I was working on. Thanks. > >> Is the lock definitely never inflated due to hash code in this case now? > > It never inflates a locked object. The hash code race that may re-inflate a monitor is with deflation. Where it tries to install a hash code on an inflated monitor, but the monitor gets deflated between the two reads of the markWord, the thread that tries to install the hash code will create a new ObjectMonitor. [JDK-8323724](https://bugs.openjdk.org/browse/JDK-8323724) RFE is created for that issue. Hmm now that I think about it can still happen that it inflates a fast locked lock. If a third thread enters at the same time. 
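
As a reading aid for the quoted unlock sequence (lock bits 0b00 => 0b01 via a compare-and-swap), here is a minimal Java model. It is a sketch under stated assumptions: the mark word is modelled as an `AtomicLong`, and the constants `LOCK_MASK`, `UNLOCKED_VALUE` and `MONITOR_VALUE` are assumed encodings chosen to match the bit patterns mentioned in this thread; `MarkWordSketch` is a hypothetical name and not HotSpot code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of the fast-unlock CAS discussed above; not HotSpot code.
final class MarkWordSketch {
    static final long LOCK_MASK      = 0b11; // assumed: low two bits carry the lock state
    static final long UNLOCKED_VALUE = 0b01; // assumed: unlocked encoding
    static final long MONITOR_VALUE  = 0b10; // assumed: inflated-monitor encoding

    private MarkWordSketch() {}

    /**
     * Tries the transition "lock bits 0b00 => 0b01".
     * Returns false when the word no longer looks fast-locked (for example
     * because another thread inflated the monitor), in which case a real
     * implementation would push the object back and take the slow path.
     */
    static boolean tryFastUnlock(AtomicLong markWord) {
        long mark = markWord.get();
        long expected = mark & ~LOCK_MASK;          // what a fast-locked word must look like
        long unlocked = expected | UNLOCKED_VALUE;  // same upper bits, lock bits 0b01
        return markWord.compareAndSet(expected, unlocked);
    }
}
```

If the word was inflated between the load and the CAS, `expected` no longer matches memory and the CAS fails, which corresponds to the `jcc(Assembler::notEqual, push_and_slow_path)` branch in the quoted snippet.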
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1463385676 From aboldtch at openjdk.org Tue Jan 23 14:49:59 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Jan 2024 14:49:59 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v10] In-Reply-To: <68Ii_p0jXNAzQVU0N_ex9AReJtAanM66t6OcuebRRXo=.35b346c5-ea36-4247-804b-6b606a6fbab5@github.com> References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> <8XWKuuqgQT9KsbJjVKSANKayHN1Ns7FEYcQisG1taF0=.4a574efd-0cd2-497e-938f-8ea621515bf6@github.com> <68Ii_p0jXNAzQVU0N_ex9AReJtAanM66t6OcuebRRXo=.35b346c5-ea36-4247-804b-6b606a6fbab5@github.com> Message-ID: On Tue, 23 Jan 2024 14:42:14 GMT, Axel Boldt-Christmas wrote: >>> Is the lock definitely never inflated due to hash code in this case now? >> >> It never inflates a locked object. The hash code race that may re-inflate a monitor is with deflation. Where it tries to install a hash code on an inflated monitor, but the monitor gets deflated between the two reads of the markWord, the thread that tries to install the hash code will create a new ObjectMonitor. [JDK-8323724](https://bugs.openjdk.org/browse/JDK-8323724) RFE is created for that issue. > > Hmm now that I think about it can still happen that it inflates a fast locked lock. If a third thread enters at the same time. But regardless all paths which commit to a pop also fixes the owner. So we do not need the anonymous owner fix in fast_unlock. 
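
The enter/exit rules from the runtime PR overview earlier in this thread (check the lock stack first, fall back to the mark word, otherwise go to the runtime) can be modelled roughly as below. This is a single-threaded sketch with loud assumptions: `LockStackSketch` is a hypothetical class, the per-object mark word is replaced by an identity set of fast-locked objects, and inflation, lock stack capacity and the owner fix-up are not modelled at all; `SLOW_PATH` merely marks where the real code would inflate and call into `ObjectMonitor`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical single-threaded model of recursive lightweight locking; not HotSpot code.
final class LockStackSketch {
    enum Result { FAST, RECURSIVE, SLOW_PATH }

    // The thread-local lock stack: most recently locked object is last.
    private final ArrayList<Object> stack = new ArrayList<>();
    // Stand-in for "mark word is Locked (0b00)" on each object.
    private final Set<Object> fastLocked =
            Collections.newSetFromMap(new IdentityHashMap<>());

    Result enter(Object obj) {
        int n = stack.size();
        // Lock stack first: a consecutive recursive enter is just a push.
        if (n > 0 && stack.get(n - 1) == obj) {
            stack.add(obj);
            return Result.RECURSIVE;
        }
        // Then the mark word: Unlocked => Locked, push.
        if (fastLocked.add(obj)) {
            stack.add(obj);
            return Result.FAST;
        }
        // Locked, but not by the top entry: the real code inflates here.
        return Result.SLOW_PATH;
    }

    Result exit(Object obj) {
        int n = stack.size();
        // Both top entries are obj: recursive exit, just pop.
        if (n > 1 && stack.get(n - 1) == obj && stack.get(n - 2) == obj) {
            stack.remove(n - 1);
            return Result.RECURSIVE;
        }
        // Only the top entry is obj: Locked => Unlocked, pop.
        if (n > 0 && stack.get(n - 1) == obj) {
            fastLocked.remove(obj);
            stack.remove(n - 1);
            return Result.FAST;
        }
        // Anything else is unstructured locking: the real code inflates here.
        return Result.SLOW_PATH;
    }
}
```

Looking only at the top one or two entries is what the overview means by limiting the scheme to consecutive recursive enters: a non-consecutive re-enter (obj locked, but something else on top) deliberately ends up on the slow path.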
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1463386947 From aboldtch at openjdk.org Tue Jan 23 14:50:00 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Jan 2024 14:50:00 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v10] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> <8XWKuuqgQT9KsbJjVKSANKayHN1Ns7FEYcQisG1taF0=.4a574efd-0cd2-497e-938f-8ea621515bf6@github.com> <68Ii_p0jXNAzQVU0N_ex9AReJtAanM66t6OcuebRRXo=.35b346c5-ea36-4247-804b-6b606a6fbab5@github.com> Message-ID: <7v69_8hIhpDInYau13kmnx9X5z2rf0y-lMNdGXP4llc=.5c94a8f6-ac6e-4351-9347-bbf3f50a23b4@github.com> On Tue, 23 Jan 2024 14:43:04 GMT, Axel Boldt-Christmas wrote: >> Hmm now that I think about it can still happen that it inflates a fast locked lock. If a third thread enters at the same time. > > But regardless all paths which commit to a pop also fixes the owner. So we do not need the anonymous owner fix in fast_unlock. I removed the comment and the anonymous owner fix. Did not add an ASSERT check as the runtime asserts the same thing. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1463392032 From tschatzl at openjdk.org Tue Jan 23 14:50:30 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 23 Jan 2024 14:50:30 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 06:25:20 GMT, Erik Österlund wrote: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second.
This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests. Fwiw, the change makes class unloading regress significantly in a class unloading stress test (unloading 60k classes), seemingly tripling the time it takes for the "Purge Unlinked NMethods" phase (~20ms -> ~60ms). This may not be a problem for the concurrent gcs, but can be for the STW ones. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1906208799 From aboldtch at openjdk.org Tue Jan 23 14:55:42 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Jan 2024 14:55:42 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v11] In-Reply-To: References: Message-ID: > Implements the x86 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. 
> > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The x86 C2 port also has some extra oddities. > > The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. > > The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. > > The contended unlock was also moved to the code stub. 
Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: Add more expressive stub continuation names ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16607/files - new: https://git.openjdk.org/jdk/pull/16607/files/518a2434..ed71d016 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=09-10 Stats: 8 lines in 3 files changed: 1 ins; 1 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/16607.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16607/head:pull/16607 PR: https://git.openjdk.org/jdk/pull/16607 From iwalulya at openjdk.org Tue Jan 23 15:06:26 2024 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 23 Jan 2024 15:06:26 GMT Subject: RFR: 8324512: Serial: Remove Generation::Name In-Reply-To: <0TYkcTugwZN9ibPR85aJZyR47EWlIIBu6I1Z_LIjDcE=.8da28aec-687f-4b9b-97b4-a82b2f9311bd@github.com> References: <0TYkcTugwZN9ibPR85aJZyR47EWlIIBu6I1Z_LIjDcE=.8da28aec-687f-4b9b-97b4-a82b2f9311bd@github.com> Message-ID: On Tue, 23 Jan 2024 10:20:46 GMT, Albert Mingkun Yang wrote: > Trivial removing redundant code. Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17530#pullrequestreview-1839014944 From duke at openjdk.org Tue Jan 23 15:28:44 2024 From: duke at openjdk.org (kuaiwei) Date: Tue, 23 Jan 2024 15:28:44 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier [v2] In-Reply-To: References: Message-ID: <1Wq3zi8Bd8Cvf-vBcClrHnFFY5fwMqB9qgJ8-a4G3ek=.22d22ceb-5cc4-4189-abc8-87ea93b9f00a@github.com> > Details is https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071921.html. > Using a combined dmb.ish for release barrier will introduce a heavy storeload barrier. Use "dmb.ishst+dmb.ishld" pair instead, we can gain performance improvement on N1 and N2 architecture. 
The benchmark is test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java > Run with ParallelGC to minimalize impact of gc barrier. > > make test TEST="micro:org.openjdk.bench.vm.compiler.FinalFieldInitialize" MICRO="VM_OPTIONS=-XX:+UseParallelGC" > ... > FinalFieldInitialize.testAllocWithFinal thrpt 9 1411.601 ± 6.546 ops/s > > Without the patch > > FinalFieldInitialize.testAllocWithFinal thrpt 9 1214.575 ± 14.217 ops/s kuaiwei has updated the pull request incrementally with one additional commit since the last revision: fix for review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17511/files - new: https://git.openjdk.org/jdk/pull/17511/files/f817bb25..e4081bc9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17511&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17511&range=00-01 Stats: 5 lines in 2 files changed: 1 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/17511.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17511/head:pull/17511 PR: https://git.openjdk.org/jdk/pull/17511 From shade at openjdk.org Tue Jan 23 15:38:27 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 15:38:27 GMT Subject: RFR: 8323717: Introduce test keyword for tests that need external dependencies In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 14:42:20 GMT, Roger Riggs wrote: > Is there any place to document the new keyword or its usage; it does not seem very discoverable just existing in the TEST.ROOT and some tests. I don't think there is a place to describe keywords, except in the relevant `TEST.ROOT`-s.
------------- PR Comment: https://git.openjdk.org/jdk/pull/17421#issuecomment-1906324934 From duke at openjdk.org Tue Jan 23 15:38:30 2024 From: duke at openjdk.org (kuaiwei) Date: Tue, 23 Jan 2024 15:38:30 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier [v2] In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 10:13:05 GMT, Andrew Haley wrote: >> kuaiwei has updated the pull request incrementally with one additional commit since the last revision: >> >> fix for review comments > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 2069: > >> 2067: if (last != nullptr && nativeInstruction_at(last)->is_Membar() && prev == last) { >> 2068: NativeMembar *bar = NativeMembar_at(prev); >> 2069: // We need avoid promoting barrier to dmb.ish, > > Suggestion: > > // Don't promote DMB ST|DMB LD to DMB (a full barrier) because > // doing so would introduce a StoreLoad which the caller did not > // intend. > > I think that should be clear enough. Thanks for comments. Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17511#discussion_r1463469822 From duke at openjdk.org Tue Jan 23 15:38:33 2024 From: duke at openjdk.org (kuaiwei) Date: Tue, 23 Jan 2024 15:38:33 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier [v2] In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 16:20:30 GMT, Andrey Turbanov wrote: >> kuaiwei has updated the pull request incrementally with one additional commit since the last revision: >> >> fix for review comments > > test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java line 70: > >> 68: public TObj() { >> 69: i = 10; >> 70: l = 100l; > > Suggestion: > > l = 100L; Done > test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java line 82: > >> 80: public TObjWithFinal() { >> 81: i = 10; >> 82: l = 100l; > > Suggestion: > > l = 100L; Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17511#discussion_r1463470365 PR Review Comment: https://git.openjdk.org/jdk/pull/17511#discussion_r1463470763 From kbarrett at openjdk.org Tue Jan 23 15:42:27 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 23 Jan 2024 15:42:27 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v3] In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 11:18:43 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is inspired by `std::is_constant_evaluated` in C++. >> >> Please kindly give your opinion as well as your reviews, thanks very much. 
> > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > add more overloads Not a review, just a drive-by comment. From the description for this PR: "This is inspired by std::is_constant_evaluated in C++." I think what is being proposed here is more like gcc's `__builtin_constant_p`. std::is_constant_evaluated is a different thing, used to detect evaluation in a manifestly constexpr context. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1906337427 From eosterlund at openjdk.org Tue Jan 23 15:43:35 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 23 Jan 2024 15:43:35 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: <5v8UsTv9MOdMgWSsHWGLEXk71SQohaIRVPdJpqhVeYE=.de7e22a9-d5e3-49fe-a35f-35934b5231b0@github.com> On Tue, 23 Jan 2024 14:47:49 GMT, Thomas Schatzl wrote: > Fwiw, the change makes class unloading regress significantly in a class unloading stress test (unloading 60k classes), seemingly tripling the time it takes for the "Purge Unlinked NMethods" phase (~20ms -> ~60ms). > > This may not be a problem for the concurrent gcs, but can be for the STW ones. > > (Overall max G1 remark pause times went from 160ms to 220ms, regular Remark pauses which do class unloading from 120ms to 160ms). If the level of ~1 ms per 1000 unloaded classes worth of latency issue is crucial for G1, then maybe it's time for G1 to at least run purging concurrently?
------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1906338815 From shade at openjdk.org Tue Jan 23 15:43:35 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 15:43:35 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v3] In-Reply-To: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> References: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> Message-ID: On Tue, 16 Jan 2024 09:01:35 GMT, Aleksey Shipilev wrote: >> Since recent work to improve `tier4` performance, we actually test `tier{1,2,3,4}` often, which includes all the tests in current tree. It would be more convenient to just have the `all` test group/alias, so that we can do `make test TEST=all`. This also gives a parallelism / run time benefit, as we do not wait for tests in each tier to complete before moving to next tier. >> >> Sample run on out-of-the-box Linux x86_64 fastdebug is below. For some environments one also needs to supply a few keywords like `!printer` to skip tests that cannot complete without failure due to misconfiguration. I left the keywords as is to show how would a failing run look. There is also an existing shortcut in build system that allows to run this with `make test-all`. >> >> >> % make test TEST=all >> >> Test selection 'all', will run: >> * jtreg:test/hotspot/jtreg:all >> * jtreg:test/jdk:all >> * jtreg:test/langtools:all >> * jtreg:test/jaxp:all >> * jtreg:test/lib-test:all >> >> (...about 6 hours later...) 
>> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >>>> jtreg:test/hotspot/jtreg:all 6731 6702 29 0 << >>>> jtreg:test/jdk:all 9962 9951 11 0 << >> jtreg:test/langtools:all 4469 4469 0 0 >> jtreg:test/jaxp:all 513 513 0 0 >> jtreg:test/lib-test:all 32 32 0 0 >> ============================== >> TEST FAILURE > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Catch-all -> All tests All right, thanks! @lmesnik, I realized I forgot to ask if you had objections to this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17422#issuecomment-1906338893 From shade at openjdk.org Tue Jan 23 15:51:32 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 15:51:32 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v3] In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 11:18:43 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is inspired by `std::is_constant_evaluated` in C++. >> >> Please kindly give your opinion as well as your reviews, thanks very much. 
> > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > add more overloads All right, this is very close :) I now have stylistic comments: src/hotspot/share/classfile/vmIntrinsics.hpp line 912: > 910: do_intrinsic(_getAndSetInt, jdk_internal_misc_Unsafe, getAndSetInt_name, getAndSetInt_signature, F_R) \ > 911: do_name( getAndSetInt_name, "getAndSetInt") \ > 912: do_alias( getAndSetInt_signature, /*"(Ljava/lang/Object;JI)I"*/ getAndAddInt_signature) \ I don't think we need to do these formatting changes in this PR. src/hotspot/share/classfile/vmIntrinsics.hpp line 927: > 925: \ > 926: do_class(jdk_internal_misc_JitCompiler, "jdk/internal/misc/JitCompiler") \ > 927: do_intrinsic(_isConstantExpressionZ, jdk_internal_misc_JitCompiler,isConstantExpression_name, bool_bool_signature, F_S) \ It would be cleaner to follow the current naming for existing intrinsic: do_intrinsic(_isCompileConstant, java_lang_invoke_MethodHandleImpl, isCompileConstant_name, isCompileConstant_signature, F_S) \ do_name( isCompileConstant_name, "isCompileConstant") \ do_alias( isCompileConstant_signature, object_boolean_signature) \ I.e. rename `isConstantExpression` -> `isCompileConstant`. It clearly communicates that we are not dealing with expressions as arguments, and that we underline this is the (JIT) _compile_ constant, not just a constant expression from JLS 15.28 "Constant Expressions". Maybe even replace that `MHImpl` method with the new intrinsic. src/hotspot/share/opto/c2compiler.cpp line 727: > 725: case vmIntrinsics::_storeStoreFence: > 726: case vmIntrinsics::_fullFence: > 727: case vmIntrinsics::_isConstantExpressionZ: Move this closer to `vmIntrinsics::_isCompileConstant:`, if not outright replace it? src/hotspot/share/opto/library_call.hpp line 2: > 1: /* > 2: * Copyright (c) 2020, 2024, Oracle and/or its affiliates. All rights reserved. Unnecessary update? 
------------- PR Review: https://git.openjdk.org/jdk/pull/17527#pullrequestreview-1839148507 PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1463490470 PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1463493124 PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1463497227 PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1463497518 From alanb at openjdk.org Tue Jan 23 15:55:29 2024 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 23 Jan 2024 15:55:29 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v3] In-Reply-To: References: Message-ID: <02_Q7SYNI7MYYOeNsq1xGPsOY502JbeXfJyvUGZTtZg=.8a6dcf5c-4dfd-4688-97c1-95497b637cd3@github.com> On Tue, 23 Jan 2024 11:18:43 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is inspired by `std::is_constant_evaluated` in C++. >> >> Please kindly give your opinion as well as your reviews, thanks very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > add more overloads Would it be possible to list further examples where this might be used? Asking because I'm wondering about the usability and maintainability of if-then-else code. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1906362127 From mdoerr at openjdk.org Tue Jan 23 15:58:28 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 23 Jan 2024 15:58:28 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 06:25:20 GMT, Erik Österlund wrote: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests. Do we know what makes that slower? Is it the glibc overhead of `delete ic->data();`?
------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1906368815 From shade at openjdk.org Tue Jan 23 16:03:29 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 16:03:29 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v3] In-Reply-To: <02_Q7SYNI7MYYOeNsq1xGPsOY502JbeXfJyvUGZTtZg=.8a6dcf5c-4dfd-4688-97c1-95497b637cd3@github.com> References: <02_Q7SYNI7MYYOeNsq1xGPsOY502JbeXfJyvUGZTtZg=.8a6dcf5c-4dfd-4688-97c1-95497b637cd3@github.com> Message-ID: On Tue, 23 Jan 2024 15:52:29 GMT, Alan Bateman wrote: > Would it be possible to list further examples where this might be used? Asking because I'm wondering about the usability and maintainability of if-then-else code. A similar thing is already used in JDK: https://github.com/openjdk/jdk/blob/2a01c798d346656a0ee3553c0964feab75b5dfb6/src/java.base/share/classes/java/lang/invoke/Invokers.java#L622-L624 Extending this for more common use allows doing things like optimizing `Integer.toString(int)`: @Stable static final String[] CONST_STRINGS = {"-1", "0", "1"}; @IntrinsicCandidate public static String toString(int i) { if (isCompileConstant(i) && (i >= -1) && (i <= 1)) { return CONST_STRINGS[i + 1]; } ... ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1906379544 From aboldtch at openjdk.org Tue Jan 23 16:14:53 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Jan 2024 16:14:53 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v13] In-Reply-To: References: Message-ID: <-4blCtX3cNv-LQxfj3uisZ1CFD83mVNadvFYiX8UFik=.08d36172-cd61-4fc3-bebe-fa345b50d78a@github.com> > Implements the runtime part of JDK-8319796. > The different CPU implementations are/will be created as dependent pull requests. > > This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enter. 
Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two topmost entries on the lock stack to determine what to do in a monitor exit.
>
> A high level overview:
> * Locking is still performed on the mark word
>   * Unlocked (0b01) <=> Locked (0b00)
> * Monitor enter on Obj with mark word Unlocked (0b01) is the same
>   * Transition Obj's mark word Unlocked (0b01) => Locked (0b00)
>   * Push Obj onto the lock stack
>   * Success
> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack
>   * If top entry is Obj
>     * Push Obj on the lock stack
>     * Success
>   * If top entry is not Obj
>     * Inflate and call ObjectMonitor::enter
> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack
>   * If just the top entry is Obj
>     * Transition Obj's mark word Locked (0b00) => Unlocked (0b01)
>     * Pop the entry
>     * Success
>   * If both entries are Obj
>     * Pop the top entry
>     * Success
>   * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit
> * If the monitor has been inflated for object Obj which is owned by the current thread
>   * All corresponding entries for Obj are removed from the lock stack
>   * The monitor recursions is set to the number of removed entries - 1
>   * The owner is changed from anonymous to the thread
>   * The regular ObjectMonitor::action is called.
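The enter/exit rules in the overview above can be modeled in a few lines of Java. This is a toy, single-threaded sketch for illustration only, not HotSpot code: the class `LockStackModel`, its `Mark` and `Path` enums, and the `holds` counter (standing in for ObjectMonitor recursions + 1) are hypothetical names, and the model assumes a Locked mark always means "locked by the current thread".

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the lock-stack rules; all names here are illustrative assumptions. */
public class LockStackModel {
    public enum Mark { UNLOCKED, LOCKED, INFLATED }
    public enum Path { FAST, RECURSIVE, INFLATED }

    private final List<Object> lockStack = new ArrayList<>();
    private final Map<Object, Mark> marks = new IdentityHashMap<>();
    // Number of times the (single) thread holds an inflated monitor.
    private final Map<Object, Integer> holds = new IdentityHashMap<>();

    public Path enter(Object obj) {
        Mark mark = markOf(obj);
        if (mark == Mark.UNLOCKED) {            // Unlocked => Locked, push
            marks.put(obj, Mark.LOCKED);
            lockStack.add(obj);
            return Path.FAST;
        }
        if (mark == Mark.LOCKED) {
            if (entryFromTop(0) == obj) {       // consecutive recursive enter
                lockStack.add(obj);
                return Path.RECURSIVE;
            }
            inflate(obj);                       // top entry is not obj
        }
        holds.merge(obj, 1, Integer::sum);      // models ObjectMonitor::enter
        return Path.INFLATED;
    }

    public Path exit(Object obj) {
        Mark mark = markOf(obj);
        if (mark == Mark.UNLOCKED) throw new IllegalMonitorStateException();
        if (mark == Mark.LOCKED) {
            if (entryFromTop(0) == obj) {
                boolean recursive = entryFromTop(1) == obj;
                lockStack.remove(lockStack.size() - 1);   // pop the top entry
                if (recursive) return Path.RECURSIVE;
                marks.put(obj, Mark.UNLOCKED);  // Locked => Unlocked
                return Path.FAST;
            }
            inflate(obj);                       // unstructured exit
        }
        int h = holds.getOrDefault(obj, 0);     // models ObjectMonitor::exit
        if (h == 0) throw new IllegalMonitorStateException();
        if (h == 1) holds.remove(obj); else holds.put(obj, h - 1);
        return Path.INFLATED;
    }

    /** Removes every entry for obj; the monitor starts with that many holds. */
    private void inflate(Object obj) {
        int before = lockStack.size();
        lockStack.removeIf(e -> e == obj);
        holds.put(obj, before - lockStack.size());
        marks.put(obj, Mark.INFLATED);
    }

    private Object entryFromTop(int k) {
        int i = lockStack.size() - 1 - k;
        return i >= 0 ? lockStack.get(i) : null;
    }

    public Mark markOf(Object obj) { return marks.getOrDefault(obj, Mark.UNLOCKED); }
    public int stackDepth() { return lockStack.size(); }
    public int holds(Object obj) { return holds.getOrDefault(obj, 0); }
}
```

In this model a nested `enter(a); enter(a)` stays on the lock stack, while `enter(a); enter(b); enter(a)` inflates `a` because the top entry is `b`, which mirrors why only consecutive recursion is handled without a monitor.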
Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: Fix miss in is_recursive improvement ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16606/files - new: https://git.openjdk.org/jdk/pull/16606/files/1f229ebe..ae2bfca3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16606&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16606&range=11-12 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16606.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16606/head:pull/16606 PR: https://git.openjdk.org/jdk/pull/16606 From aboldtch at openjdk.org Tue Jan 23 16:24:48 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Jan 2024 16:24:48 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: References: Message-ID: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> > Implements the x86 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. 
This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The x86 C2 port also has some extra oddities. > > The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. > > The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. > > The contended unlock was also moved to the code stub. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 18 additional commits since the last revision: - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Add more expressive stub continuation names - Remove outdated anonymous owner fix in stub - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Remove C2HandleAnonOMOwnerStub definitions on x86. - Add MFENCE comment - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - ... 
and 8 more: https://git.openjdk.org/jdk/compare/c16295d4...bc214b8d ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16607/files - new: https://git.openjdk.org/jdk/pull/16607/files/ed71d016..bc214b8d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=10-11 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16607.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16607/head:pull/16607 PR: https://git.openjdk.org/jdk/pull/16607 From aboldtch at openjdk.org Tue Jan 23 16:30:48 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Jan 2024 16:30:48 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v6] In-Reply-To: References: Message-ID: <9LZ6Hco14UBF8NhBetiaaNmuwHfX7VZE_8cVgCttcVk=.59938326-95c3-4cbf-9afe-182e6c045e87@github.com> > Implements the aarch64 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). 
Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 - Switch to CAS over LXSX - Fix missing $ - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 - 8319801: Recursive lightweight locking: aarch64 implementation - Cleanup: C2 fast_lock/fast_unlock aarch64 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16608/files - new: https://git.openjdk.org/jdk/pull/16608/files/8882cddc..8a7ebd0f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=04-05 Stats: 23886 lines in 620 files changed: 13085 ins; 8257 del; 2544 mod Patch: https://git.openjdk.org/jdk/pull/16608.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16608/head:pull/16608 PR: https://git.openjdk.org/jdk/pull/16608 From dnsimon at openjdk.org Tue Jan 23 16:47:41 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 23 Jan 2024 16:47:41 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader 
Message-ID: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> This PR changes `jdk.internal.vm.ci` such that it is loaded by the platform class loader instead of the boot class loader. This allows Native Image to load a version of JVMCI different than the version on top of which Native Image is running. This capability is demonstrated and tested by `LoadAlternativeJVMCI.java`. ------------- Commit messages: - load JVMCI with platform class loader Changes: https://git.openjdk.org/jdk/pull/17520/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17520&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323832 Stats: 227 lines in 8 files changed: 219 ins; 1 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/17520.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17520/head:pull/17520 PR: https://git.openjdk.org/jdk/pull/17520 From dnsimon at openjdk.org Tue Jan 23 16:47:42 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 23 Jan 2024 16:47:42 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader In-Reply-To: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> Message-ID: On Mon, 22 Jan 2024 17:34:16 GMT, Doug Simon wrote: > This PR changes `jdk.internal.vm.ci` such that it is loaded by the platform class loader instead of the boot class loader. This allows Native Image to load a version of JVMCI different than the version on top of which Native Image is running. This capability is demonstrated and tested by `LoadAlternativeJVMCI.java`. src/java.base/share/lib/security/default.policy line 166: > 164: }; > 165: > 166: grant codeBase "jrt:/jdk.internal.vm.ci" { This is required as JVMCI is no longer loaded by the boot loader but should retain all permissions. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17520#discussion_r1463601925 From qamai at openjdk.org Tue Jan 23 16:51:50 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 16:51:50 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v4] In-Reply-To: References: Message-ID: > Hi, > > This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is inspired by `std::is_constant_evaluated` in C++. > > Please kindly give your opinion as well as your reviews, thanks very much. Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: address reviews: rename to isCompileConstant, remove duplication, revert unnecessary changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17527/files - new: https://git.openjdk.org/jdk/pull/17527/files/31403d6f..18f7d482 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=02-03 Stats: 92 lines in 8 files changed: 10 ins; 13 del; 69 mod Patch: https://git.openjdk.org/jdk/pull/17527.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17527/head:pull/17527 PR: https://git.openjdk.org/jdk/pull/17527 From qamai at openjdk.org Tue Jan 23 16:56:29 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 16:56:29 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v3] In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 15:44:40 GMT, Aleksey Shipilev wrote: >> 
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: >> >> add more overloads > > src/hotspot/share/classfile/vmIntrinsics.hpp line 927: > >> 925: \ >> 926: do_class(jdk_internal_misc_JitCompiler, "jdk/internal/misc/JitCompiler") \ >> 927: do_intrinsic(_isConstantExpressionZ, jdk_internal_misc_JitCompiler,isConstantExpression_name, bool_bool_signature, F_S) \ > > It would be cleaner to follow the current naming for existing intrinsic: > > > do_intrinsic(_isCompileConstant, java_lang_invoke_MethodHandleImpl, isCompileConstant_name, isCompileConstant_signature, F_S) \ > do_name( isCompileConstant_name, "isCompileConstant") \ > do_alias( isCompileConstant_signature, object_boolean_signature) \ > > > I.e. rename `isConstantExpression` -> `isCompileConstant`. It clearly communicates that we are not dealing with expressions as arguments, and that we underline this is the (JIT) _compile_ constant, not just a constant expression from JLS 15.28 "Constant Expressions". > > Maybe even replace that `MHImpl` method with the new intrinsic. Yes you are right, I have renamed it to `isCompileConstant`. > src/hotspot/share/opto/c2compiler.cpp line 727: > >> 725: case vmIntrinsics::_storeStoreFence: >> 726: case vmIntrinsics::_fullFence: >> 727: case vmIntrinsics::_isConstantExpressionZ: > > Move this closer to `vmIntrinsics::_isCompileConstant:`, if not outright replace it? I have replaced `MHImpl::isCompileConstant` with the new one. 
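The usage pattern discussed in this thread can be sketched in plain Java. This is only an illustration under stated assumptions: `isCompileConstant` below is a hypothetical local stand-in that always returns false (assumed here to be how a non-intrinsified fallback would behave), and `CompileConstantSketch` and `toStringSketch` are made-up names, not JDK API. The point it shows is that both branches must compute the same result, so the guarded branch is purely an optimization.

```java
public class CompileConstantSketch {
    // Hypothetical stand-in for the intrinsic: without JIT support it
    // conservatively reports "not a compile-time constant".
    static boolean isCompileConstant(int value) {
        return false;
    }

    private static final String[] SMALL = {"-1", "0", "1"};

    static String toStringSketch(int i) {
        // Fast path: only taken if the compiler has proven i constant.
        if (isCompileConstant(i) && i >= -1 && i <= 1) {
            return SMALL[i + 1];
        }
        // General path: must return the same value as the fast path would.
        return Integer.toString(i);
    }
}
```

Because the stand-in never returns true, the sketch always takes the general path; the fast path is written so that it would be observably equivalent if it were taken.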
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1463617016 PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1463616039 From lmesnik at openjdk.org Tue Jan 23 16:59:28 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 23 Jan 2024 16:59:28 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v3] In-Reply-To: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> References: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> Message-ID: On Tue, 16 Jan 2024 09:01:35 GMT, Aleksey Shipilev wrote: >> Since recent work to improve `tier4` performance, we actually test `tier{1,2,3,4}` often, which includes all the tests in current tree. It would be more convenient to just have the `all` test group/alias, so that we can do `make test TEST=all`. This also gives a parallelism / run time benefit, as we do not wait for tests in each tier to complete before moving to next tier. >> >> Sample run on out-of-the-box Linux x86_64 fastdebug is below. For some environments one also needs to supply a few keywords like `!printer` to skip tests that cannot complete without failure due to misconfiguration. I left the keywords as is to show how would a failing run look. There is also an existing shortcut in build system that allows to run this with `make test-all`. >> >> >> % make test TEST=all >> >> Test selection 'all', will run: >> * jtreg:test/hotspot/jtreg:all >> * jtreg:test/jdk:all >> * jtreg:test/langtools:all >> * jtreg:test/jaxp:all >> * jtreg:test/lib-test:all >> >> (...about 6 hours later...) 
>> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >>>> jtreg:test/hotspot/jtreg:all 6731 6702 29 0 << >>>> jtreg:test/jdk:all 9962 9951 11 0 << >> jtreg:test/langtools:all 4469 4469 0 0 >> jtreg:test/jaxp:all 513 513 0 0 >> jtreg:test/lib-test:all 32 32 0 0 >> ============================== >> TEST FAILURE > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Catch-all -> All tests Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17422#pullrequestreview-1839361504 From shade at openjdk.org Tue Jan 23 17:06:40 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 17:06:40 GMT Subject: Integrated: 8323515: Create test alias "all" for all test roots In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 11:05:09 GMT, Aleksey Shipilev wrote: > Since recent work to improve `tier4` performance, we actually test `tier{1,2,3,4}` often, which includes all the tests in current tree. It would be more convenient to just have the `all` test group/alias, so that we can do `make test TEST=all`. This also gives a parallelism / run time benefit, as we do not wait for tests in each tier to complete before moving to next tier. > > Sample run on out-of-the-box Linux x86_64 fastdebug is below. For some environments one also needs to supply a few keywords like `!printer` to skip tests that cannot complete without failure due to misconfiguration. I left the keywords as is to show how would a failing run look. There is also an existing shortcut in build system that allows to run this with `make test-all`. > > > % make test TEST=all > > Test selection 'all', will run: > * jtreg:test/hotspot/jtreg:all > * jtreg:test/jdk:all > * jtreg:test/langtools:all > * jtreg:test/jaxp:all > * jtreg:test/lib-test:all > > (...about 6 hours later...) 
> > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR >>> jtreg:test/hotspot/jtreg:all 6731 6702 29 0 << >>> jtreg:test/jdk:all 9962 9951 11 0 << > jtreg:test/langtools:all 4469 4469 0 0 > jtreg:test/jaxp:all 513 513 0 0 > jtreg:test/lib-test:all 32 32 0 0 > ============================== > TEST FAILURE This pull request has now been integrated. Changeset: 8b9bf758 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/8b9bf758801400e4491326cd4c90fc117b9d97e1 Stats: 49 lines in 5 files changed: 42 ins; 5 del; 2 mod 8323515: Create test alias "all" for all test roots Reviewed-by: dholmes, alanb, joehw, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/17422 From shade at openjdk.org Tue Jan 23 17:06:38 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 17:06:38 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v3] In-Reply-To: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> References: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> Message-ID: On Tue, 16 Jan 2024 09:01:35 GMT, Aleksey Shipilev wrote: >> Since recent work to improve `tier4` performance, we actually test `tier{1,2,3,4}` often, which includes all the tests in current tree. It would be more convenient to just have the `all` test group/alias, so that we can do `make test TEST=all`. This also gives a parallelism / run time benefit, as we do not wait for tests in each tier to complete before moving to next tier. >> >> Sample run on out-of-the-box Linux x86_64 fastdebug is below. For some environments one also needs to supply a few keywords like `!printer` to skip tests that cannot complete without failure due to misconfiguration. I left the keywords as is to show how would a failing run look. There is also an existing shortcut in build system that allows to run this with `make test-all`. 
>> >> >> % make test TEST=all >> >> Test selection 'all', will run: >> * jtreg:test/hotspot/jtreg:all >> * jtreg:test/jdk:all >> * jtreg:test/langtools:all >> * jtreg:test/jaxp:all >> * jtreg:test/lib-test:all >> >> (...about 6 hours later...) >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >>>> jtreg:test/hotspot/jtreg:all 6731 6702 29 0 << >>>> jtreg:test/jdk:all 9962 9951 11 0 << >> jtreg:test/langtools:all 4469 4469 0 0 >> jtreg:test/jaxp:all 513 513 0 0 >> jtreg:test/lib-test:all 32 32 0 0 >> ============================== >> TEST FAILURE > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Catch-all -> All tests Thank you all! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17422#issuecomment-1906520760 From duke at openjdk.org Tue Jan 23 17:20:29 2024 From: duke at openjdk.org (xxDark) Date: Tue, 23 Jan 2024 17:20:29 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader In-Reply-To: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> Message-ID: On Mon, 22 Jan 2024 17:34:16 GMT, Doug Simon wrote: > This PR changes `jdk.internal.vm.ci` such that it is loaded by the platform class loader instead of the boot class loader. This allows Native Image to load a version of JVMCI different than the version on top of which Native Image is running. This capability is demonstrated and tested by `LoadAlternativeJVMCI.java`. Hello. I'm not a reviewer but I read through the conversation in JIRA and saw this comment: > [~pwoegerer] currently has a Native Image patch where he creates a URLClassLoader whose parent is jdk.internal.loader.ClassLoaders.BOOT_LOADER (retrieved via reflection and use of required --add-exports and --add-opens command line options). 
That is, he's using the non-delegating approach you mention. There is zero reason to do this. Passing `null` as parent class loader would suffice as boot loader just uses `findBootstrapClassOrNull` in `JavaLangAccess` either way. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17520#issuecomment-1906515587 From qamai at openjdk.org Tue Jan 23 17:21:47 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 17:21:47 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: References: Message-ID: > Hi, > > This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is similar to `__builtin_constant_p` in GCC and clang. > > Please kindly give your opinion as well as your reviews, thanks very much. 
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: ident ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17527/files - new: https://git.openjdk.org/jdk/pull/17527/files/18f7d482..3ecb2c66 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17527.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17527/head:pull/17527 PR: https://git.openjdk.org/jdk/pull/17527 From qamai at openjdk.org Tue Jan 23 17:21:49 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 17:21:49 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v3] In-Reply-To: References: <02_Q7SYNI7MYYOeNsq1xGPsOY502JbeXfJyvUGZTtZg=.8a6dcf5c-4dfd-4688-97c1-95497b637cd3@github.com> Message-ID: On Tue, 23 Jan 2024 16:01:05 GMT, Aleksey Shipilev wrote: >> Would it be possible to list further examples where this might be used? Asking because I'm wondering about the usability and maintainability of if-then-else code. > >> Would it be possible to list further examples where this might be used? Asking because I'm wondering about the usability and maintainability of if-then-else code. > > A similar thing is already used in JDK: https://github.com/openjdk/jdk/blob/2a01c798d346656a0ee3553c0964feab75b5dfb6/src/java.base/share/classes/java/lang/invoke/Invokers.java#L622-L624 > > Extending this for more common use allows doing things like optimizing `Integer.toString(int)`: > > > @Stable > static final String[] CONST_STRINGS = {"-1", "0", "1"}; > > @IntrinsicCandidate > public static String toString(int i) { > if (isCompileConstant(i) && (i >= -1) && (i <= 1)) { > return CONST_STRINGS[i + 1]; > } > ... 
> > > Note how this code would fold away to one of the paths, depending on whether the compiler knows it is a constant or not. Generated-code-wise it is a zero-cost thing :) @shipilev Thanks a lot for the detailed reviews and suggestions, I hope I have addressed all of them. @kimbarrett TIL about that builtin, updated the PR description to mention that instead. Thanks very much. @AlanBateman Another potential usage I mentioned in the JBS issue is that `GlobalSession` is noncloseable, but there is no way for the accessor to know that without doing a checkcast. Using this we can eliminate the check if the session is statically known to be a global session without imposing additional checks on other kinds of memory sessions. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1906549618 From cjplummer at openjdk.org Tue Jan 23 17:46:27 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 23 Jan 2024 17:46:27 GMT Subject: RFR: 8324512: Serial: Remove Generation::Name In-Reply-To: <0TYkcTugwZN9ibPR85aJZyR47EWlIIBu6I1Z_LIjDcE=.8da28aec-687f-4b9b-97b4-a82b2f9311bd@github.com> References: <0TYkcTugwZN9ibPR85aJZyR47EWlIIBu6I1Z_LIjDcE=.8da28aec-687f-4b9b-97b4-a82b2f9311bd@github.com> Message-ID: On Tue, 23 Jan 2024 10:20:46 GMT, Albert Mingkun Yang wrote: > Trivial removing redundant code. SA changes look good to me. Copyrights still need updating in a few files. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17530#pullrequestreview-1839481177 From shade at openjdk.org Tue Jan 23 17:50:31 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 17:50:31 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 17:21:47 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is similar to `__builtin_constant_p` in GCC and clang. >> >> Please kindly give your opinion as well as your reviews, thanks very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > ident A few more stylistic comments :) Still thinking the better home for these might be just `jdk.internal.misc.VM`... But I would not insist, if others are happy. src/java.base/share/classes/jdk/internal/misc/JitCompiler.java line 56: > 54: */ > 55: @IntrinsicCandidate > 56: public static boolean isCompileConstant(boolean expr) { Here and in other places: probably not `expr`, but just `val` or something? src/java.base/share/classes/jdk/internal/misc/JitCompiler.java line 119: > 117: * @see #isCompileConstant(boolean) > 118: */ > 119: @IntrinsicCandidate Note how the Java entry for MH intrinsic we have replaced had `@Hidden`. These methods should have `@Hidden` too then? Probably applies to other entries too. 
------------- PR Review: https://git.openjdk.org/jdk/pull/17527#pullrequestreview-1839475907 PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1463705907 PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1463703771 From shade at openjdk.org Tue Jan 23 17:50:33 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 17:50:33 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v3] In-Reply-To: References: <02_Q7SYNI7MYYOeNsq1xGPsOY502JbeXfJyvUGZTtZg=.8a6dcf5c-4dfd-4688-97c1-95497b637cd3@github.com> Message-ID: On Tue, 23 Jan 2024 16:01:05 GMT, Aleksey Shipilev wrote: >> Would it be possible to list further examples where this might be used? Asking because I'm wondering about the usability and maintainability of if-then-else code. > >> Would it be possible to list further examples where this might be used? Asking because I'm wondering about the usability and maintainability of if-then-else code. > > A similar thing is already used in JDK: https://github.com/openjdk/jdk/blob/2a01c798d346656a0ee3553c0964feab75b5dfb6/src/java.base/share/classes/java/lang/invoke/Invokers.java#L622-L624 > > Extending this for more common use allows doing things like optimizing `Integer.toString(int)`: > > > @Stable > static final String[] CONST_STRINGS = {"-1", "0", "1"}; > > @IntrinsicCandidate > public static String toString(int i) { > if (isCompileConstant(i) && (i >= -1) && (i <= 1)) { > return CONST_STRINGS[i + 1]; > } > ... > > > Note how this code would fold away to one of the paths, depending on whether the compiler knows it is a constant or not. Generated-code-wise it is a zero-cost thing :) > @shipilev Thanks a lot for the detailed reviews and suggestions, I hope I have addressed all of them. 
Sure thing, I just effectively merged my draft implementation into yours :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1906602556 From ayang at openjdk.org Tue Jan 23 18:37:45 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 23 Jan 2024 18:37:45 GMT Subject: RFR: 8324512: Serial: Remove Generation::Name [v2] In-Reply-To: <0TYkcTugwZN9ibPR85aJZyR47EWlIIBu6I1Z_LIjDcE=.8da28aec-687f-4b9b-97b4-a82b2f9311bd@github.com> References: <0TYkcTugwZN9ibPR85aJZyR47EWlIIBu6I1Z_LIjDcE=.8da28aec-687f-4b9b-97b4-a82b2f9311bd@github.com> Message-ID: <81gkky88Q9P2sYNK2Rneht66RQ-7aXj4Bcc0gbTVqvo=.8687ed7d-1b81-411e-bb28-364c4518e6be@github.com> > Trivial removing redundant code. Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: year ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17530/files - new: https://git.openjdk.org/jdk/pull/17530/files/fd278ec6..6fc1a6c3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17530&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17530&range=00-01 Stats: 6 lines in 6 files changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/17530.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17530/head:pull/17530 PR: https://git.openjdk.org/jdk/pull/17530 From kbarrett at openjdk.org Tue Jan 23 19:00:43 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 23 Jan 2024 19:00:43 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() [v2] In-Reply-To: References: Message-ID: <32d_CojTj7M_Ud6r-OGsnNWthvEYrh1I6PaOgosTQKc=.a9b90f0d-5129-4ab7-9b81-a65869c3bb05@github.com> > Please review this change to use OopHandle::is_empty() rather than comparing > the result of OopHandle::ptr_raw() with nullptr. While equivalent, the former > is the intended API for such checks. ptr_raw should only be used directly > where it is actually needed. > > Testing: mach5 tier1. 
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: tidy CLD remove_handle ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17510/files - new: https://git.openjdk.org/jdk/pull/17510/files/f387b8f3..19905134 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17510&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17510&range=00-01 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/17510.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17510/head:pull/17510 PR: https://git.openjdk.org/jdk/pull/17510 From kbarrett at openjdk.org Tue Jan 23 19:03:29 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 23 Jan 2024 19:03:29 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() [v2] In-Reply-To: References: <19qGKlnx8PjSXe_r4KO_Pr1-Ya1WuFvZghjjAUJFimo=.38862996-b5e5-4879-8703-fe44ce9f9cdf@github.com> Message-ID: On Tue, 23 Jan 2024 11:22:12 GMT, Kim Barrett wrote: > > Looks reasonable. > > I guess the use in `ClassLoaderData::remove_handle` is fine, because we want to assert it? > > I forgot about this one. I initially skipped it because I didn't understand what was happening here. I'd forgotten that the CLD uses OopHandle to wrap a pointer to the CLD handle area rather than an OopStorage entry. I'll push a fix after I've run tests. Non-assert code now just uses is_empty. However, the assert has to use ptr_raw, unfortunately. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17510#issuecomment-1906735569 From coleenp at openjdk.org Tue Jan 23 19:03:33 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 23 Jan 2024 19:03:33 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> Message-ID: On Tue, 23 Jan 2024 16:24:48 GMT, Axel Boldt-Christmas wrote: >> Implements the x86 port of JDK-8319796. >> >> There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. >> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. >> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The x86 C2 port also has some extra oddities. 
>> >> The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. >> >> The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. >> >> The contended unlock was also moved to the code stub. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 18 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Add more expressive stub continuation names > - Remove outdated anonymous owner fix in stub > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Remove C2HandleAnonOMOwnerStub definitions on x86. > - Add MFENCE comment > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - ... and 8 more: https://git.openjdk.org/jdk/compare/368f0d0b...bc214b8d Thank you for addressing all of my comments and questions. This looks like a good improvement and good code to have to continue to address LM_LIGHTWEIGHT locking performance. src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 77: > 75: > 76: int C2FastUnlockLightweightStub::max_size() const { > 77: return 128; Is this still 128? ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16607#pullrequestreview-1839691375 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1463852127 From coleenp at openjdk.org Tue Jan 23 19:03:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 23 Jan 2024 19:03:34 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: References: <6D_SRzqpDQ21WiSYCN77xc1EiZ-GVf4IdgJCbdvURAE=.d9cb5838-6f93-4629-a053-bd12c5a349c5@github.com> Message-ID: On Fri, 19 Jan 2024 22:00:04 GMT, Coleen Phillimore wrote: >> It is a little bit awkward that they both have to restore the held monitor count. > > Ok, I've looked at the control flow more, which is not simple. Both have the slow path continuation() exit, and only the exiting with a successor has the unlocked exit. Can you rename continuation() to slow_path() because continuation doesn't help. Thanks for renaming the continuation function. I didn't see that the name comes from the C2CodeStub. This looks better. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1463849018 From dnsimon at openjdk.org Tue Jan 23 19:16:49 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 23 Jan 2024 19:16:49 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> Message-ID: <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> > This PR changes `jdk.internal.vm.ci` such that it is loaded by the platform class loader instead of the boot class loader. This allows Native Image to load a version of JVMCI different than the version on top of which Native Image is running. This capability is demonstrated and tested by `LoadAlternativeJVMCI.java`.
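As a side note on the delegation question raised in this thread, here is a minimal sketch, assuming JDK 9 or later and a standard JDK image in which the `java.sql` module is defined to the platform class loader. It illustrates that a `URLClassLoader` constructed with a `null` parent delegates only to the boot loader, so classes defined by the platform loader are not visible through it. The class `NullParentLoaderDemo` and its helper names are illustrative only; they are not part of the PR or of `LoadAlternativeJVMCI.java`.

```java
import java.net.URL;
import java.net.URLClassLoader;

public final class NullParentLoaderDemo {

    // A loader with no URLs and a null parent: lookups delegate to the boot loader only.
    static ClassLoader bootOnlyLoader() {
        return new URLClassLoader(new URL[0], null);
    }

    // Returns true if 'name' can be resolved through 'loader' (the class is not initialized).
    static boolean canLoad(ClassLoader loader, String name) {
        try {
            Class.forName(name, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader bootOnly = bootOnlyLoader();
        ClassLoader platform = ClassLoader.getPlatformClassLoader();

        // java.lang.String is defined by the boot loader, so both loaders can resolve it.
        System.out.println("String via boot-only loader: " + canLoad(bootOnly, "java.lang.String"));

        // java.sql.Driver is assumed to be defined by the platform loader (standard JDK image).
        System.out.println("java.sql.Driver via boot-only loader: " + canLoad(bootOnly, "java.sql.Driver"));
        System.out.println("java.sql.Driver via platform loader:  " + canLoad(platform, "java.sql.Driver"));
    }
}
```

Under the same assumptions, once `jdk.internal.vm.ci` is defined by the platform loader, a loader built like `bootOnlyLoader()` but given real URLs would presumably not find the built-in JVMCI classes through delegation and would resolve them from its own URLs instead.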
Doug Simon has updated the pull request incrementally with one additional commit since the last revision: use null to denote boot class loader as delegation parent ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17520/files - new: https://git.openjdk.org/jdk/pull/17520/files/e7d5801a..1642276e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17520&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17520&range=00-01 Stats: 8 lines in 1 file changed: 1 ins; 7 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17520.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17520/head:pull/17520 PR: https://git.openjdk.org/jdk/pull/17520 From dnsimon at openjdk.org Tue Jan 23 19:16:50 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 23 Jan 2024 19:16:50 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader In-Reply-To: References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> Message-ID: On Tue, 23 Jan 2024 17:00:20 GMT, xxDark wrote: > Passing `null` as parent class loader would suffice as boot loader just uses `findBootstrapClassOrNull` in `JavaLangAccess` either way Thanks! I've simplified the test accordingly: 1642276ea22a5d789e01a5ecb1059d8c5c8be284 ------------- PR Comment: https://git.openjdk.org/jdk/pull/17520#issuecomment-1906753878 From shade at openjdk.org Tue Jan 23 19:50:26 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jan 2024 19:50:26 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() [v2] In-Reply-To: <32d_CojTj7M_Ud6r-OGsnNWthvEYrh1I6PaOgosTQKc=.a9b90f0d-5129-4ab7-9b81-a65869c3bb05@github.com> References: <32d_CojTj7M_Ud6r-OGsnNWthvEYrh1I6PaOgosTQKc=.a9b90f0d-5129-4ab7-9b81-a65869c3bb05@github.com> Message-ID: On Tue, 23 Jan 2024 19:00:43 GMT, Kim Barrett wrote: >> Please review this change to use OopHandle::is_empty() rather than comparing >> the result of OopHandle::ptr_raw() with nullptr. 
While equivalent, the former >> is the intended API for such checks. ptr_raw should only be used directly >> where it is actually needed. >> >> Testing: mach5 tier1. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > tidy CLD remove_handle Looks okay. Pity we were not able to eliminate the `ptr_raw` use completely. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17510#pullrequestreview-1839794218 From psandoz at openjdk.org Tue Jan 23 20:04:29 2024 From: psandoz at openjdk.org (Paul Sandoz) Date: Tue, 23 Jan 2024 20:04:29 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 17:21:47 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is similar to `__builtin_constant_p` in GCC and clang. >> >> Please kindly give your opinion as well as your reviews, thanks very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > ident src/java.base/share/classes/jdk/internal/misc/JitCompiler.java line 32: > 30: * Just-in-time-compiler-related queries > 31: */ > 32: public class JitCompiler { An alternative name and location is `jdk.internal.vm.ConstantSupport` with initial class doc: Defines methods to test if a value has been evaluated to a compile-time constant value by the HotSpot VM. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1463926393 From eastigeevich at openjdk.org Tue Jan 23 21:43:27 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 23 Jan 2024 21:43:27 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v4] In-Reply-To: <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> Message-ID: On Thu, 18 Jan 2024 17:08:29 GMT, Boris Ulasevich wrote: >> These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. >> >> Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: > > apply suggestions src/hotspot/share/code/codeCache.cpp line 274: > 272: (cache_size - non_nmethod.size) / 2 : min_size; > 273: } > 274: Please add a check: ` non_nmethod.size < cache_size` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1464028372 From eastigeevich at openjdk.org Tue Jan 23 22:06:30 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 23 Jan 2024 22:06:30 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v4] In-Reply-To: <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> Message-ID: On Thu, 18 Jan 2024 17:08:29 GMT, Boris Ulasevich wrote: >> These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. >> >> Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: > > apply suggestions src/hotspot/share/code/codeCache.cpp line 281: > 279: if (!profiled.set && non_profiled.set) { > 280: profiled.size = subtract_size(cache_size, non_nmethod.size + non_profiled.size, min_size); > 281: } `subtract_size` either subtracts or does nothing and returns `min_size`. This is not obvious from its name. What about a function: // Precondition: at least one of the heaps must be set. // // If the size of one of the heaps is not set, it is set to max(available_size - set_heap.size, min_size). static void set_size_of_unset_code_heap(CodeHeapInfo *heap1, CodeHeapInfo *heap2, size_t available_size, size_t min_size) { assert(...); // check precondition if (heap1->set && heap2->set) return; if (!heap2->set) swap(heap1, heap2); heap1->size = (available_size > heap2->size + min_size) ? (available_size - heap2->size) : min_size; } Now we can unite the two IFs into the `else` case of `if (!profiled.set && !non_profiled.set)`: if (!profiled.set && !non_profiled.set) { ... } else { set_size_of_unset_code_heap(&profiled, &non_profiled, cache_size - non_nmethod.size, min_size); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1464049834 From qamai at openjdk.org Tue Jan 23 22:44:28 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 22:44:28 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 17:40:52 GMT, Aleksey Shipilev wrote: >> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: >> >> ident > > src/java.base/share/classes/jdk/internal/misc/JitCompiler.java line 119: > >> 117: * @see #isCompileConstant(boolean) >> 118: */ >> 119: @IntrinsicCandidate > > Note how the Java entry for MH intrinsic we have replaced had `@Hidden`.
These methods should have `@Hidden` too then? Probably applies to other entries too. I don't understand why this needs to be `@Hidden`, the javadoc says that a function annotated with `@Hidden` is omitted from the stacktraces. This function does not call anything so what is the point of hiding it? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1464081953 From qamai at openjdk.org Tue Jan 23 22:49:28 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 22:49:28 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: References: Message-ID: <_pAlUJwzkoFkCnQW_IQK-zUkUMMjq6KjZoDldS34CyA=.984549da-d6de-4977-a87f-18a33d58824d@github.com> On Tue, 23 Jan 2024 17:42:40 GMT, Aleksey Shipilev wrote: >> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: >> >> ident > > src/java.base/share/classes/jdk/internal/misc/JitCompiler.java line 56: > >> 54: */ >> 55: @IntrinsicCandidate >> 56: public static boolean isCompileConstant(boolean expr) { > > Here and in other places: probably not `expr`, but just `val` or something? I think of this as an expression that is always evaluated to the same value. The value itself is not interesting, it is the set of values that this expression can take that we are talking about. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1464085126 From qamai at openjdk.org Tue Jan 23 22:52:27 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 23 Jan 2024 22:52:27 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: References: Message-ID: <81tjQoutCZRej3wZAnPDJIq31hz7D7tbiJLWyWpXXv0=.5786bc11-7aa2-4290-a1d5-37c82452ed41@github.com> On Tue, 23 Jan 2024 20:01:45 GMT, Paul Sandoz wrote: >> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: >> >> ident > > src/java.base/share/classes/jdk/internal/misc/JitCompiler.java line 32: > >> 30: * Just-in-time-compiler-related queries >> 31: */ >> 32: public class JitCompiler { > > An alternative name and location is `jdk.internal.vm.ConstantSupport` with initial class doc: > > Defines methods to test if a value has been evaluated to a compile-time constant value by the HotSpot VM. That sounds like a better name for the class, although I think `jdk.internal.misc` is more suitable than `jdk.internal.vm`. Do you have any preference? Thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1464087772 From mdoerr at openjdk.org Wed Jan 24 04:00:35 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 24 Jan 2024 04:00:35 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v9] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Tue, 16 Jan 2024 08:36:49 GMT, Suchismith Roy wrote: >> The J2SE agent does not start and throws an error when it tries to find the shared library ibm_16_am. >> After searching for ibm_16_am.so, the JVM agent throws an error as dll_load fails. It fails to identify the shared library ibm_16_am.a shared archive file on AIX.
>> Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. > > Suchismith Roy has updated the pull request incrementally with three additional commits since the last revision: > > - Update porting_aix.cpp > - Update porting_aix.cpp > - Update os_aix.cpp Regarding `libclang.a`, Joachim told me that we would need to load `libclang.a(libclang.so.16)`. So that's a different issue which relates to `System.loadLibrary(libName)` (https://bugs.openjdk.org/browse/JDK-8319516) and not to this hotspot internal issue. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1907311369 From dholmes at openjdk.org Wed Jan 24 06:14:27 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 24 Jan 2024 06:14:27 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> Message-ID: On Tue, 23 Jan 2024 19:16:49 GMT, Doug Simon wrote: >> This PR changes `jdk.internal.vm.ci` such that it is loaded by the platform class loader instead of the boot class loader. This allows Native Image to load a version of JVMCI different than the version on top of which Native Image is running. This capability is demonstrated and tested by `LoadAlternativeJVMCI.java`. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > use null to denote boot class loader as delegation parent The actual changes to load JVMCI via the platform loader seem fine. I'm still puzzled by the need to do this as any non-delegating classloader would have allowed this even if JVMCI were loaded by the bootloader. 
And I don't understand how your test is working as it is using a delegating classloader. test/hotspot/jtreg/compiler/jvmci/LoadAlternativeJVMCI.java line 54: > 52: > 53: ClassLoader pcl = ClassLoader.getPlatformClassLoader(); > 54: URLClassLoader ucl = new URLClassLoader(cp, null); I am missing something here, a `URLClassLoader` first delegates to its parent before searching its URLs, so how does this not find the platform loader versions of the JVMCI classes? ------------- PR Review: https://git.openjdk.org/jdk/pull/17520#pullrequestreview-1840477028 PR Review Comment: https://git.openjdk.org/jdk/pull/17520#discussion_r1464348710 From dholmes at openjdk.org Wed Jan 24 06:30:27 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 24 Jan 2024 06:30:27 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: <_pAlUJwzkoFkCnQW_IQK-zUkUMMjq6KjZoDldS34CyA=.984549da-d6de-4977-a87f-18a33d58824d@github.com> References: <_pAlUJwzkoFkCnQW_IQK-zUkUMMjq6KjZoDldS34CyA=.984549da-d6de-4977-a87f-18a33d58824d@github.com> Message-ID: <5AWq0nDx_AQPwnEp1cMisZ6ytn2ieq9FHDwDQp5A4QQ=.5043ac3e-04bf-4fd8-a680-448f392e5cb1@github.com> On Tue, 23 Jan 2024 22:46:20 GMT, Quan Anh Mai wrote: >> src/java.base/share/classes/jdk/internal/misc/JitCompiler.java line 56: >> >>> 54: */ >>> 55: @IntrinsicCandidate >>> 56: public static boolean isCompileConstant(boolean expr) { >> >> Here and in other places: probably not `expr`, but just `val` or something? > > I think of this as an expression that is always evaluated to the same value. The value itself is not interesting, it is the set of values that this expression can take that we are talking about. This seems really weird to me for Java code. The method doesn't get the original "expression" it only gets the value of that expression after it has been evaluated. Is there some kind of weird "magic" happening here? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1464361310 From qamai at openjdk.org Wed Jan 24 07:17:27 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Wed, 24 Jan 2024 07:17:27 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: <5AWq0nDx_AQPwnEp1cMisZ6ytn2ieq9FHDwDQp5A4QQ=.5043ac3e-04bf-4fd8-a680-448f392e5cb1@github.com> References: <_pAlUJwzkoFkCnQW_IQK-zUkUMMjq6KjZoDldS34CyA=.984549da-d6de-4977-a87f-18a33d58824d@github.com> <5AWq0nDx_AQPwnEp1cMisZ6ytn2ieq9FHDwDQp5A4QQ=.5043ac3e-04bf-4fd8-a680-448f392e5cb1@github.com> Message-ID: <9iDFu8I4w_i1Uso5q7oEi0Le1JvgDNgNyuSZlmKQiuE=.5739d448-fc73-4bcf-bec8-26b3a1b75d21@github.com> On Wed, 24 Jan 2024 06:27:20 GMT, David Holmes wrote: >> I think of this as an expression that is always evaluated to the same value. The value itself is not interesting, it is the set of values that this expression can take that we are talking about. > > This seems really weird to me for Java code. The method doesn't get the original "expression" it only gets the value of that expression after it has been evaluated. Is there some kind of weird "magic" happening here? @dholmes-ora Indeed it's a compiler magic, albeit not really weird. While the method execution only receives the evaluated value of `expr`, the method compilation has the expression in its original form. As a result, it can determine the result based on this information. 
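A minimal sketch of how a caller might use such a query, assuming only what is described in this thread. The `isCompileConstant` below is a hypothetical stand-in written for illustration, not the JDK-internal intrinsic: it always returns `false`, which is the conservative answer when nothing has been constant-folded. The `GLOBAL` sentinel and `isAlive` are likewise invented names. The point of the sketch is that the fast path and the general path must compute the same result, so a `false` answer is always safe.

```java
public final class CompileConstantSketch {
    // Hypothetical sentinel standing in for a "global session" that is never closed.
    static final Object GLOBAL = new Object();

    // Hypothetical stand-in for the intrinsic discussed above. Without JIT
    // support the only safe answer is "not known to be constant".
    static boolean isCompileConstant(boolean expr) {
        return false;
    }

    // Returns whether the session may still be used.
    static boolean isAlive(Object session, boolean closed) {
        boolean isGlobal = (session == GLOBAL);
        // Fast path: only taken if the compiler has proven isGlobal constant
        // and it is true; a JIT could then drop the check below entirely.
        if (isCompileConstant(isGlobal) && isGlobal) {
            return true;
        }
        // General path: must give the same answer as the fast path would.
        return isGlobal || !closed;
    }

    public static void main(String[] args) {
        System.out.println(isAlive(GLOBAL, true));        // global session, "closed" flag ignored
        System.out.println(isAlive(new Object(), true));  // ordinary closed session
        System.out.println(isAlive(new Object(), false)); // ordinary open session
    }
}
```

Because the stand-in never reports a constant, this sketch always runs the general path; it illustrates the shape of the call site, not any performance effect.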
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1464415357 From stuefe at openjdk.org Wed Jan 24 07:33:32 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 24 Jan 2024 07:33:32 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v9] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Tue, 16 Jan 2024 08:36:49 GMT, Suchismith Roy wrote: >> J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. >> After searching for ibm_16_am.so ,the jvm agent throws and error as dll_load fails.It fails to identify the shared library ibm_16_am.a shared archive file on AIX. >> Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. > > Suchismith Roy has updated the pull request incrementally with three additional commits since the last revision: > > - Update porting_aix.cpp > - Update porting_aix.cpp > - Update os_aix.cpp For me the unresolved question is still: - do we want an unconditional load of *.a for a given *.so (have yet to see any documentation for this a-file duality) - if we do, do we want that to be bidirectional? Someone specifies *.a, do we want to attempt to load *.so? When in doubt, we should just mimic what OpenJ9 is doing on AIX. But I would like a clear documentation as a comment in os_aix.cpp explaining the logic and referencing the relevant OpenJ9 files. There are some minor issues with the code itself, but I will defer reviewing. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1907550400 From jwaters at openjdk.org Wed Jan 24 08:04:50 2024 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 24 Jan 2024 08:04:50 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION Message-ID: Please review a portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION: Currently, FORBID_C_FUNCTION only works for gcc like compilers, and ALLOW_C_FUNCTION acts to disable CRT warnings on Windows, where FORBID_C_FUNCTION does not work. It would be beneficial to provide a universal portable definition for both, to allow the macros to work on all platforms HotSpot can be compiled for. The implementation is portable and _should_ work on all HotSpot supported platforms (I don't have an AIX device!). Regrettably, I did end up having to change the signature of ALLOW_C_FUNCTION to work with this new implementation, as well as the way it is used. On one hand, it is more compact than before, but on the other the established syntax is likely more familiar by this point. I do hope this is not a showstopper, but understand if it is ------------- Commit messages: - compilerWarnings.hpp - compilerWarnings.hpp - logTagSet.cpp - nmtPreInit.cpp - gtestMain.cpp - os.cpp - memMapPrinter.cpp - mallocSiteTable.cpp - logTagSet.cpp - jvmciEnv.cpp - ... 
and 33 more: https://git.openjdk.org/jdk/compare/16be3888...70cfecd1 Changes: https://git.openjdk.org/jdk/pull/17387/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17387&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8313396 Stats: 111 lines in 19 files changed: 0 ins; 64 del; 47 mod Patch: https://git.openjdk.org/jdk/pull/17387.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17387/head:pull/17387 PR: https://git.openjdk.org/jdk/pull/17387 From jwaters at openjdk.org Wed Jan 24 08:04:51 2024 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 24 Jan 2024 08:04:51 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION In-Reply-To: References: Message-ID: On Fri, 12 Jan 2024 06:16:25 GMT, Julian Waters wrote: > Please review a portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION: > > Currently, FORBID_C_FUNCTION only works for gcc like compilers, and ALLOW_C_FUNCTION acts to disable CRT warnings on Windows, where FORBID_C_FUNCTION does not work. It would be beneficial to provide a universal portable definition for both, to allow the macros to work on all platforms HotSpot can be compiled for. > > The implementation is portable and _should_ work on all HotSpot supported platforms (I don't have an AIX device!). > > Regrettably, I did end up having to change the signature of ALLOW_C_FUNCTION to work with this new implementation, as well as the way it is used. On one hand, it is more compact than before, but on the other the established syntax is likely more familiar by this point. I do hope this is not a showstopper, but understand if it is @kimbarrett This isn't ready for review yet, but I think I might need your help, how was your original approach able to avoid firing warnings and errors when use of forbidden methods was encountered in third party code, such as GTest? 
Looking at the logs, mine is failing because the forbidden methods are being detected in GTest code as invalid. Opening to review, because I really need help with this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17387#issuecomment-1902512575 PR Comment: https://git.openjdk.org/jdk/pull/17387#issuecomment-1907588682 From dnsimon at openjdk.org Wed Jan 24 08:48:29 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 24 Jan 2024 08:48:29 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> Message-ID: <_FCKvxLQxWZCfs-1Rxjr3qtcwMomMA368p_-6aJeTPQ=.db4d0b6f-205b-4127-b040-740e476e0919@github.com> On Wed, 24 Jan 2024 06:07:55 GMT, David Holmes wrote: >> Doug Simon has updated the pull request incrementally with one additional commit since the last revision: >> >> use null to denote boot class loader as delegation parent > > test/hotspot/jtreg/compiler/jvmci/LoadAlternativeJVMCI.java line 54: > >> 52: >> 53: ClassLoader pcl = ClassLoader.getPlatformClassLoader(); >> 54: URLClassLoader ucl = new URLClassLoader(cp, null); > > I am missing something here, a `URLClassLoader` first delegates to its parent before searching its URLs, so how does this not find the platform loader versions of the JVMCI classes? With `new URLClassLoader(cp, null)`, the URL loader delegates directly to the boot loader, bypassing the platform loader.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17520#discussion_r1464529290 From dnsimon at openjdk.org Wed Jan 24 08:58:26 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 24 Jan 2024 08:58:26 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> Message-ID: On Wed, 24 Jan 2024 06:11:30 GMT, David Holmes wrote: > I'm still puzzled by the need to do this as any non-delegating classloader would have allowed this even if JVMCI were loaded by the bootloader. As far as I understand, even a non-delegating classloader cannot redefine a class loaded by the boot loader. I modified the test to show this and get: java.lang.LinkageError: loader LoadAlternativeJVMCI$1 @4a1f4d08 attempted duplicate class definition for jdk.vm.ci.meta.ResolvedJavaType. 
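The delegation behaviour described above can be checked with a small standalone sketch. It assumes a runtime image that includes the `java.sql` module and that this module is defined to the platform class loader (true for a standard JDK 9+ image, but an assumption here); `java.sql.Driver` is used only as a convenient example of a platform-loader class and has nothing to do with JVMCI. The class and helper names are made up for illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;

public final class NullParentLoaderSketch {
    // Returns true if the loader can resolve the class, false if it cannot find it.
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // A null parent means "delegate to the boot loader only"; the
        // platform and application loaders are never consulted.
        try (URLClassLoader ucl = new URLClassLoader(new URL[0], null)) {
            System.out.println("parent: " + ucl.getParent());
            System.out.println("boot class visible: "
                    + canLoad(ucl, "java.lang.String"));
            System.out.println("platform class visible: "
                    + canLoad(ucl, "java.sql.Driver"));
            System.out.println("platform loader sees it: "
                    + canLoad(ClassLoader.getPlatformClassLoader(), "java.sql.Driver"));
        }
    }
}
```

With an empty URL array the loader has nothing of its own to search, so anything it resolves must have come from the boot loader.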
(jdk.vm.ci.meta.ResolvedJavaType is in unnamed module of loader LoadAlternativeJVMCI$1 @4a1f4d08, parent loader 'bootstrap') at java.base/java.lang.ClassLoader.defineClass1(Native Method) at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1023) at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150) at java.base/java.net.URLClassLoader.defineClass(URLClassLoader.java:524) at java.base/java.net.URLClassLoader$1.run(URLClassLoader.java:427) at java.base/java.net.URLClassLoader$1.run(URLClassLoader.java:421) at java.base/java.security.AccessController.doPrivileged(AccessController.java:714) at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:420) at LoadAlternativeJVMCI$1.loadClass(LoadAlternativeJVMCI.java:61) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525) at LoadAlternativeJVMCI.main(LoadAlternativeJVMCI.java:77) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) at java.base/java.lang.reflect.Method.invoke(Method.java:580) at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138) at java.base/java.lang.Thread.run(Thread.java:1575) Test modification: diff --git a/test/hotspot/jtreg/compiler/jvmci/LoadAlternativeJVMCI.java b/test/hotspot/jtreg/compiler/jvmci/LoadAlternativeJVMCI.java index dd63867e7c2..28a6fedca38 100644 --- a/test/hotspot/jtreg/compiler/jvmci/LoadAlternativeJVMCI.java +++ b/test/hotspot/jtreg/compiler/jvmci/LoadAlternativeJVMCI.java @@ -51,7 +51,14 @@ public static void main(String[] args) throws Exception { } ClassLoader pcl = ClassLoader.getPlatformClassLoader(); - URLClassLoader ucl = new URLClassLoader(cp, null); + URLClassLoader ucl = new URLClassLoader(cp, null) { + protected Class loadClass(String name, boolean resolve) throws ClassNotFoundException { + if (!name.startsWith("jdk.vm.ci")) { + return super.loadClass(name, resolve); + } + return findClass(name); + } + }; String[] names = { 
"jdk.vm.ci.meta.ResolvedJavaType", ------------- PR Comment: https://git.openjdk.org/jdk/pull/17520#issuecomment-1907671987 From shade at openjdk.org Wed Jan 24 08:58:30 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 24 Jan 2024 08:58:30 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 22:41:44 GMT, Quan Anh Mai wrote: >> src/java.base/share/classes/jdk/internal/misc/JitCompiler.java line 119: >> >>> 117: * @see #isCompileConstant(boolean) >>> 118: */ >>> 119: @IntrinsicCandidate >> >> Note how the Java entry for MH intrinsic we have replaced had `@Hidden`. These methods should have `@Hidden` too then? Probably applies to other entries too. > > I don't understand why this needs to be `@Hidden`, the javadoc says that a function annotated with `@Hidden` is omitted from the stacktraces. This function does not call anything so what is the point of hiding it? I suspect there is a code that counts stack traces somewhere that relies on it in MH parts. There is no harm for doing `@Hidden` here, I think. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1464541674 From shade at openjdk.org Wed Jan 24 09:06:29 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 24 Jan 2024 09:06:29 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: <81tjQoutCZRej3wZAnPDJIq31hz7D7tbiJLWyWpXXv0=.5786bc11-7aa2-4290-a1d5-37c82452ed41@github.com> References: <81tjQoutCZRej3wZAnPDJIq31hz7D7tbiJLWyWpXXv0=.5786bc11-7aa2-4290-a1d5-37c82452ed41@github.com> Message-ID: On Tue, 23 Jan 2024 22:49:49 GMT, Quan Anh Mai wrote: >> src/java.base/share/classes/jdk/internal/misc/JitCompiler.java line 32: >> >>> 30: * Just-in-time-compiler-related queries >>> 31: */ >>> 32: public class JitCompiler { >> >> An alternative name and location is `jdk.internal.vm.ConstantSupport` with initial class doc: >> >> Defines methods to test if a value has been evaluated to a compile-time constant value by the HotSpot VM. > > That sounds like a better name for the class, although I think `jdk.internal.misc` is more suitable than `jdk.internal.vm`. Do you have any preference? Thanks. +1 to `ConstantSupport`. I think `jdk.internal.vm` is a proper place for it. There is adjacent `jdk.internal.vm.vector.VectorSupport`, and whole `jdk.internal.vm.annotations` package is there too. `jdk.internal.misc` sounds like a place for utility classes. `Unsafe` is a historical exception, I think. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1464551793 From aph at openjdk.org Wed Jan 24 09:30:48 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 24 Jan 2024 09:30:48 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 22:57:09 GMT, Jiangli Zhou wrote: > > I found a way to hide the unwanted symbols in libjvm.a. 
This requires `ld --relocatable` and `objcopy --keep-global-symbols=...`. See the prototype here: > > > > * https://github.com/iklam/tools/tree/main/misc/staticlib > > > > So potentially we can do this completely in the makefiles, without adding namespaces to HotSpot. > > Yeah, `objcopy` can be used to localize symbols. One of my colleague @cjmoon1 implemented symbol localizing for `libfreetype.a` and `libharfbuzz.a` for static linking issue. In some cases, user might want to link with a different version of the harfbuzz library than the version linked with the JDK code. Then multiple versions of the libraries could be linked together into the executable. That was a solution suggested by C++ experts and it worked. Doing partial linking that produces a single `.o` file simplifies the work of `objcopy`. This is not a very portable solution though. OK, but it is the right thing to do on Linux. If some other operating systems don't provide useful tools, that's on them. I haven't checked, but I strongly suspect that LLVM can do it too, so all that remains is Windows, and maybe they can't have static linking (or maybe they have to use something like this PR) until the right tooling is provided. If Windows really can't do it, that's no reason to burden systems that can. Namespaces are not a low-cost solution for developers. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1907728552 From aph at openjdk.org Wed Jan 24 09:31:28 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 24 Jan 2024 09:31:28 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 00:14:58 GMT, Jiangli Zhou wrote: > Please review this PR with a simple solution for resolving duplicate `Thread` symbol issue. In https://github.com/openjdk/jdk/pull/14808 comments, there was an alternative suggestion to redefine the symbol at build time, such as using`-DThread=HotSpotThread`. 
That would not address issues when symbol were references as string literals. https://github.com/openjdk/jdk/pull/14808 also discussed using namespace for hotspot code, which can have multiple benefits/motivations. We could explore further using namespace with more consensus on that approach. > > Contributed by Chuck Rasbold and @jianglizhou. > > I think you should be able to use ld and objcopy to merge the .o files and hide all of the symbols you don't want to export. > > We also discussed about `objcopy` in [#14808 (comment)](https://github.com/openjdk/jdk/pull/14808#issuecomment-1631597197) and [#14808 (comment)](https://github.com/openjdk/jdk/pull/14808#issuecomment-1631611220). My main concern was the portability of `objcopy` approach. I replied: OK, but it is the right thing to do on Linux. If some other operating systems don't provide useful tools, that's on them. I haven't checked, but I strongly suspect that LLVM can do it too, so all that remains is Windows, and maybe they can't have static linking (or maybe they have to use something like this PR) until the right tooling is provided. If Windows really can't do it, that's no reason to burden systems that can. Namespaces are not a low-cost solution for developers. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1907730900 From duke at openjdk.org Wed Jan 24 09:50:26 2024 From: duke at openjdk.org (Paul Woegerer) Date: Wed, 24 Jan 2024 09:50:26 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader In-Reply-To: References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> Message-ID: On Tue, 23 Jan 2024 17:00:20 GMT, xxDark wrote: > There is zero reason to do this. Passing `null` as parent class loader would suffice as boot loader just uses `findBootstrapClassOrNull` in `JavaLangAccess` either way. Right, using `null` does the same thing. 
In the final version will use that instead of accessing private field `jdk.internal.loader.ClassLoaders#BOOT_LOADER`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17520#issuecomment-1907772059 From ayang at openjdk.org Wed Jan 24 10:05:36 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 24 Jan 2024 10:05:36 GMT Subject: RFR: 8324512: Serial: Remove Generation::Name [v2] In-Reply-To: <81gkky88Q9P2sYNK2Rneht66RQ-7aXj4Bcc0gbTVqvo=.8687ed7d-1b81-411e-bb28-364c4518e6be@github.com> References: <0TYkcTugwZN9ibPR85aJZyR47EWlIIBu6I1Z_LIjDcE=.8da28aec-687f-4b9b-97b4-a82b2f9311bd@github.com> <81gkky88Q9P2sYNK2Rneht66RQ-7aXj4Bcc0gbTVqvo=.8687ed7d-1b81-411e-bb28-364c4518e6be@github.com> Message-ID: <_gjWC4zGGtFirBXsQqGp6Y8TfgYBWH_utdEQLKscEaI=.38e6841d-42c7-4ab9-bc0f-cf0fb5bde853@github.com> On Tue, 23 Jan 2024 18:37:45 GMT, Albert Mingkun Yang wrote: >> Trivial removing redundant code. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > year Thanks for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17530#issuecomment-1907796915 From ayang at openjdk.org Wed Jan 24 10:05:38 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 24 Jan 2024 10:05:38 GMT Subject: Integrated: 8324512: Serial: Remove Generation::Name In-Reply-To: <0TYkcTugwZN9ibPR85aJZyR47EWlIIBu6I1Z_LIjDcE=.8da28aec-687f-4b9b-97b4-a82b2f9311bd@github.com> References: <0TYkcTugwZN9ibPR85aJZyR47EWlIIBu6I1Z_LIjDcE=.8da28aec-687f-4b9b-97b4-a82b2f9311bd@github.com> Message-ID: On Tue, 23 Jan 2024 10:20:46 GMT, Albert Mingkun Yang wrote: > Trivial removing redundant code. This pull request has now been integrated. 
Changeset: 1c1cb048 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/1c1cb048cd7820042373f5d8a9f41fb30d9cef6e Stats: 72 lines in 9 files changed: 0 ins; 64 del; 8 mod 8324512: Serial: Remove Generation::Name Reviewed-by: stefank, iwalulya, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/17530 From qamai at openjdk.org Wed Jan 24 10:33:05 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Wed, 24 Jan 2024 10:33:05 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v6] In-Reply-To: References: Message-ID: > Hi, > > This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is similar to `__builtin_constant_p` in GCC and clang. > > Please kindly give your opinion as well as your reviews, thanks very much. 
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: address reviews ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17527/files - new: https://git.openjdk.org/jdk/pull/17527/files/3ecb2c66..b4445e2e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=04-05 Stats: 36 lines in 4 files changed: 10 ins; 0 del; 26 mod Patch: https://git.openjdk.org/jdk/pull/17527.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17527/head:pull/17527 PR: https://git.openjdk.org/jdk/pull/17527 From qamai at openjdk.org Wed Jan 24 10:40:27 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Wed, 24 Jan 2024 10:40:27 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: References: <81tjQoutCZRej3wZAnPDJIq31hz7D7tbiJLWyWpXXv0=.5786bc11-7aa2-4290-a1d5-37c82452ed41@github.com> Message-ID: On Wed, 24 Jan 2024 09:03:43 GMT, Aleksey Shipilev wrote: >> That sounds like a better name for the class, although I think `jdk.internal.misc` is more suitable than `jdk.internal.vm`. Do you have any preference? Thanks. > > +1 to `ConstantSupport`. I think `jdk.internal.vm` is a proper place for it. There is adjacent `jdk.internal.vm.vector.VectorSupport`, and whole `jdk.internal.vm.annotations` package is there too. > > `jdk.internal.misc` sounds like a place for utility classes. `Unsafe` is a historical exception, I think. I see, my main premise is that it is somewhat similar to `Unsafe` which turns out to be an exception :) Thanks a lot for your suggestions, I have updated the PR, also added `@Hidden` back. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1464707689 From alanb at openjdk.org Wed Jan 24 10:49:28 2024 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 24 Jan 2024 10:49:28 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> Message-ID: On Tue, 23 Jan 2024 19:16:49 GMT, Doug Simon wrote: >> This PR changes `jdk.internal.vm.ci` such that it is loaded by the platform class loader instead of the boot class loader. This allows Native Image to load a version of JVMCI different than the version on top of which Native Image is running. This capability is demonstrated and tested by `LoadAlternativeJVMCI.java`. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > use null to denote boot class loader as delegation parent test/hotspot/jtreg/compiler/jvmci/LoadAlternativeJVMCI.java line 50: > 48: e = e + File.separator; > 49: } > 50: cp[i] = new URI("file:" + e).toURL(); This should be `cp[i] = file.toURI().toURL()` as a file path needs encoding to be a URI path component.
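A short sketch of why the encoding matters, using a path that contains spaces. The path and the class and method names are invented for illustration; the file does not need to exist for either conversion to run.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;

public final class FileToUrlSketch {
    // Mirrors the string-concatenation approach: fails to parse when the
    // path contains characters that are illegal in a URI, such as a space.
    static boolean naiveUriParses(String path) {
        try {
            new URI("file:" + path);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // File.toURI() percent-encodes the path before it becomes a URL.
    static String encodedUrl(String path) throws MalformedURLException {
        return new File(path).toURI().toURL().toString();
    }

    public static void main(String[] args) throws Exception {
        String path = "/tmp/dir with space/lib.jar"; // hypothetical path
        System.out.println("naive parse ok: " + naiveUriParses(path));
        System.out.println("encoded: " + encodedUrl(path));
    }
}
```

The exact prefix of the encoded URL depends on the platform and working directory, so the sketch prints it rather than asserting a full value.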
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17520#discussion_r1464719091 From eosterlund at openjdk.org Wed Jan 24 10:50:31 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Wed, 24 Jan 2024 10:50:31 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v24] In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 11:14:41 GMT, Emanuel Peter wrote: >> As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. >> >> I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. >> >> I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks (with implicit no-safepoint-verifiers) at the call-site of those places. >> >> I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. >> >> **Testing** >> Testing: tier1-3 and stress. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > cleanup unnecessary changes This looks very promising to me. Very nice with the asserts checking the appropriate locking is there when using the extra data section. Great job! ------------- Marked as reviewed by eosterlund (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16840#pullrequestreview-1841077500 From aboldtch at openjdk.org Wed Jan 24 11:29:29 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 24 Jan 2024 11:29:29 GMT Subject: RFR: 8324492: Remove Atomic support for OopHandle In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 10:52:06 GMT, Kim Barrett wrote: > Please review this change to the lazy initialization of the MemoryManager > object and the associated MemoryPool objects. > > They previously used an atomic access to the respective OopHandle member > holding the associated Java object as the is-initialized sentinel, testing > whether the handle was empty or had an associated OopStorage entry. When > empty, initialization was performed using a lock to prevent races. > > Now they use a separate atomic is-initialized flag as the sentinel. > > As a result, the support for atomic access to an OopHandle's underlying handle > (via a translator) is no longer needed and is removed. > > While there, I moved the allocation of the associated OopStorage entries out > from under the Management_lock. > > Testing: mach5 tier1 > > A couple of notes for reviewers. > > Once initialized with a Java object recorded in the associated OopHandle, the > OopHandle and the value recorded therein are never changed. > > The old is-initialized check makes use of OopHandle::resolve returning null if > either the handle is empty (has no OopStorage entry yet) or the OopStorage > entry contains null. The latter never happens in this case. Changes look good to me. Just had a couple of questions/comments. src/hotspot/share/services/memoryManager.cpp line 147: > 145: } else { > 146: // Record the object we created via call_special. > 147: _memory_mgr_obj = mgr_handle; Could assert that `_memory_mgr_obj.is_empty()`.
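For readers less familiar with the pattern, here is a rough Java analogue of lazy initialization guarded by a separate is-initialized flag, with the suggested emptiness assertion included. This is only a sketch of the general idea, not the HotSpot code: the class and field names are invented, a plain `Object` stands in for the Java-side manager object, and whether the real change discards a losing allocation the way this sketch does is an assumption.

```java
public final class LazyHolderSketch {
    private final Object lock = new Object();

    // Separate sentinel: readers test this flag, not the value itself.
    // volatile gives the release/acquire ordering the flag needs.
    private volatile boolean initialized;

    // Written at most once, under the lock, before the flag is set.
    private Object value;

    Object get() {
        if (!initialized) {
            // Allocation happens outside the lock; a thread that loses the
            // race below simply drops its candidate.
            Object candidate = new Object();
            synchronized (lock) {
                if (!initialized) {
                    // Analogue of asserting the handle is still empty.
                    if (value != null) {
                        throw new IllegalStateException("value recorded twice");
                    }
                    value = candidate;
                    initialized = true; // publish after the value is in place
                }
            }
        }
        // Safe: the volatile read of the flag orders this read after the write.
        return value;
    }

    public static void main(String[] args) {
        LazyHolderSketch holder = new LazyHolderSketch();
        System.out.println(holder.get() == holder.get());
    }
}
```

Once the flag is set the value is never replaced, which matches the note above that the recorded object does not change after initialization.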
src/hotspot/share/services/memoryManager.hpp line 66: > 64: > 65: public: > 66: virtual ~MemoryManager() = default; // FIXME Was this added because we are deleting these objects through the base class pointer? Or just in case? Regardless what is the `FIXME` comment for? src/hotspot/share/services/memoryPool.cpp line 142: > 140: } else { > 141: // Record the object we created via call_special. > 142: _memory_pool_obj = pool_handle; Could assert that `_memory_pool_obj.is_empty()`. ------------- Marked as reviewed by aboldtch (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17533#pullrequestreview-1841143156 PR Review Comment: https://git.openjdk.org/jdk/pull/17533#discussion_r1464773008 PR Review Comment: https://git.openjdk.org/jdk/pull/17533#discussion_r1464767168 PR Review Comment: https://git.openjdk.org/jdk/pull/17533#discussion_r1464767954 From rkennke at openjdk.org Wed Jan 24 11:34:32 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 24 Jan 2024 11:34:32 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v6] In-Reply-To: <9LZ6Hco14UBF8NhBetiaaNmuwHfX7VZE_8cVgCttcVk=.59938326-95c3-4cbf-9afe-182e6c045e87@github.com> References: <9LZ6Hco14UBF8NhBetiaaNmuwHfX7VZE_8cVgCttcVk=.59938326-95c3-4cbf-9afe-182e6c045e87@github.com> Message-ID: On Tue, 23 Jan 2024 16:30:48 GMT, Axel Boldt-Christmas wrote: >> Implements the aarch64 port of JDK-8319796. >> >> There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. >> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. 
>> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 > - Switch to CAS over LXSX > - Fix missing $ > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 > - 8319801: Recursive lightweight locking: aarch64 implementation > - Cleanup: C2 fast_lock/fast_unlock aarch64 Nice work! I've only got some minor issues. 
src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 230: > 228: Register t2, Register t3) { > 229: assert(LockingMode == LM_LIGHTWEIGHT, "must be"); > 230: // TODO: Current implementation does not use the box, consider removing. If it's not used, then please remove it? Maybe it can help to allocate one less register, which may be useful performance-wise when register pressure is high? src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 339: > 337: Register t2) { > 338: assert(LockingMode == LM_LIGHTWEIGHT, "must be"); > 339: // TODO: Current implementation uses box only as a TEMP, consider renaming. Yeah, please rename the register. src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 68: > 66: #endif > 67: > 68: #include Is that import even used? I can't spot it. ------------- Changes requested by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16608#pullrequestreview-1841154137 PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1464774323 PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1464775022 PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1464776683 From dholmes at openjdk.org Wed Jan 24 11:49:26 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 24 Jan 2024 11:49:26 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: <_FCKvxLQxWZCfs-1Rxjr3qtcwMomMA368p_-6aJeTPQ=.db4d0b6f-205b-4127-b040-740e476e0919@github.com> References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> <_FCKvxLQxWZCfs-1Rxjr3qtcwMomMA368p_-6aJeTPQ=.db4d0b6f-205b-4127-b040-740e476e0919@github.com> Message-ID: <7EhdI22NKmX21ef4jKqIsDLLjiMPUdt1nJvSHbo0i6g=.55e87451-c50f-417b-8b90-c3dff31e1d35@github.com> On Wed, 24 Jan 2024 08:46:08 GMT, Doug Simon wrote: >> test/hotspot/jtreg/compiler/jvmci/LoadAlternativeJVMCI.java 
line 54: >> >>> 52: >>> 53: ClassLoader pcl = ClassLoader.getPlatformClassLoader(); >>> 54: URLClassLoader ucl = new URLClassLoader(cp, null); >> >> I am missing something here, a `URLClassLoader` first delegates to its parent before searching its URLs, so how does this not find the platform loader versions of the JVMCI classes? > > With `new URLClassLoader(cp, null)`, the URL loader delegates directly to the boot loader, by-passing the platform loader. Thanks Doug. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17520#discussion_r1464798827 From dholmes at openjdk.org Wed Jan 24 11:59:29 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 24 Jan 2024 11:59:29 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> Message-ID: On Tue, 23 Jan 2024 19:16:49 GMT, Doug Simon wrote: >> This PR changes `jdk.internal.vm.ci` such that it is loaded by the platform class loader instead of the boot class loader. This allows Native Image to load a version of JVMCI different than the version on top of which Native Image is running. This capability is demonstrated and tested by `LoadAlternativeJVMCI.java`. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > use null to denote boot class loader as delegation parent Marked as reviewed by dholmes (Reviewer). 
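The delegation point raised in this exchange can be checked with a small standalone program. This is only a sketch: `NullParentLoaderDemo` and `canLoad` are hypothetical names that are not part of the PR or of `LoadAlternativeJVMCI.java`, and the choice of `java.sql.Driver` as a class defined by the platform loader is an assumption about the JDK being used. With a `null` parent, a `URLClassLoader` asks the boot loader directly and never consults the platform loader.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo class; not part of the PR or of the jtreg test.
class NullParentLoaderDemo {
    /** Returns true if the loader can resolve the named class, itself or via delegation. */
    static boolean canLoad(ClassLoader loader, String name) {
        try {
            Class.forName(name, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        URL[] noUrls = new URL[0]; // empty search path: only delegation can succeed
        try (URLClassLoader bootOnly = new URLClassLoader(noUrls, null);
             URLClassLoader viaPlatform =
                 new URLClassLoader(noUrls, ClassLoader.getPlatformClassLoader())) {
            // java.lang.String is defined by the boot loader, so a null parent still finds it.
            System.out.println("bootOnly, java.lang.String: "
                + canLoad(bootOnly, "java.lang.String"));
            // Assumption: java.sql.Driver is defined by the platform loader in this JDK,
            // so only the loader whose parent is the platform loader should see it.
            System.out.println("bootOnly, java.sql.Driver: "
                + canLoad(bootOnly, "java.sql.Driver"));
            System.out.println("viaPlatform, java.sql.Driver: "
                + canLoad(viaPlatform, "java.sql.Driver"));
        }
    }
}
```

Under the PR's change, `jdk.internal.vm.ci` classes would sit on the platform-loader side of this split, which is why a loader with a `null` parent does not see them.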
------------- PR Review: https://git.openjdk.org/jdk/pull/17520#pullrequestreview-1841205744 From dholmes at openjdk.org Wed Jan 24 11:59:31 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 24 Jan 2024 11:59:31 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> Message-ID: On Wed, 24 Jan 2024 08:56:10 GMT, Doug Simon wrote: > As far as I understand, even a non-delegating classloader cannot redefine a class loaded by the boot loader. I modified the test to show this and get: > > ``` > java.lang.LinkageError: loader LoadAlternativeJVMCI$1 @4a1f4d08 attempted duplicate class definition for jdk.vm.ci.meta.ResolvedJavaType. (jdk.vm.ci.meta.ResolvedJavaType is in unnamed module of loader LoadAlternativeJVMCI$1 @4a1f4d08, parent loader 'bootstrap') > at java.base/java.lang.ClassLoader.defineClass1(Native Method) Interesting. I'm not sure why that should be happening in this case. I can imagine a potential split-package issue with the bootloader that doesn't happen with the platform loader. I will look into it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17520#issuecomment-1907983591 From rkennke at openjdk.org Wed Jan 24 12:12:29 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 24 Jan 2024 12:12:29 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v6] In-Reply-To: <9LZ6Hco14UBF8NhBetiaaNmuwHfX7VZE_8cVgCttcVk=.59938326-95c3-4cbf-9afe-182e6c045e87@github.com> References: <9LZ6Hco14UBF8NhBetiaaNmuwHfX7VZE_8cVgCttcVk=.59938326-95c3-4cbf-9afe-182e6c045e87@github.com> Message-ID: On Tue, 23 Jan 2024 16:30:48 GMT, Axel Boldt-Christmas wrote: >> Implements the aarch64 port of JDK-8319796. >> >> There are two major parts for the port implementation. 
The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. >> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. >> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 > - Switch to CAS over LXSX > - Fix missing $ > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 > - 8319801: Recursive lightweight locking: aarch64 implementation > - Cleanup: C2 fast_lock/fast_unlock aarch64 One more comment. src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp line 81: > 79: tstw(hdr, JVM_ACC_IS_VALUE_BASED_CLASS); > 80: br(Assembler::NE, slow_case); > 81: } else if (LockingMode == LM_LIGHTWEIGHT) { What is the advantage of moving the load of the header around? The way you did it, it is less obvious that for LW the header is loaded in the block before locking is actually done. ------------- PR Review: https://git.openjdk.org/jdk/pull/16608#pullrequestreview-1841219625 PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1464817938 From duke at openjdk.org Wed Jan 24 12:19:27 2024 From: duke at openjdk.org (xxDark) Date: Wed, 24 Jan 2024 12:19:27 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> Message-ID: On Wed, 24 Jan 2024 08:56:10 GMT, Doug Simon wrote: > > I'm still puzzled by the need to do this as any non-delegating classloader would have allowed this even if JVMCI were loaded by the bootloader. > > As far as I understand, even a non-delegating classloader cannot redefine a class loaded by the boot loader. 
I modified the test to show this and get: > > ``` > java.lang.LinkageError: loader LoadAlternativeJVMCI$1 @4a1f4d08 attempted duplicate class definition for jdk.vm.ci.meta.ResolvedJavaType. (jdk.vm.ci.meta.ResolvedJavaType is in unnamed module of loader LoadAlternativeJVMCI$1 @4a1f4d08, parent loader 'bootstrap') > at java.base/java.lang.ClassLoader.defineClass1(Native Method) > at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1023) > at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150) > at java.base/java.net.URLClassLoader.defineClass(URLClassLoader.java:524) > at java.base/java.net.URLClassLoader$1.run(URLClassLoader.java:427) > at java.base/java.net.URLClassLoader$1.run(URLClassLoader.java:421) > at java.base/java.security.AccessController.doPrivileged(AccessController.java:714) > at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:420) > at LoadAlternativeJVMCI$1.loadClass(LoadAlternativeJVMCI.java:61) > at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525) > at LoadAlternativeJVMCI.main(LoadAlternativeJVMCI.java:77) > at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) > at java.base/java.lang.reflect.Method.invoke(Method.java:580) > at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138) > at java.base/java.lang.Thread.run(Thread.java:1575) > ``` > > Test modification: > > ``` > diff --git a/test/hotspot/jtreg/compiler/jvmci/LoadAlternativeJVMCI.java b/test/hotspot/jtreg/compiler/jvmci/LoadAlternativeJVMCI.java > index dd63867e7c2..28a6fedca38 100644 > --- a/test/hotspot/jtreg/compiler/jvmci/LoadAlternativeJVMCI.java > +++ b/test/hotspot/jtreg/compiler/jvmci/LoadAlternativeJVMCI.java > @@ -51,7 +51,14 @@ public static void main(String[] args) throws Exception { > } > > ClassLoader pcl = ClassLoader.getPlatformClassLoader(); > - URLClassLoader ucl = new URLClassLoader(cp, null); > + URLClassLoader ucl = new 
URLClassLoader(cp, null) { > + protected Class loadClass(String name, boolean resolve) throws ClassNotFoundException { > + if (!name.startsWith("jdk.vm.ci")) { > + return super.loadClass(name, resolve); > + } > + return findClass(name); > + } > + }; > > String[] names = { > "jdk.vm.ci.meta.ResolvedJavaType", > ``` It can. You need to check if class is already loaded by trying `findLoadedClass` first. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17520#issuecomment-1908013421 From rkennke at openjdk.org Wed Jan 24 12:35:33 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 24 Jan 2024 12:35:33 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> Message-ID: On Tue, 23 Jan 2024 16:24:48 GMT, Axel Boldt-Christmas wrote: >> Implements the x86 port of JDK-8319796. >> >> There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. >> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. >> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. 
This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The x86 C2 port also has some extra oddities. >> >> The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. >> >> The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. >> >> The contended unlock was also moved to the code stub. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 18 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Add more expressive stub continuation names > - Remove outdated anonymous owner fix in stub > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Remove C2HandleAnonOMOwnerStub definitions on x86. > - Add MFENCE comment > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - ... and 8 more: https://git.openjdk.org/jdk/compare/196e4bcc...bc214b8d Great work! I've got a few questions/suggestions. 
src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp line 63: > 61: testl(hdr, JVM_ACC_IS_VALUE_BASED_CLASS); > 62: jcc(Assembler::notZero, slow_case); > 63: } else if (LockingMode == LM_LIGHTWEIGHT) { Same question as in the aarch64 version: why is it useful to move the header-load here? src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp line 144: > 142: lightweight_unlock(obj, disp_hdr, r15_thread, hdr, slow_case); > 143: #else > 144: // This relies on the implementation of lightweight_unlock knowing that it I wonder if it would be less brittle (fewer dependencies) if we didn't pass the thread as a register into lightweight_unlock() and instead kept the thread-loading and register-shuffling in that method? Same (perhaps) for lightweight_lock(). src/hotspot/cpu/x86/interp_masm_x86.cpp line 1314: > 1312: lightweight_unlock(obj_reg, swap_reg, r15_thread, header_reg, slow_case); > 1313: #else > 1314: // This relies on the implementation of lightweight_unlock knowing that it Same comment as above: consider moving get_thread() into lightweight_lock(). src/hotspot/cpu/x86/x86_64.ad line 12438: > 12436: format %{ "fastlock $object,$box\t! kills $box,$tmp,$scr" %} > 12437: ins_encode %{ > 12438: __ fast_lock_lightweight($object$$Register, $box$$Register, $tmp$$Register, $scr$$Register, r15_thread); It is slightly confusing that the names here don't match the naming in fast_lightweight_lock/unlock. You might want to fix that. ------------- Changes requested by rkennke (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16607#pullrequestreview-1841234557 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1464827095 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1464835513 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1464847518 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1464846481 From coleenp at openjdk.org Wed Jan 24 13:11:30 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 24 Jan 2024 13:11:30 GMT Subject: RFR: 8324492: Remove Atomic support for OopHandle In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 10:52:06 GMT, Kim Barrett wrote: > Please review this change to the lazy initialization of the MemoryManager > object and the associated MemoryPool objects. > > They previously used an atomic access to the respective OopHandle member > holding the associated Java object as the is-initialized sentinel, testing > whether the handle was empty or had an associated OopStorage entry. When > empty, initialization was performed using a lock to prevent races. > > Now they use a separate atomic is-initialized flag as the sentinel. > > As a result, the support for atomic access to an OopHandle's underlying handle > (via a translator) is no longer needed and is removed. > > While there, I moved the allocation of the associated OopStorage entries out > from under the Management_lock. > > Testing: mach5 tier1 > > A couple of notes for reviewers. > > Once initialized with a Java object recorded in the associated OopHandle, the > OopHandle and the value recorded therein are never changed. > > The old is-initialized check makes use of OopHandle::resolve returning null if > either the handle is empty (has no OopStorage entry yet) or the OopStorage > entry contains null. The latter never happens in this case. This looks good to me and good to remove atomic OopHandle ops. ------------- Marked as reviewed by coleenp (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/17533#pullrequestreview-1841336596 From rkennke at openjdk.org Wed Jan 24 13:13:32 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 24 Jan 2024 13:13:32 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v13] In-Reply-To: <-4blCtX3cNv-LQxfj3uisZ1CFD83mVNadvFYiX8UFik=.08d36172-cd61-4fc3-bebe-fa345b50d78a@github.com> References: <-4blCtX3cNv-LQxfj3uisZ1CFD83mVNadvFYiX8UFik=.08d36172-cd61-4fc3-bebe-fa345b50d78a@github.com> Message-ID: On Tue, 23 Jan 2024 16:14:53 GMT, Axel Boldt-Christmas wrote: >> Implements the runtime part of JDK-8319796. >> The different CPU implementations are/will be created as dependent pull requests. >> >> This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enters. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two topmost entries on the lock stack to determine what to do in a monitor exit.
>> >> A high level overview:
>> * Locking is still performed on the mark word
>>   * Unlocked (0b01) <=> Locked (0b00)
>> * Monitor enter on Obj with mark word Unlocked (0b01) is the same
>>   * Transition Obj's mark word Unlocked (0b01) => Locked (0b00)
>>   * Push Obj onto the lock stack
>>   * Success
>> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack
>>   * If top entry is Obj
>>     * Push Obj on the lock stack
>>     * Success
>>   * If top entry is not Obj
>>     * Inflate and call ObjectMonitor::enter
>> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack
>>   * If just the top entry is Obj
>>     * Transition Obj's mark word Locked (0b00) => Unlocked (0b01)
>>     * Pop the entry
>>     * Success
>>   * If both entries are Obj
>>     * Pop the top entry
>>     * Success
>>   * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit
>> * If the monitor has been inflated for object Obj which is owned by the current thread
>>   * All corresponding entries for Obj are removed from the lock stack
>>   * The monitor's recursion count is set to the number of removed entries - 1
>>   * The owner is changed from anonymous to the thread
>>   * The regular ObjectMonitor::action is called.
> > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Fix miss in is_recursive improvement Great stuff - looks good to me! Thank you! /Roman ------------- Marked as reviewed by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16606#pullrequestreview-1841341578 From duke at openjdk.org Wed Jan 24 13:36:54 2024 From: duke at openjdk.org (kuaiwei) Date: Wed, 24 Jan 2024 13:36:54 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier [v3] In-Reply-To: References: Message-ID: > Details are at https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071921.html.
> Using a combined dmb.ish for the release barrier introduces a heavy StoreLoad barrier. Using a "dmb.ishst+dmb.ishld" pair instead gives a performance improvement on the N1 and N2 architectures. The benchmark is test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java > Run with ParallelGC to minimize the impact of GC barriers. > > make test TEST="micro:org.openjdk.bench.vm.compiler.FinalFieldInitialize" MICRO="VM_OPTIONS=-XX:+UseParallelGC" > ... > FinalFieldInitialize.testAllocWithFinal thrpt 9 1411.601 ± 6.546 ops/s > > Without the patch > > FinalFieldInitialize.testAllocWithFinal thrpt 9 1214.575 ± 14.217 ops/s kuaiwei has updated the pull request incrementally with one additional commit since the last revision: Add AlwaysMergeDMB option ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17511/files - new: https://git.openjdk.org/jdk/pull/17511/files/e4081bc9..056c1859 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17511&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17511&range=01-02 Stats: 7 lines in 3 files changed: 5 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17511.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17511/head:pull/17511 PR: https://git.openjdk.org/jdk/pull/17511 From dnsimon at openjdk.org Wed Jan 24 13:50:28 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 24 Jan 2024 13:50:28 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> Message-ID: <5jXTeDsMH-p2MOufqzCMIdfk2QIjwbQoXOwrblDDJVQ=.80af7dfe-d3ae-4dd9-8cda-f5e55ed6474d@github.com> On Wed, 24 Jan 2024 12:16:44 GMT, xxDark wrote: > You need to check if class is already loaded by trying findLoadedClass first. You're right. I had forgotten the intricacies of class loader delegation.
The only hard constraint on loading a class in multiple loaders is that `java.*` classes [must (only) be loaded by the boot loader](https://github.com/openjdk/jdk/blob/bccd823c8e40863bed70ff5b24772843203871a5/src/java.base/share/classes/java/lang/ClassLoader.java#L904). ------------- PR Comment: https://git.openjdk.org/jdk/pull/17520#issuecomment-1908157130 From ngasson at openjdk.org Wed Jan 24 13:51:32 2024 From: ngasson at openjdk.org (Nick Gasson) Date: Wed, 24 Jan 2024 13:51:32 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier [v3] In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 13:36:54 GMT, kuaiwei wrote: >> Details are at https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071921.html. >> Using a combined dmb.ish for the release barrier introduces a heavy StoreLoad barrier. Using a "dmb.ishst+dmb.ishld" pair instead gives a performance improvement on the N1 and N2 architectures. The benchmark is test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java >> Run with ParallelGC to minimize the impact of GC barriers. >> >> make test TEST="micro:org.openjdk.bench.vm.compiler.FinalFieldInitialize" MICRO="VM_OPTIONS=-XX:+UseParallelGC" >> ... >> FinalFieldInitialize.testAllocWithFinal thrpt 9 1411.601 ± 6.546 ops/s >> >> Without the patch >> >> FinalFieldInitialize.testAllocWithFinal thrpt 9 1214.575 ± 14.217 ops/s > > kuaiwei has updated the pull request incrementally with one additional commit since the last revision: > > Add AlwaysMergeDMB option src/hotspot/cpu/aarch64/globals_aarch64.hpp line 130: > 128: product(ccstr, UseBranchProtection, "none", \ > 129: "Branch Protection to use: none, standard, pac-ret") \ > 130: product(bool, AlwaysMergeDMB, false, \ It should really be a diagnostic option if we're going to have it configurable at all, as a product option needs a CSR and it's not something end users would want to fiddle with.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17511#discussion_r1464941090 From coleenp at openjdk.org Wed Jan 24 14:05:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 24 Jan 2024 14:05:34 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v24] In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 11:14:41 GMT, Emanuel Peter wrote: >> As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. >> >> I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. >> >> I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks (with implicit no-safepoint-verifiers) at the call-site of those places. >> >> I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. >> >> **Testing** >> Testing: tier1-3 and stress. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > cleanup unnecessary changes This looks better. I have a couple of small requests. Thanks. src/hotspot/share/oops/methodData.hpp line 2316: > 2314: uint arg_modified(int a) { // Lock and avoid breaking lock with Safepoint > 2315: MutexLocker ml(extra_data_lock(), Mutex::_no_safepoint_check_flag); > 2316: ArgInfoData *aid = arg_info(); Can you move these to methodData.inline.hpp so that you don't have to include mutexLocker.hpp in a header file? 
src/hotspot/share/oops/methodData.hpp line 2553: > 2551: "JavaThread must have NoSafepointVerifier inside lock scope"); > 2552: #endif > 2553: } Since this is not performance critical, can you move this to the .cpp file? ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16840#pullrequestreview-1841447172 PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1464957311 PR Review Comment: https://git.openjdk.org/jdk/pull/16840#discussion_r1464958698 From aboldtch at openjdk.org Wed Jan 24 14:14:58 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 24 Jan 2024 14:14:58 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v7] In-Reply-To: References: Message-ID: <2iqNAJPsvhtiUUmru0Qu-c3EsiOAA25qzJ4LKI5K1AY=.5a73859d-d738-4a7a-af61-9ac9ecd70352@github.com> > Implements the aarch64 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). 
Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word. Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: - Rename box to t1 - Remove third tmp from fast_lock ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16608/files - new: https://git.openjdk.org/jdk/pull/16608/files/8a7ebd0f..9b00569f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=05-06 Stats: 18 lines in 3 files changed: 0 ins; 2 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/16608.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16608/head:pull/16608 PR: https://git.openjdk.org/jdk/pull/16608 From aboldtch at openjdk.org Wed Jan 24 14:15:00 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 24 Jan 2024 14:15:00 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v6] In-Reply-To: References: <9LZ6Hco14UBF8NhBetiaaNmuwHfX7VZE_8cVgCttcVk=.59938326-95c3-4cbf-9afe-182e6c045e87@github.com> Message-ID: On Wed, 24 Jan 2024 11:27:25 GMT, Roman Kennke wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Switch to CAS over LXSX >> - Fix missing $ >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - 8319801: Recursive lightweight locking: aarch64 implementation >> - Cleanup: C2 fast_lock/fast_unlock aarch64 > > src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 230: > >> 228: Register t2, Register t3) { >> 229: assert(LockingMode == LM_LIGHTWEIGHT, "must be"); >> 230: // TODO: Current implementation does not use the box, consider removing. > > If it's not used, then please remove it? Maybe it can help to allocate one less register, which may be useful performance-wise when register pressure is high? Done. > src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 339: > >> 337: Register t2) { >> 338: assert(LockingMode == LM_LIGHTWEIGHT, "must be"); >> 339: // TODO: Current implementation uses box only as a TEMP, consider renaming. > > Yeah, please rename the register. Done. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1464972254 PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1464972382 From aboldtch at openjdk.org Wed Jan 24 14:19:31 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 24 Jan 2024 14:19:31 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v6] In-Reply-To: References: <9LZ6Hco14UBF8NhBetiaaNmuwHfX7VZE_8cVgCttcVk=.59938326-95c3-4cbf-9afe-182e6c045e87@github.com> Message-ID: On Wed, 24 Jan 2024 11:29:41 GMT, Roman Kennke wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Switch to CAS over LXSX >> - Fix missing $ >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - 8319801: Recursive lightweight locking: aarch64 implementation >> - Cleanup: C2 fast_lock/fast_unlock aarch64 > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 68: > >> 66: #endif >> 67: >> 68: #include > > Is that import even used? I can't spot it. This was simply to correct the include order of the already included includes. `#include ` appeared above the precompiled header include. The rule is to have system includes below the ordinary includes with one new empty line in-between. Might very well be that this include is not used. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1464978772 From aboldtch at openjdk.org Wed Jan 24 14:33:32 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 24 Jan 2024 14:33:32 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v6] In-Reply-To: References: <9LZ6Hco14UBF8NhBetiaaNmuwHfX7VZE_8cVgCttcVk=.59938326-95c3-4cbf-9afe-182e6c045e87@github.com> Message-ID: On Wed, 24 Jan 2024 12:05:04 GMT, Roman Kennke wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Switch to CAS over LXSX >> - Fix missing $ >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 >> - 8319801: Recursive lightweight locking: aarch64 implementation >> - Cleanup: C2 fast_lock/fast_unlock aarch64 > > src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp line 81: > >> 79: tstw(hdr, JVM_ACC_IS_VALUE_BASED_CLASS); >> 80: br(Assembler::NE, slow_case); >> 81: } else if (LockingMode == LM_LIGHTWEIGHT) { > > What is the advantage of moving the load of the header around? The way you did it, it is less obvious that for LW the header is loaded in the block before locking is actually done. The current implementation requires the first emitted instruction of `C1_MacroAssembler::lock_object` to trap if the obj is null. We do it here with a load. 
I could instrument `MacroAssembler::lightweight_lock` with a `bool preload_mark` so that C1 does not do the redundant load, while still emitting its first trapping instruction, and not create redundant loads for the interpreter and native wrappers. I think that we want to move all platforms to what PPC is doing, namely that C2, C1 and the native wrapper share a common lock/unlock implementation (as they lock + unlock under the same balanced assumptions), while having `MacroAssembler::lightweight_lock` solely dedicated to the interpreter. I think I will update `MacroAssembler::lightweight_lock` to have this `preload_mark` bool. You are not the first to ask about this specific line, and it might make it clearer, together with a comment about the C1 implicit null check. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1464999437 From mdoerr at openjdk.org Wed Jan 24 14:23:27 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 24 Jan 2024 14:23:27 GMT Subject: RFR: 8315762: Update subtype check profile collection on s390x following 8308869 In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 12:06:13 GMT, Amit Kumar wrote: > s390x Implementation for https://github.com/openjdk/jdk/pull/14375 > > Benchmark Result with patch: > > Benchmark (typePollution) (typePollutionNotInternalType) Mode Cnt Score Error Units > RequireNonNullCheckcastScalability.isDuplicated1 false false thrpt 20 1155.409 ± 43.844 ops/us > RequireNonNullCheckcastScalability.isDuplicated1 false true thrpt 20 726.923 ± 54.536 ops/us > RequireNonNullCheckcastScalability.isDuplicated1 true false thrpt 20 676.462 ± 23.503 ops/us > RequireNonNullCheckcastScalability.isDuplicated1 true true thrpt 20 118.650 ± 2.653 ops/us > > > Without Patch: > > Benchmark (typePollution) (typePollutionNotInternalType) Mode Cnt Score Error Units > RequireNonNullCheckcastScalability.isDuplicated1 false false thrpt 20 1101.248 ±
103.559 ops/us > RequireNonNullCheckcastScalability.isDuplicated1 false true thrpt 20 109.690 ± 3.312 ops/us > RequireNonNullCheckcastScalability.isDuplicated1 true false thrpt 20 110.790 ± 7.927 ops/us > RequireNonNullCheckcastScalability.isDuplicated1 true true thrpt 20 112.244 ± 6.889 ops/us > > > Testing : Fastdebug build + tier1 tests Looks correct AFAICS. Please make sure it is thoroughly tested before integrating. Especially the C1 code is sensitive to register clashes which may cause errors which are very hard to debug. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17461#pullrequestreview-1841493258 From aboldtch at openjdk.org Wed Jan 24 14:43:51 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 24 Jan 2024 14:43:51 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v8] In-Reply-To: References: Message-ID: > Implements the aarch64 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous).
Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word. Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: Add preload_mark to MacroAssembler::lightweight_lock ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16608/files - new: https://git.openjdk.org/jdk/pull/16608/files/9b00569f..8950f503 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=06-07 Stats: 22 lines in 5 files changed: 13 ins; 3 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/16608.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16608/head:pull/16608 PR: https://git.openjdk.org/jdk/pull/16608 From rkennke at openjdk.org Wed Jan 24 14:57:28 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 24 Jan 2024 14:57:28 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v6] In-Reply-To: References: <9LZ6Hco14UBF8NhBetiaaNmuwHfX7VZE_8cVgCttcVk=.59938326-95c3-4cbf-9afe-182e6c045e87@github.com> Message-ID: On Wed, 24 Jan 2024 14:30:46 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp line 81: >> >>> 79: tstw(hdr, JVM_ACC_IS_VALUE_BASED_CLASS); >>> 80: br(Assembler::NE, slow_case); >>> 81: } else if (LockingMode == LM_LIGHTWEIGHT) { >> >> What is the advantage of moving the load of the header around? The way you did it, it is less obvious that for LW the header is loaded in the block before locking is actually done. 
> > The current implementation requires the first emitted instruction of `C1_MacroAssembler::lock_object` to trap if the obj is null. We do it here with a load. > > I could instrument `MacroAssembler::lightweight_lock` with a `bool preload_mark` so that C1 does not do the redundant load, while still emitting its first trapping instruction, and not create redundant loads for the interpreter and native wrappers. > > I think that we want to move all platforms to what PPC is doing, mainly that C2, C1 and native wrapper shares a common lock/unlock implementation (as they lock + unlock under the same balanced assumptions), while have `MacroAssembler::lightweight_lock` solely dedicated to the interpreter. > > I think I will update `MacroAssembler::lightweight_lock` to have this `preload_mark` bool. You are not the first to ask about this specific line, and it might make it more clear. With a comment about the C1 implicit null check. Ok I understand, thank you! I actually like it more the way you did it before (without the bool preload_mark flag). An extra comment there may be useful. Or maybe consider to load the mark in MacroAssembler as first instruction unconditionally? As far as I can tell mark-word is not used in the lock-stack-full case (very uncommon) and recursive (more common), but it's only really relevant for interpreter and shared-runtime (C2 has its own impl, C1 needs to have the load there anyway), where the extra cycle don't really matter anyway? Would keep the code cleaner. But it means there is an implicit behavioural dependency between the locking impl and the C1 impl, which might be problematic. But let's not add bools and different code-paths, this is even more confusing. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1465036011 From kbarrett at openjdk.org Wed Jan 24 15:00:28 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 24 Jan 2024 15:00:28 GMT Subject: RFR: 8324301: Obsolete MaxGCMinorPauseMillis In-Reply-To: <8_luDLuWf0--KpMbX-OD8RS0lnaiYYHbbyJ9FSCZJ2g=.ac949af5-4cac-4e8b-a44d-7a9eaa9803fe@github.com> References: <8_luDLuWf0--KpMbX-OD8RS0lnaiYYHbbyJ9FSCZJ2g=.ac949af5-4cac-4e8b-a44d-7a9eaa9803fe@github.com> Message-ID: On Mon, 22 Jan 2024 11:33:14 GMT, Albert Mingkun Yang wrote: > Simply obsoleting a deprecated JVM flag. Looks good. This will probably also warrant a release note. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17517#pullrequestreview-1841590632 PR Comment: https://git.openjdk.org/jdk/pull/17517#issuecomment-1908302317 From jiefu at openjdk.org Wed Jan 24 15:27:42 2024 From: jiefu at openjdk.org (Jie Fu) Date: Wed, 24 Jan 2024 15:27:42 GMT Subject: RFR: 8323515: Create test alias "all" for all test roots [v3] In-Reply-To: References: <9g7evWB6t3A8WAugPwgIP1gyisNBd1pGT9yFoC_0Z8M=.95b0574e-d163-4911-9c79-b58bf7301f7a@github.com> Message-ID: On Tue, 23 Jan 2024 17:03:13 GMT, Aleksey Shipilev wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Catch-all -> All tests > > Thank you all! Hi @shipilev, please see https://github.com/openjdk/jdk/pull/17558 . Thanks.
------------- PR Comment: https://git.openjdk.org/jdk/pull/17422#issuecomment-1908355476 From alanb at openjdk.org Wed Jan 24 15:37:29 2024 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 24 Jan 2024 15:37:29 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: <5jXTeDsMH-p2MOufqzCMIdfk2QIjwbQoXOwrblDDJVQ=.80af7dfe-d3ae-4dd9-8cda-f5e55ed6474d@github.com> References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> <5jXTeDsMH-p2MOufqzCMIdfk2QIjwbQoXOwrblDDJVQ=.80af7dfe-d3ae-4dd9-8cda-f5e55ed6474d@github.com> Message-ID: <5OWCQ7TabDzC57jr8P_7GBzRHcfX6ZEmXF4LK-R7k3c=.c7c9fbb3-0a7d-4773-8415-64689afeb6a1@github.com> On Wed, 24 Jan 2024 13:47:17 GMT, Doug Simon wrote: > You're right. I had forgotten the intricacies of class loader delegation. The only hard constraint on loading a class in multiple loaders is that `java.*` classes [must (only) be loaded by the boot loader](https://github.com/openjdk/jdk/blob/bccd823c8e40863bed70ff5b24772843203871a5/src/java.base/share/classes/java/lang/ClassLoader.java#L904). Just to add that this restriction was relaxed in Java 9 to allow java.* classes be defined by the platform class loader. The code that is linked to here throws if the class loader is not the platform class loader. There isn't a user accessible ClassLoader object for the boot loader and testing `this == null` doesn't make sense. 
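For readers who want to try the two points above, here is a minimal sketch, not taken from the PR. It assumes a JDK 9 or later runtime; `JavaPackageGuardDemo` and `ProbeLoader` are hypothetical names invented for the illustration. It shows that the boot loader appears as `null` from Java code, and that an ordinary loader should be refused when it tries to define a class in a `java.*` package.

```java
final class JavaPackageGuardDemo {

    /** Hypothetical throwaway loader that only exposes defineClass for this experiment. */
    static final class ProbeLoader extends ClassLoader {
        ProbeLoader() {
            super(null); // a null parent means "delegate to the boot loader"
        }

        Class<?> tryDefine(String binaryName) {
            byte[] junk = new byte[0]; // deliberately not a class file
            return defineClass(binaryName, junk, 0, junk.length);
        }
    }

    /** The boot loader has no user-visible ClassLoader object; it shows up as null. */
    static boolean bootLoaderIsNull() {
        return String.class.getClassLoader() == null;
    }

    /** True if an ordinary loader is refused when it tries to define a java.* class. */
    static boolean rejectsJavaPackage() {
        try {
            new ProbeLoader().tryDefine("java.demo.Probe");
            return false;
        } catch (SecurityException prohibitedPackage) {
            return true; // the package-name check fired before any bytes were parsed
        } catch (ClassFormatError reachedParser) {
            return false; // the name check did not fire first
        }
    }

    public static void main(String[] args) {
        System.out.println("boot loader is null: " + bootLoaderIsNull());
        System.out.println("java.* rejected: " + rejectsJavaPackage());
        System.out.println("platform loader: " + ClassLoader.getPlatformClassLoader().getName());
    }
}
```

The probe passes an empty byte array on purpose: if the name check were not applied first, the attempt would surface as a `ClassFormatError` instead, which the sketch reports as "not rejected".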
------------- PR Comment: https://git.openjdk.org/jdk/pull/17520#issuecomment-1908375220 From coleenp at openjdk.org Wed Jan 24 15:52:32 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 24 Jan 2024 15:52:32 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v6] In-Reply-To: References: <9LZ6Hco14UBF8NhBetiaaNmuwHfX7VZE_8cVgCttcVk=.59938326-95c3-4cbf-9afe-182e6c045e87@github.com> Message-ID: On Wed, 24 Jan 2024 14:54:35 GMT, Roman Kennke wrote: >> The current implementation requires the first emitted instruction of `C1_MacroAssembler::lock_object` to trap if the obj is null. We do it here with a load. >> >> I could instrument `MacroAssembler::lightweight_lock` with a `bool preload_mark` so that C1 does not do the redundant load, while still emitting its first trapping instruction, and not create redundant loads for the interpreter and native wrappers. >> >> I think that we want to move all platforms to what PPC is doing, mainly that C2, C1 and native wrapper shares a common lock/unlock implementation (as they lock + unlock under the same balanced assumptions), while have `MacroAssembler::lightweight_lock` solely dedicated to the interpreter. >> >> I think I will update `MacroAssembler::lightweight_lock` to have this `preload_mark` bool. You are not the first to ask about this specific line, and it might make it more clear. With a comment about the C1 implicit null check. > > Ok I understand, thank you! I actually like it more the way you did it before (without the bool preload_mark flag). An extra comment there may be useful. > Or maybe consider to load the mark in MacroAssembler as first instruction unconditionally? As far as I can tell mark-word is not used in the lock-stack-full case (very uncommon) and recursive (more common), but it's only really relevant for interpreter and shared-runtime (C2 has its own impl, C1 needs to have the load there anyway), where the extra cycle don't really matter anyway? 
Would keep the code cleaner. But it means there is an implicit behavioural dependency between the locking impl and the C1 impl, which might be problematic. > But let's not add bools and different code-paths, this is even more confusing. Me too, I prefer not having the extra bool passed around. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1465135120 From aboldtch at openjdk.org Wed Jan 24 15:59:31 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 24 Jan 2024 15:59:31 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v6] In-Reply-To: References: <9LZ6Hco14UBF8NhBetiaaNmuwHfX7VZE_8cVgCttcVk=.59938326-95c3-4cbf-9afe-182e6c045e87@github.com> Message-ID: On Wed, 24 Jan 2024 15:49:57 GMT, Coleen Phillimore wrote: >> Ok I understand, thank you! I actually like it more the way you did it before (without the bool preload_mark flag). An extra comment there may be useful. >> Or maybe consider to load the mark in MacroAssembler as first instruction unconditionally? As far as I can tell mark-word is not used in the lock-stack-full case (very uncommon) and recursive (more common), but it's only really relevant for interpreter and shared-runtime (C2 has its own impl, C1 needs to have the load there anyway), where the extra cycle don't really matter anyway? Would keep the code cleaner. But it means there is an implicit behavioural dependency between the locking impl and the C1 impl, which might be problematic. >> But let's not add bools and different code-paths, this is even more confusing. > > Me too, I prefer not having the extra bool passed around. I'll add it unconditionally then. That was my initial solution to fix the C1 issue. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1465146712 From aboldtch at openjdk.org Wed Jan 24 16:06:49 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 24 Jan 2024 16:06:49 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v9] In-Reply-To: References: Message-ID: > Implements the aarch64 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word. 
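The ordering described above (lock stack first, mark word only if the recursive case does not apply) can be modelled in a few lines of Java. This is a toy, single-threaded sketch under stated assumptions, not the aarch64 assembly from the PR: `LockStackSketch`, `Obj`, `UNLOCKED` and `LOCKED` are hypothetical names, the mark word is reduced to a two-state integer, and inflation and the runtime slow path are represented only by a `false` return.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class LockStackSketch {
    // Hypothetical two-state stand-in for the mark word's lock bits.
    static final int UNLOCKED = 1;
    static final int LOCKED = 0;

    /** Hypothetical object header holding only the simplified mark. */
    static final class Obj {
        final AtomicInteger mark = new AtomicInteger(UNLOCKED);
    }

    private final Obj[] stack; // per-thread lock stack in the real design
    private int top;           // number of entries currently in use

    LockStackSketch(int capacity) {
        stack = new Obj[capacity];
    }

    /** Returns true if the fast path succeeded, false if the caller must go to the slow path. */
    boolean fastLock(Obj o) {
        if (top == stack.length) {
            return false; // lock stack full
        }
        if (top > 0 && stack[top - 1] == o) {
            stack[top++] = o; // recursive lock: push without touching the mark
            return true;
        }
        if (!o.mark.compareAndSet(UNLOCKED, LOCKED)) {
            return false; // already locked or inflated in the real design
        }
        stack[top++] = o;
        return true;
    }

    /** Returns true if the fast path succeeded, false if the caller must go to the slow path. */
    boolean fastUnlock(Obj o) {
        if (top == 0 || stack[top - 1] != o) {
            return false; // not on top of the lock stack
        }
        stack[--top] = null; // pop
        if (top > 0 && stack[top - 1] == o) {
            return true; // recursive unlock: the mark stays locked
        }
        if (o.mark.compareAndSet(LOCKED, UNLOCKED)) {
            return true;
        }
        stack[top++] = o; // restore the entry before handing over to the slow path
        return false;
    }
}
```

With a capacity of two, locking the same object twice succeeds without a second mark transition, a third attempt falls through to the slow path because the stack is full, and only the final unlock writes the mark back to the unlocked state.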
Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: - Preloads markWord unconditionally - Revert "Add preload_mark to MacroAssembler::lightweight_lock" This reverts commit 8950f503aa5dba0e203613bd9737ea0d50388ca3. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16608/files - new: https://git.openjdk.org/jdk/pull/16608/files/8950f503..7d2584e8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=07-08 Stats: 18 lines in 5 files changed: 0 ins; 10 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/16608.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16608/head:pull/16608 PR: https://git.openjdk.org/jdk/pull/16608 From ihse at openjdk.org Wed Jan 24 16:01:32 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 24 Jan 2024 16:01:32 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> Message-ID: <8Q2xMTEBHQGxSa-bA_ChMBt_cmx9aQ93QjtrPHdYabg=.ef6cfc53-a38b-4834-a5a1-2f9a0f6a0d7c@github.com> On Tue, 23 Jan 2024 19:16:49 GMT, Doug Simon wrote: >> This PR changes `jdk.internal.vm.ci` such that it is loaded by the platform class loader instead of the boot class loader. This allows Native Image to load a version of JVMCI different than the version on top of which Native Image is running. This capability is demonstrated and tested by `LoadAlternativeJVMCI.java`. 
> > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > use null to denote boot class loader as delegation parent Build changes are trivially fine ------------- Marked as reviewed by ihse (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17520#pullrequestreview-1841751451 From dnsimon at openjdk.org Wed Jan 24 16:37:41 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 24 Jan 2024 16:37:41 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> Message-ID: On Tue, 23 Jan 2024 19:16:49 GMT, Doug Simon wrote: >> This PR changes `jdk.internal.vm.ci` such that it is loaded by the platform class loader instead of the boot class loader. This allows Native Image to load a version of JVMCI different than the version on top of which Native Image is running. This capability is demonstrated and tested by `LoadAlternativeJVMCI.java`. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > use null to denote boot class loader as delegation parent I'm closing this PR as the Native Image use case https://bugs.openjdk.org/browse/JDK-8323832 was opened for can be solved with an appropriately crafted custom loader that does not delegate loading of JVMCI classes. Thanks for the reviews anyway, especially @xxDark for highlighting that this change is unnecessary. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17520#issuecomment-1908498530 From dnsimon at openjdk.org Wed Jan 24 16:37:43 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 24 Jan 2024 16:37:43 GMT Subject: Withdrawn: 8323832: Load JVMCI with the platform class loader In-Reply-To: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> Message-ID: On Mon, 22 Jan 2024 17:34:16 GMT, Doug Simon wrote: > This PR changes `jdk.internal.vm.ci` such that it is loaded by the platform class loader instead of the boot class loader. This allows Native Image to load a version of JVMCI different than the version on top of which Native Image is running. This capability is demonstrated and tested by `LoadAlternativeJVMCI.java`. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/17520 From aph at openjdk.org Wed Jan 24 17:08:30 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 24 Jan 2024 17:08:30 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier [v3] In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 13:48:21 GMT, Nick Gasson wrote: >> kuaiwei has updated the pull request incrementally with one additional commit since the last revision: >> >> Add AlwaysMergeDMB option > > src/hotspot/cpu/aarch64/globals_aarch64.hpp line 130: > >> 128: product(ccstr, UseBranchProtection, "none", \ >> 129: "Branch Protection to use: none, standard, pac-ret") \ >> 130: product(bool, AlwaysMergeDMB, false, \ > > It should really be a diagnostic option if we're going to have it configurable at all, as a product option needs a CSR and it's not something end users would want to fiddle with. Exactly so, yes. I think we should simply stop merging DMBs where the end result of doing so is to strengthen the barrier beyond what was requested. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17511#discussion_r1465279207 From eosterlund at openjdk.org Wed Jan 24 17:31:28 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Wed, 24 Jan 2024 17:31:28 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 14:47:49 GMT, Thomas Schatzl wrote: >> ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. >> >> The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. >> >> With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. >> >> I have tested the changes from tier1-7, and run through full aurora performance tests. > > Fwiw, the change makes class unloading regress significantly in a class unloading stress test (unloading 60k classes), seemingly tripling the time it takes for the "Purge Unlinked NMethods" phase (~20ms -> ~60ms). > > This may not be a problem for the concurrent gcs, but can be for the STW ones. > > (Overall max G1 remark pause times went from 160ms to 220ms, regular Remark pauses which do class unloading from 120ms to 160ms).
@tschatzl what program did you run, so I can reproduce? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1908602132 From coleenp at openjdk.org Wed Jan 24 17:39:31 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 24 Jan 2024 17:39:31 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v9] In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 16:06:49 GMT, Axel Boldt-Christmas wrote: >> Implements the aarch64 port of JDK-8319796. >> >> There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. >> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. >> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word. 
> > Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: > > - Preloads markWord unconditionally > - Revert "Add preload_mark to MacroAssembler::lightweight_lock" > > This reverts commit 8950f503aa5dba0e203613bd9737ea0d50388ca3. I have some minor questions and random thoughts, but this looks good. src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 381: > 379: assert(oopDesc::mark_offset_in_bytes() == 0, "required to avoid lea"); > 380: orr(t, mark, markWord::unlocked_value); > 381: // Release to satisfy the JMM. I don't know what this comment is trying to say. in the cmpxchg, is 't' the address of the markWord? src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 434: > 432: sub(recursions, recursions, 1u); > 433: str(recursions, Address(monitor, ObjectMonitor::recursions_offset())); > 434: // Set flag == EQ Thanks for this comment. src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 455: > 453: // The owner may be anonymous and we removed the last obj entry in > 454: // the lock-stack. This loses the information about the owner. > 455: // Write the thread to the owner field so the runtime knows the owner. Is this necessary here also? Previous checks and slow path code in the runtime has already set the owner, if I understand correctly. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16608#pullrequestreview-1841976962 PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1465304035 PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1465306973 PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1465311272 From cslucas at openjdk.org Wed Jan 24 17:46:46 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 24 Jan 2024 17:46:46 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v8] In-Reply-To: References: Message-ID:
> # Description
>
> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit.
>
> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large.
>
> # Help Needed for Testing
>
> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`.
>
> # Tier-1 Testing status
>
> |          | Win     | Mac     | Linux   |
> |----------|---------|---------|---------|
> | ARM64    | ?       | ?       |         |
> | ARM32    | n/a     | n/a     |         |
> | x86      |         |         | ?       |
> | x64      | ?       | ?       | ?       |
> | PPC64    | n/a     | n/a     |         |
> | S390x    | n/a     | n/a     |         |
> | RiscV    | n/a     | n/a     | ?       |

Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits:

 - Catching up with origin/master
 - Catch up with origin/master
 - Merge with origin/master
 - Fix build, copyright dates, m4 files.
 - Fix merge
 - Catch up with master branch. Merge remote-tracking branch 'origin/master' into reuse-macroasm
 - Some inst_mark fixes; Catch up with master.
 - Catch up with changes on master
 - Reuse same C2_MacroAssembler object to emit instructions.
------------- Changes: https://git.openjdk.org/jdk/pull/16484/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=07 Stats: 2445 lines in 61 files changed: 106 ins; 434 del; 1905 mod Patch: https://git.openjdk.org/jdk/pull/16484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16484/head:pull/16484 PR: https://git.openjdk.org/jdk/pull/16484 From tschatzl at openjdk.org Wed Jan 24 18:23:30 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 24 Jan 2024 18:23:30 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: <__1cCuz0pSPmQyeB9FU61YZy8kNfm3rorJpk3ySmpZg=.7e796e62-2fe2-4ef8-997a-b06b032dbae2@github.com> On Fri, 19 Jan 2024 06:25:20 GMT, Erik Österlund wrote: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests.
I think the `ccstress` application attached to https://bugs.openjdk.org/browse/JDK-8315503 should show the issue (not tested - today I've been busy completing a JDK 22 bugfix). The actual application I'm using is specjbb2015 that basically executes that ccstress application as a java agent in parallel to jbb2015 just for extra load. I can certainly give you that. Tomorrow I will also spend some time investigating this change a bit more (also on aarch64 - I think other factors may easily outweigh this difference), but I think that @TheRealMDoerr is correct about the C heap allocator just being very slow. Glibc being slow is a "known" issue, other `delete` calls in class unloading already take a significant chunk of that phase, so there is already some interest (from me) to do something about them. There are ideas how to handle this, one of them being moving this (and other) `delete` calls into some concurrent phase somehow (in addition to making metaspace purging concurrent). That would obviously only help G1 though, so maybe there is some better option. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1908684867 From eosterlund at openjdk.org Wed Jan 24 18:32:28 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Wed, 24 Jan 2024 18:32:28 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 06:25:20 GMT, Erik Österlund wrote: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful.
Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests. Thank you Thomas for having a look. I think that fixing the issue in G1 alone takes us pretty far. If you really care about latency in your non-small app that unloads tens of thousands of classes at a time, then it seems just plain weird to sit there with Serial or Parallel complaining about latencies. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1908697604 From mcimadamore at openjdk.org Wed Jan 24 18:51:29 2024 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Wed, 24 Jan 2024 18:51:29 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v6] In-Reply-To: References: Message-ID: <9SikKzxs8M1JdTLvTB6JTozvpCw2CSziF2koHw0ELAQ=.fb2e7696-6fa6-4a2b-87b8-ec57d4fef05c@github.com> On Wed, 24 Jan 2024 10:33:05 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is similar to `__builtin_constant_p` in GCC and clang. 
>> >> Please kindly give your opinion as well as your reviews, thanks very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > address reviews Naive question: the right way to use this would be almost invariably be like this: if (isCompileConstant(foo) && fooHasCertainStaticProperties(foo)) { // fast-path } // slow path Right? Then the expectation is that during interpreter and C1, `isCompileConstant` always returns false, so we just never take the fast path (but we probably still pay for the branch, right?). And, when we get to C2 and this method is inlined, at this point we know that either `foo` is constant or not. If it is constant we can check other conditions on foo (which presumably is cheap because `foo` is constant) and maybe take the fast-path. In both cases, there's no branch in the generated code because we know "statically" when inlining if `foo` has the right shape or not. Correct? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1908724632 From shade at openjdk.org Wed Jan 24 18:51:31 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 24 Jan 2024 18:51:31 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: <9iDFu8I4w_i1Uso5q7oEi0Le1JvgDNgNyuSZlmKQiuE=.5739d448-fc73-4bcf-bec8-26b3a1b75d21@github.com> References: <_pAlUJwzkoFkCnQW_IQK-zUkUMMjq6KjZoDldS34CyA=.984549da-d6de-4977-a87f-18a33d58824d@github.com> <5AWq0nDx_AQPwnEp1cMisZ6ytn2ieq9FHDwDQp5A4QQ=.5043ac3e-04bf-4fd8-a680-448f392e5cb1@github.com> <9iDFu8I4w_i1Uso5q7oEi0Le1JvgDNgNyuSZlmKQiuE=.5739d448-fc73-4bcf-bec8-26b3a1b75d21@github.com> Message-ID: On Wed, 24 Jan 2024 07:15:12 GMT, Quan Anh Mai wrote: >> This seems really weird to me for Java code. The method doesn't get the original "expression" it only gets the value of that expression after it has been evaluated. 
Is there some kind of weird "magic" happening here? > > @dholmes-ora Indeed it's a compiler magic, albeit not really weird. While the method execution only receives the evaluated value of `expr`, the method compilation has the expression in its original form. As a result, it can determine the result based on this information. It is still weird to talk about expressions at this level. We really check if the value is constant, like the method name suggests now. Yes, this implicitly tests that the expression that produced that value is fully constant-folded. But that's a detail that we do not need to capture here. Let's rename `expr` -> `val`, and tighten up the javadoc for the method to mention we only test the constness of the final value. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1465401456 From shade at openjdk.org Wed Jan 24 18:51:32 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 24 Jan 2024 18:51:32 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v6] In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 10:33:05 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is similar to `__builtin_constant_p` in GCC and clang. >> >> Please kindly give your opinion as well as your reviews, thanks very much. 
> > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > address reviews src/java.base/share/classes/jdk/internal/vm/ConstantSupport.java line 32: > 30: /** > 31: * Just-in-time-compiler-related queries > 32: */ This looks like a stale comment. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1465397036 From mcimadamore at openjdk.org Wed Jan 24 18:54:28 2024 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Wed, 24 Jan 2024 18:54:28 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v6] In-Reply-To: <9SikKzxs8M1JdTLvTB6JTozvpCw2CSziF2koHw0ELAQ=.fb2e7696-6fa6-4a2b-87b8-ec57d4fef05c@github.com> References: <9SikKzxs8M1JdTLvTB6JTozvpCw2CSziF2koHw0ELAQ=.fb2e7696-6fa6-4a2b-87b8-ec57d4fef05c@github.com> Message-ID: On Wed, 24 Jan 2024 18:48:03 GMT, Maurizio Cimadamore wrote: > Naive question: the right way to use this would be almost invariably be like this: > > ``` > if (isCompileConstant(foo) && fooHasCertainStaticProperties(foo)) { > // fast-path > } > // slow path > ``` > > Right? Then the expectation is that during interpreter and C1, `isCompileConstant` always returns false, so we just never take the fast path (but we probably still pay for the branch, right?). And, when we get to C2 and this method is inlined, at this point we know that either `foo` is constant or not. If it is constant we can check other conditions on foo (which presumably is cheap because `foo` is constant) and maybe take the fast-path. In both cases, there's no branch in the generated code because we know "statically" when inlining if `foo` has the right shape or not. Correct? P.S. if this is correct, please consider adding something along those lines in the javadoc of `isCompileConstant`; as it stands it is a bit obscure to understand how this thing might be used, and what are the common pitfalls to avoid when using it. 
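
A minimal sketch of the fast-path/slow-path shape described above, under stated assumptions: `isCompileConstant` here is a hypothetical stand-in that always returns false (which is what the thread expects outside of C2), not the proposed intrinsic itself, and `CompileConstantSketch`, `mod`, `fastMod` and `slowMod` are names invented for illustration. The point it tries to show is that the fast path must compute the same result as the slow path, so the answer does not depend on whether the compiler happened to see a constant.

```java
final class CompileConstantSketch {
    // Hypothetical stand-in for the intrinsic under review. As a plain Java
    // method it can never observe constness, so it always answers false.
    static boolean isCompileConstant(int value) {
        return false;
    }

    // Fast path: valid only when m is a positive power of two.
    static int fastMod(int x, int m) {
        return x & (m - 1);
    }

    // Slow path: generic floor modulus, valid for any non-zero m.
    static int slowMod(int x, int m) {
        return Math.floorMod(x, m);
    }

    // The guarded pattern: constness check first, then the static property.
    static int mod(int x, int m) {
        if (isCompileConstant(m) && m > 0 && Integer.bitCount(m) == 1) {
            return fastMod(x, m); // would fold to a single AND if m were constant
        }
        return slowMod(x, m);
    }
}
```

With the stand-in above, `mod` always takes the slow path; the two paths can still be compared directly, for example `fastMod(-7, 8)` and `slowMod(-7, 8)` agree, which is the property a real caller would need to preserve.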
------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1908729766 From shade at openjdk.org Wed Jan 24 18:58:28 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 24 Jan 2024 18:58:28 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v6] In-Reply-To: References: <9SikKzxs8M1JdTLvTB6JTozvpCw2CSziF2koHw0ELAQ=.fb2e7696-6fa6-4a2b-87b8-ec57d4fef05c@github.com> Message-ID: On Wed, 24 Jan 2024 18:51:27 GMT, Maurizio Cimadamore wrote: > Naive question: the right way to use this would be almost invariably be like this: > > ``` > if (isCompileConstant(foo) && fooHasCertainStaticProperties(foo)) { > // fast-path > } > // slow path > ``` > > Right? Yes, I think so. > Then the expectation is that during interpreter and C1, `isCompileConstant` always returns false, so we just never take the fast path (but we probably still pay for the branch, right?). Yes, I think so. For C1, we would still prune the "dead" path, because C1 is able to know that `if (false)` is never taken. We do pay with the branch and the method call in interpreter. (There are ways to special-case these intrinsics for interpreter too, if we choose to care.) > And, when we get to C2 and this method is inlined, at this point we know that either `foo` is constant or not. If it is constant we can check other conditions on foo (which presumably is cheap because `foo` is constant) and maybe take the fast-path. In both cases, there's no branch in the generated code because we know "statically" when inlining if `foo` has the right shape or not. Correct? Yes. I think the major use would be using `constexpr`-like code on "const" path, so that the entire "const" branch constant-folds completely. In [my experiments](https://github.com/openjdk/jdk/pull/17527#issuecomment-1906379544) with `Integer.toString` that certainly happens. 
But that is not a requirement, and we could probably still reap some benefits from partial constant folds; but at that point we would need to prove that a "partially const" path is better than generic "non-const" path under the same conditions. I agree it would be convenient to put some examples in javadoc. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1908736651 From psandoz at openjdk.org Wed Jan 24 19:40:27 2024 From: psandoz at openjdk.org (Paul Sandoz) Date: Wed, 24 Jan 2024 19:40:27 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: References: <_pAlUJwzkoFkCnQW_IQK-zUkUMMjq6KjZoDldS34CyA=.984549da-d6de-4977-a87f-18a33d58824d@github.com> <5AWq0nDx_AQPwnEp1cMisZ6ytn2ieq9FHDwDQp5A4QQ=.5043ac3e-04bf-4fd8-a680-448f392e5cb1@github.com> <9iDFu8I4w_i1Uso5q7oEi0Le1JvgDNgNyuSZlmKQiuE=.5739d448-fc73-4bcf-bec8-26b3a1b75d21@github.com> Message-ID: <-msFouQp2kpWPf6LTKgbDAeLPUkfET6wVesLbAz-6T4=.54ca377c-2e49-4229-a060-daa34485eead@github.com> On Wed, 24 Jan 2024 18:48:34 GMT, Aleksey Shipilev wrote: >> @dholmes-ora Indeed it's a compiler magic, albeit not really weird. While the method execution only receives the evaluated value of `expr`, the method compilation has the expression in its original form. As a result, it can determine the result based on this information. > > It is still weird to talk about expressions at this level. We really check if the value is constant, like the method name suggests now. Yes, this implicitly tests that the expression that produced that value is fully constant-folded. But that's a detail that we do not need to capture here. Let's rename `expr` -> `val`, and tighten up the javadoc for the method to mention we only test the constness of the final value. I agree. All values are produced by evaluating expressions. 
In this case we want to query whether a value produced by the compiler evaluating its expression is a constant value (inputs to the expression are constants and the expression had no material side-effects). Meaning if the method returns true then we could use that knowledge in subsequent expressions that may also produce constants or some specific behavior. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1465449454 From never at openjdk.org Wed Jan 24 19:57:33 2024 From: never at openjdk.org (Tom Rodriguez) Date: Wed, 24 Jan 2024 19:57:33 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v24] In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 11:14:41 GMT, Emanuel Peter wrote: >> As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. >> >> I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. >> >> I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks (with implicit no-safepoint-verifiers) at the call-site of those places. >> >> I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. >> >> **Testing** >> Testing: tier1-3 and stress. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > cleanup unnecessary changes This looks good to me. ------------- Marked as reviewed by never (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16840#pullrequestreview-1842250657 From lmesnik at openjdk.org Wed Jan 24 21:31:24 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 24 Jan 2024 21:31:24 GMT Subject: RFR: 8323717: Introduce test keyword for tests that need external dependencies In-Reply-To: References: Message-ID: <-quimoVBzziosvz8NKyrp0fop7T9ZRDg4SJo1wif6aw=.77948f6f-0728-4e57-ae98-5c5472f435ee@github.com> On Mon, 15 Jan 2024 10:48:23 GMT, Aleksey Shipilev wrote: > Some jtreg tests require resolvable external dependencies. This resolution is delegated to JIB, which is not used in vanilla OpenJDK testing. It would be convenient to add a keyword that marks tests that require these external dependencies, so that we could exclude those tests from runs. This would allow us to: a) run all tests in hotspot:tier4, which now excludes `applications/` specifically; b) make all tests runs (#17422) cleaner on many environments. > > I provisionally call this flag `external-dep`, but I am open for other suggestions. > > Note that some tests that pull `@Artifact`-s provide special paths that do limited testing anyway. However, there are tests which cannot run without external dependencies at all. These include at least `applications/jcstress` and `applications/scimark` tests. > > Ironically, I cannot run the jcstress test generator because the dependencies are lacking here. I regenerated those test using a self-built jcstress 0.16 bundle. > > Additional testing: > - [x] `make test TEST=applications/` fails > - [x] `JTREG_KEYWORDS=!external-dep make test TEST=applications/` passes, skipping most of the tests Marked as reviewed by lmesnik (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/17421#pullrequestreview-1842419609 From dholmes at openjdk.org Wed Jan 24 21:43:35 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 24 Jan 2024 21:43:35 GMT Subject: RFR: 8323832: Load JVMCI with the platform class loader [v2] In-Reply-To: <5jXTeDsMH-p2MOufqzCMIdfk2QIjwbQoXOwrblDDJVQ=.80af7dfe-d3ae-4dd9-8cda-f5e55ed6474d@github.com> References: <6mck217Pb518xgiP3lwwpFsM7cIG848O0wd32BqWt6s=.22665c42-b3be-4ebf-b48c-033ecdbf50e9@github.com> <3ssi94PwDDJSjpoDwCyVTvPEicBzNYMtTFZQLPPS8X4=.fe74293a-224d-47f8-a9c9-0bb2206a817b@github.com> <5jXTeDsMH-p2MOufqzCMIdfk2QIjwbQoXOwrblDDJVQ=.80af7dfe-d3ae-4dd9-8cda-f5e55ed6474d@github.com> Message-ID: On Wed, 24 Jan 2024 13:47:17 GMT, Doug Simon wrote: > You need to check if class is already loaded by trying findLoadedClass first. Thanks @xxDark . I knew it should work. :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17520#issuecomment-1908961416 From jiangli at openjdk.org Wed Jan 24 22:33:28 2024 From: jiangli at openjdk.org (Jiangli Zhou) Date: Wed, 24 Jan 2024 22:33:28 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 09:29:16 GMT, Andrew Haley wrote: > > > I think you should be able to use ld and objcopy to merge the .o files and hide all of the symbols you don't want to export. > > > > > > We also discussed about `objcopy` in [#14808 (comment)](https://github.com/openjdk/jdk/pull/14808#issuecomment-1631597197) and [#14808 (comment)](https://github.com/openjdk/jdk/pull/14808#issuecomment-1631611220). My main concern was the portability of `objcopy` approach. > > I replied: > > OK, but it is the right thing to do on Linux. If some other operating systems don't provide useful tools, that's on them. 
I haven't checked, but I strongly suspect that LLVM can do it too, so all that remains is Windows, and maybe they can't have static linking (or maybe they have to use something like this PR) until the right tooling is provided. > > If Windows really can't do it, that's no reason to burden systems that can. Namespaces are not a low-cost solution for developers. Thanks, @theRealAph. Yeah, I was mainly concerned about non-Unix-like systems, Windows in particular. It might not work on all potentially supported compilers (`gcc`) on Linux, however. To localize symbols in `libjvm` using `objcopy`, we can first partially link (with `-r`) all HotSpot `.o` files into a single object file, then run `objcopy` on the output object file to localize the affected symbols. The partial linking work (https://github.com/openjdk/jdk/blob/2003610b3b52eed04de6713a2a36151d0d86d7c9/make/common/NativeCompilation.gmk#L1245) has been added already. However, during the https://github.com/openjdk/jdk/pull/14064 work, we ran into issues with partial linking on older `gcc` for linux-aarch64. The details were captured in the https://github.com/openjdk/jdk/pull/14064#issuecomment-1564908324 discussion with @erikj79. Only `clang` currently works well with the partial linking and symbol localizing solution. Maybe we could live with symbol redefinition using #define (conditionally for static linking in OpenJDK, as Coleen suggested earlier) for now, until the tooling can support symbol localizing better. Then localizing symbols using tools like `objcopy` can be the longer-term and cleaner solution, instead of using namespaces. What are your thoughts on that?
------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1909019550 From jiangli at openjdk.org Wed Jan 24 22:35:43 2024 From: jiangli at openjdk.org (Jiangli Zhou) Date: Wed, 24 Jan 2024 22:35:43 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 09:27:54 GMT, Andrew Haley wrote: > > I found a way to hide the unwanted symbols in libjvm.a. This requires `ld --relocatable` and `objcopy --keep-global-symbols=...`. See the prototype here: > > > > * https://github.com/iklam/tools/tree/main/misc/staticlib > > > > So potentially we can do this completely in the makefiles, without adding namespaces to HotSpot. > > Yeah, `objcopy` can be used to localize symbols. One of my colleagues, @cjmoon1, implemented symbol localizing for `libfreetype.a` and `libharfbuzz.a` to address a static linking issue. In some cases, a user might want to link with a different version of the harfbuzz library than the version linked with the JDK code. Then multiple versions of the libraries could be linked together into the executable. That was a solution suggested by C++ experts and it worked. Doing partial linking that produces a single `.o` file simplifies the work of `objcopy`. This is not a very portable solution though.
Additional discussions in https://github.com/openjdk/jdk/pull/17456#issuecomment-1909019550 thread ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1909021515 From qamai at openjdk.org Thu Jan 25 03:13:27 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 25 Jan 2024 03:13:27 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v6] In-Reply-To: References: <9SikKzxs8M1JdTLvTB6JTozvpCw2CSziF2koHw0ELAQ=.fb2e7696-6fa6-4a2b-87b8-ec57d4fef05c@github.com> Message-ID: On Wed, 24 Jan 2024 18:56:15 GMT, Aleksey Shipilev wrote: >>> Naive question: the right way to use this would be almost invariably be like this: >>> >>> ``` >>> if (isCompileConstant(foo) && fooHasCertainStaticProperties(foo)) { >>> // fast-path >>> } >>> // slow path >>> ``` >>> >>> Right? Then the expectation is that during interpreter and C1, `isCompileConstant` always returns false, so we just never take the fast path (but we probably still pay for the branch, right?). And, when we get to C2 and this method is inlined, at this point we know that either `foo` is constant or not. If it is constant we can check other conditions on foo (which presumably is cheap because `foo` is constant) and maybe take the fast-path. In both cases, there's no branch in the generated code because we know "statically" when inlining if `foo` has the right shape or not. Correct? >> >> P.S. if this is correct, please consider adding something along those lines in the javadoc of `isCompileConstant`; as it stands it is a bit obscure to understand how this thing might be used, and what are the common pitfalls to avoid when using it. > >> Naive question: the right way to use this would be almost invariably be like this: >> >> ``` >> if (isCompileConstant(foo) && fooHasCertainStaticProperties(foo)) { >> // fast-path >> } >> // slow path >> ``` >> >> Right? > > Yes, I think so. 
> >> Then the expectation is that during interpreter and C1, `isCompileConstant` always returns false, so we just never take the fast path (but we probably still pay for the branch, right?). > > Yes, I think so. For C1, we would still prune the "dead" path, because C1 is able to know that `if (false)` is never taken. We do pay with the branch and the method call in interpreter. (There are ways to special-case these intrinsics for interpreter too, if we choose to care.) > >> And, when we get to C2 and this method is inlined, at this point we know that either `foo` is constant or not. If it is constant we can check other conditions on foo (which presumably is cheap because `foo` is constant) and maybe take the fast-path. In both cases, there's no branch in the generated code because we know "statically" when inlining if `foo` has the right shape or not. Correct? > > Yes. I think the major use would be using `constexpr`-like code on "const" path, so that the entire code constant-folds completely, _or_ just compiles to branch-less "generic" version. In [my experiments](https://github.com/openjdk/jdk/pull/17527#issuecomment-1906379544) with `Integer.toString` that certainly happens. But that is not a requirement, and we could probably still reap some benefits from partial constant folds; but at that point we would need to prove that a "partially const" path is better than generic "non-const" path under the same conditions. > > I agree it would be convenient to put some examples in javadoc. @merykitty, I can help you with that, if you want. 
@shipilev I can come up with 2 examples that are pretty generic:

    void checkIndex(int index, int length) {
        boolean indexPositive = index >= 0;
        if (ConstantSupport.isCompileConstant(indexPositive) && indexPositive) {
            // index is known to be non-negative, so a single signed compare suffices
            if (index >= length) {
                throw new IndexOutOfBoundsException();
            }
            return;
        }
        if (length < 0 || Integer.compareUnsigned(index, length) >= 0) {
            throw new IndexOutOfBoundsException();
        }
    }

    boolean equals(Point p1, Point p2) {
        boolean idEqual = p1 == p2;
        if (ConstantSupport.isCompileConstant(idEqual) && idEqual) {
            // the references are known to be identical at compile time
            return true;
        }
        return p1.x == p2.x && p1.y == p2.y;
    }

@mcimadamore Yes, I believe your expectations are correct. Pitfalls may vary case-by-case, but I just realised that since we do not have profile information in the fast path, the compiler may be less willing to inline the callees here. While it has not been an issue, a solution I can think of is to have something like `ConstantSupport::evaluate` in which the compiler will try to inline infinitely expecting constant-folding similar to how a `constexpr` variable behaves in C++ (and maybe bail out of compilation if the final result is not a constant, too). ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1909272480 From duke at openjdk.org Thu Jan 25 03:59:33 2024 From: duke at openjdk.org (Liming Liu) Date: Thu, 25 Jan 2024 03:59:33 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21] In-Reply-To: References: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> Message-ID: On Wed, 10 Jan 2024 07:30:52 GMT, Thomas Stuefe wrote: >> Liming Liu has refreshed the contents of this pull request, and previous commits have been removed. Incremental views are not available. > > Maybe a stupid question, but if we are still worried about concurrent use of memory that is in the process of being madvised, could we not just limit this technique to initialization time? > > I would expect most uses of pretouch to go together with -Xmx = -Xms, and to happen before mutators start.
Hi, @tstuefe , @jdksjolen & @kimbarrett . Could you please take a look? ------------- PR Comment: https://git.openjdk.org/jdk/pull/15781#issuecomment-1909302895 From dholmes at openjdk.org Thu Jan 25 05:08:26 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 25 Jan 2024 05:08:26 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: <-msFouQp2kpWPf6LTKgbDAeLPUkfET6wVesLbAz-6T4=.54ca377c-2e49-4229-a060-daa34485eead@github.com> References: <_pAlUJwzkoFkCnQW_IQK-zUkUMMjq6KjZoDldS34CyA=.984549da-d6de-4977-a87f-18a33d58824d@github.com> <5AWq0nDx_AQPwnEp1cMisZ6ytn2ieq9FHDwDQp5A4QQ=.5043ac3e-04bf-4fd8-a680-448f392e5cb1@github.com> <9iDFu8I4w_i1Uso5q7oEi0Le1JvgDNgNyuSZlmKQiuE=.5739d448-fc73-4bcf-bec8-26b3a1b75d21@github.com> <-msFouQp2kpWPf6LTKgbDAeLPUkfET6wVesLbAz-6T4=.54ca377c-2e49-4229-a060-daa34485eead@github.com> Message-ID: On Wed, 24 Jan 2024 19:37:40 GMT, Paul Sandoz wrote: >> It is still weird to talk about expressions at this level. We really check if the value is constant, like the method name suggests now. Yes, this implicitly tests that the expression that produced that value is fully constant-folded. But that's a detail that we do not need to capture here. Let's rename `expr` -> `val`, and tighten up the javadoc for the method to mention we only test the constness of the final value. > > I agree. All values are produced by evaluating expressions. In this case we want to query whether a value produced by the compiler evaluating its expression is a constant value (inputs to the expression are constants and the expression had no material side-effects). Meaning if the method returns true then we could use that knowledge in subsequent expressions that may also produce constants or some specific behavior. 
> the method compilation has the expression in its original form So the JIT analyses the bytecode used to place the result on the call stack, before the call, and from that determines if the expression were a constant? This kind of self-analysis is not something I was aware of. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1465846860 From kbarrett at openjdk.org Thu Jan 25 05:41:43 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 25 Jan 2024 05:41:43 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() [v3] In-Reply-To: References: Message-ID: > Please review this change to use OopHandle::is_empty() rather than comparing > the result of OopHandle::ptr_raw() with nullptr. While equivalent, the former > is the intended API for such checks. ptr_raw should only be used directly > where it is actually needed. > > Testing: mach5 tier1. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge branch 'master' into avoid-ptr-raw-compare - tidy CLD remove_handle - prefer is_empty ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17510/files - new: https://git.openjdk.org/jdk/pull/17510/files/19905134..d02781fb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17510&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17510&range=01-02 Stats: 2190 lines in 125 files changed: 1498 ins; 373 del; 319 mod Patch: https://git.openjdk.org/jdk/pull/17510.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17510/head:pull/17510 PR: https://git.openjdk.org/jdk/pull/17510 From kbarrett at openjdk.org Thu Jan 25 05:46:35 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 25 Jan 2024 05:46:35 GMT Subject: RFR: 8324242: Avoid null check for OopHandle::ptr_raw() [v3] In-Reply-To: References: <32d_CojTj7M_Ud6r-OGsnNWthvEYrh1I6PaOgosTQKc=.a9b90f0d-5129-4ab7-9b81-a65869c3bb05@github.com> Message-ID: On Tue, 23 Jan 2024 19:47:21 GMT, Aleksey Shipilev wrote: > Looks okay. Pity we were not able to eliminate the `ptr_raw` use completely. Some of the remaining would require a lot more preparatory work, if possible at all. Thanks for reviews, @jdksjolen , @shipilev , and @coleenp . ------------- PR Comment: https://git.openjdk.org/jdk/pull/17510#issuecomment-1909394656 From kbarrett at openjdk.org Thu Jan 25 05:46:36 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 25 Jan 2024 05:46:36 GMT Subject: Integrated: 8324242: Avoid null check for OopHandle::ptr_raw() In-Reply-To: References: Message-ID: On Sun, 21 Jan 2024 07:29:45 GMT, Kim Barrett wrote: > Please review this change to use OopHandle::is_empty() rather than comparing > the result of OopHandle::ptr_raw() with nullptr. While equivalent, the former > is the intended API for such checks. ptr_raw should only be used directly > where it is actually needed. > > Testing: mach5 tier1. 
This pull request has now been integrated. Changeset: 3059c3b6 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/3059c3b69ec8fb7cefd740bc2eb52b5ca5390ae1 Stats: 12 lines in 4 files changed: 0 ins; 1 del; 11 mod 8324242: Avoid null check for OopHandle::ptr_raw() Reviewed-by: shade, jsjolen, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/17510 From kbarrett at openjdk.org Thu Jan 25 05:51:37 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 25 Jan 2024 05:51:37 GMT Subject: RFR: 8324492: Remove Atomic support for OopHandle [v2] In-Reply-To: References: Message-ID: > Please review this change to the lazy initialization of the MemoryManager > object and the associated MemoryPool objects. > > They previously used an atomic access to the respective OopHandle member > holding the associated Java object as the is-initialized sentinel, testing > whether the handle was empty or had an associated OopStorage entry. When > empty, initialization was performed using a lock to prevent races. > > Now they use a separate atomic is-initialized flag as the sentinel. > > As a result, the support for atomic access to an OopHandle's underlying handle > (via a translator) is no longer needed and is removed. > > While there, I moved the allocation of the associated OopStorage entries out > from under the Management_lock. > > Testing: mach5 tier1 > > A couple of notes for reviewers. > > Once initialized with a Java object recorded in the associated OopHandle, the > OopHandle and the value recorded therein are never changed. > > The old is-initialized check makes use of OopHandle::resolve returning null if > either the handle is empty (has no OopStorage entry yet) or the OopStorage > entry contains null. The latter never happens in this case.
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: aboldtch review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17533/files - new: https://git.openjdk.org/jdk/pull/17533/files/b2a295ad..e9b455aa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17533&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17533&range=00-01 Stats: 4 lines in 3 files changed: 2 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17533.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17533/head:pull/17533 PR: https://git.openjdk.org/jdk/pull/17533 From kbarrett at openjdk.org Thu Jan 25 05:51:40 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 25 Jan 2024 05:51:40 GMT Subject: RFR: 8324492: Remove Atomic support for OopHandle [v2] In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 11:26:10 GMT, Axel Boldt-Christmas wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> aboldtch review > > src/hotspot/share/services/memoryManager.cpp line 147: > >> 145: } else { >> 146: // Record the object we created via call_special. >> 147: _memory_mgr_obj = mgr_handle; > > Could assert that `_memory_mgr_obj.is_empty()`. Done. > src/hotspot/share/services/memoryManager.hpp line 66: > >> 64: >> 65: public: >> 66: virtual ~MemoryManager() = default; // FIXME > > Was this added because we are deleting these objects through the base class pointer? Or just in case? > > Regardless what is the `FIXME` comment for? Oops, that was a piece of another change that I started but abandoned for now. Removed. > src/hotspot/share/services/memoryPool.cpp line 142: > >> 140: } else { >> 141: // Record the object we created via call_special. >> 142: _memory_pool_obj = pool_handle; > > Could assert that `_memory_pool_obj.is_empty()`. Done. 
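For readers following the thread, the lazy-initialization scheme described in the PR text (a separate is-initialized flag, a lock for the slow path, allocation done outside the lock, and an assert that the slot is still empty before recording) might look roughly like the Java sketch below. This is only an analogy under stated assumptions: `LazyOnce`, its fields, and the use of a `volatile` flag plus `ReentrantLock` are hypothetical names and choices made for illustration; the actual change is C++ in memoryManager.cpp and memoryPool.cpp and may differ in detail.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical illustration only; not HotSpot code. */
final class LazyOnce<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Supplier<T> factory;

    // Separate is-initialized sentinel, instead of testing the value slot itself.
    private volatile boolean initialized;

    // Written at most once, always before the flag is set.
    private T value;

    LazyOnce(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        if (!initialized) {                 // volatile read (acquire)
            // Create the candidate outside the lock; a losing thread discards it.
            T created = factory.get();
            lock.lock();
            try {
                if (!initialized) {
                    // Mirrors the reviewer's suggested "is_empty" assert.
                    assert value == null : "slot must be empty before recording";
                    value = created;
                    initialized = true;     // volatile write (release)
                }
            } finally {
                lock.unlock();
            }
        }
        return value;
    }
}
```

Because the flag is set only after the value is stored, a thread that observes `initialized == true` also observes the stored value, which is why the value slot itself no longer needs atomic access in this sketch.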
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17533#discussion_r1465870877 PR Review Comment: https://git.openjdk.org/jdk/pull/17533#discussion_r1465871728 PR Review Comment: https://git.openjdk.org/jdk/pull/17533#discussion_r1465870974 From mdoerr at openjdk.org Thu Jan 25 06:46:29 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 25 Jan 2024 06:46:29 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: <9GolX3m7SkG4Fs0KTN5qMRxVK47eAhPLdmlaO3oGSKc=.736c3053-abc5-45ce-bd54-85c4f70c3fc9@github.com> On Fri, 19 Jan 2024 06:25:20 GMT, Erik Österlund wrote: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests. On linux, the time for "Purge Unlinked NMethods" goes down when I comment out `delete ic->data();` and ignore the memory leak. (MacOS seems to be ok with it.)
Adding trace code to `purge_ic_callsites` shows that we often have 0 or 2 ICData instances, sometimes up to 30. It would be good to think a bit about the allocation scheme. Some ideas would be:
- Allocate ICData in an array per nmethod instead of individually. That should help to some degree and also improve data locality (and hence cache efficiency). Would also save iterating over the relocations. It's not very complex.
- Instead of freeing ICData instances, we could enqueue them and either reuse or free them during a concurrent phase. This may be a bit complicated. Not sure if it's worth it.
------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1909448311 From roland at openjdk.org Thu Jan 25 07:44:28 2024 From: roland at openjdk.org (Roland Westrelin) Date: Thu, 25 Jan 2024 07:44:28 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v6] In-Reply-To: References: <9SikKzxs8M1JdTLvTB6JTozvpCw2CSziF2koHw0ELAQ=.fb2e7696-6fa6-4a2b-87b8-ec57d4fef05c@github.com> Message-ID: On Wed, 24 Jan 2024 18:56:15 GMT, Aleksey Shipilev wrote:
> > Naive question: the right way to use this would almost invariably be like this:
> > ```
> > if (isCompileConstant(foo) && fooHasCertainStaticProperties(foo)) {
> >     // fast-path
> > }
> > // slow path
> > ```
> > Right?
>
> Yes, I think so.

But then whatever is in the fast path and `fooHasCertainStaticProperties` are never profiled because never executed by the interpreter or c1. So `fooHasCertainStaticProperties` will likely not be inlined and c2 will do a poor (or rather not as good as you'd like) job of compiling whatever is in the fast path.
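To make the profiling concern above concrete, here is a small sketch of the guarded fast-path pattern. It assumes, as the discussion implies, that the query yields `false` whenever the code is not running as C2-compiled code; `ConstantProbe.isCompileConstant` is a hypothetical stand-in that always returns `false`, not the API proposed in the PR, and `fastPathHits` exists only to make the effect observable.

```java
/** Hypothetical stand-in for the proposed intrinsic; always false here. */
final class ConstantProbe {
    static boolean isCompileConstant(boolean value) {
        // Assumption: outside C2-compiled code the answer is always "not constant".
        return false;
    }
}

final class Checks {
    // Counts how often the guarded fast path actually ran.
    static int fastPathHits = 0;

    static void checkIndex(int index, int length) {
        boolean indexNonNegative = index >= 0;
        if (ConstantProbe.isCompileConstant(indexNonNegative) && indexNonNegative) {
            // Fast path: never reached while the probe answers false,
            // so nothing in here is ever profiled.
            fastPathHits++;
            if (index >= length) {
                throw new IndexOutOfBoundsException("index " + index + ", length " + length);
            }
            return;
        }
        // Slow path: the only code the interpreter ever executes and profiles.
        if (length < 0 || Integer.compareUnsigned(index, length) >= 0) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }
}
```

With the stand-in in place, every call goes through the slow path and the counter stays at zero, which is the situation described above: any callee reachable only from the fast path has no profile by the time C2 compiles it.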
------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1909531426 From aboldtch at openjdk.org Thu Jan 25 07:47:29 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 07:47:29 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v9] In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 17:25:20 GMT, Coleen Phillimore wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: >> >> - Preloads markWord unconditionally >> - Revert "Add preload_mark to MacroAssembler::lightweight_lock" >> >> This reverts commit 8950f503aa5dba0e203613bd9737ea0d50388ca3. > > src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 381: > >> 379: assert(oopDesc::mark_offset_in_bytes() == 0, "required to avoid lea"); >> 380: orr(t, mark, markWord::unlocked_value); >> 381: // Release to satisfy the JMM. > > I don't know what this comment is trying to say. > in the cmpxchg, is 't' the address of the markWord? The `Release to satisfy the JMM.`? It refers to the CAS only having release semantics, which is enough to satisfy the Java memory model. `t` contains the new value, which is `t = mark | 0b01`. `mark` is the markWord (value, not address) loaded above by `ldr(mark, Address(obj, oopDesc::mark_offset_in_bytes()));`. `obj` is the address of the object, and because `oopDesc::mark_offset_in_bytes() == 0` it is also the address of the markWord.
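As a rough Java-level analogy of the explanation above (not the aarch64 assembly itself), the unlock attempt can be pictured as a compare-and-exchange with release-only ordering from the observed mark value to `mark | 0b01`. Everything named here is an assumption for illustration: `MarkWordSketch`, the use of an `AtomicLong` as the mark word, and the two constants, which only mirror the lock-bit encoding mentioned in the thread (0b00 fast-locked, 0b01 unlocked).

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of the mark-word transition; not HotSpot code. */
final class MarkWordSketch {
    static final long UNLOCKED_VALUE = 0b01; // mirrors markWord::unlocked_value
    static final long LOCK_MASK = 0b11;      // the two low lock bits

    /** Try to move the lock bits from 0b00 (fast-locked) to 0b01 (unlocked). */
    static boolean tryUnlock(AtomicLong markWord) {
        long mark = markWord.get();          // the value, not an address
        if ((mark & LOCK_MASK) != 0) {
            return false;                    // not fast-locked; a slow path would handle it
        }
        long t = mark | UNLOCKED_VALUE;      // new value: same header, unlocked bit set
        // Release-only CAS: earlier writes in the critical section become
        // visible before the unlocked header does.
        return markWord.compareAndExchangeRelease(mark, t) == mark;
    }
}
```

In this model the `AtomicLong` plays the role of the object header at offset 0, so the object address and the mark-word address coincide, which is the point of the `mark_offset_in_bytes() == 0` assert in the quoted code.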
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1465956986 From aboldtch at openjdk.org Thu Jan 25 08:03:29 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 08:03:29 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v9] In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 17:31:47 GMT, Coleen Phillimore wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: >> >> - Preloads markWord unconditionally >> - Revert "Add preload_mark to MacroAssembler::lightweight_lock" >> >> This reverts commit 8950f503aa5dba0e203613bd9737ea0d50388ca3. > > src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 455: > >> 453: // The owner may be anonymous and we removed the last obj entry in >> 454: // the lock-stack. This loses the information about the owner. >> 455: // Write the thread to the owner field so the runtime knows the owner. > > Is this necessary here also? Previous checks and slow path code in the runtime has already set the owner, if I understand correctly. After popping the last oop off the lock stack we do the `tbnz(mark, exact_log2(markWord::monitor_value), inflated);` check. If this happens the owner will be anonymous. Other solutions would be either: 1. Push the oop back and jump to the runtime. (Would make C2 anonymous owner agnostic). 2. Fix the owner only in this control flow, not in every inflated slow path exit. The first seems alright as well. It is more like what x86 evolved into doing (where it elides this specific check). Both solutions make the inflated unlock cleaner and remove a branch (it can branch directly to the slow path). The second does seem to create a more complex entry to the inflated unlock, so it does not seem worth it.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1465975881 From aboldtch at openjdk.org Thu Jan 25 08:12:32 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 08:12:32 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> Message-ID: On Wed, 24 Jan 2024 12:13:45 GMT, Roman Kennke wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 18 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Add more expressive stub continuation names >> - Remove outdated anonymous owner fix in stub >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Remove C2HandleAnonOMOwnerStub definitions on x86. >> - Add MFENCE comment >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - ... and 8 more: https://git.openjdk.org/jdk/compare/1a78ceb5...bc214b8d > > src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp line 63: > >> 61: testl(hdr, JVM_ACC_IS_VALUE_BASED_CLASS); >> 62: jcc(Assembler::notZero, slow_case); >> 63: } else if (LockingMode == LM_LIGHTWEIGHT) { > > Same question as in the aarch64 version, why is it useful to move the header-load here? I'll do the same unconditional preload change as in aarch64. To keep things simple. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1465984342 From aboldtch at openjdk.org Thu Jan 25 08:25:30 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 08:25:30 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> Message-ID: <-Na7iedqELkwRQd7vjQsAwbFTwn-xehrbOJusmoHyNo=.ff8c7fba-19d2-4a22-95d1-b7f1c8b3b8f1@github.com> On Wed, 24 Jan 2024 12:21:16 GMT, Roman Kennke wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 18 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Add more expressive stub continuation names >> - Remove outdated anonymous owner fix in stub >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Remove C2HandleAnonOMOwnerStub definitions on x86. >> - Add MFENCE comment >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - ... and 8 more: https://git.openjdk.org/jdk/compare/ef08ca7a...bc214b8d > > src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp line 144: > >> 142: lightweight_unlock(obj, disp_hdr, r15_thread, hdr, slow_case); >> 143: #else >> 144: // This relies on the implementation of lightweight_unlock knowing that it > > I wonder if is would be less brittle (fewer dependencies), if we didn't pass thread as register into lightweight_unlock() and keep the thread-loading and register-shuffling in that method? 
Same (perhaps) for lightweight_lock(). The only annoying thing is that the generate native wrapper x86_32 path has a dedicated thread register. Have to either signal this to lightweight_{unlock,lock} or just reload the thread in this path. I will see if I can find a cleaner solution. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1465997990 From stuefe at openjdk.org Thu Jan 25 08:30:41 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 25 Jan 2024 08:30:41 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v29] In-Reply-To: References: Message-ID: <3MHoDMkp-_AHrKb6z9fEMjP1RbiHleGBguNxXKu9_kw=.22fbb7ba-8fc7-4494-b52d-6ae7936a1eda@github.com> On Mon, 22 Jan 2024 06:39:52 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). >> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and found that the patch brings improvements when the system call is supported and does no harm when it is not:
>>
>> Kernel   -XX:-TransparentHugePages   -XX:+TransparentHugePages
>>          Unpatched   Patched         Unpatched   Patched
>> 4.18     11.30       11.30           0.25        0.25
>> 5.13     0.22        0.22            3.42        3.42
>> 6.1      0.27        0.33            3.54        0.33
>>
> > Liming Liu has updated the pull request incrementally with two additional commits since the last revision: > > - Use TestThreadGroup > - Set it as default before parsing I like this version. Some nits remain. Thank you for your patience. src/hotspot/os/linux/globals_linux.hpp line 96: > 94: \ > 95: product(bool, UseMadvPopulateWrite, false, DIAGNOSTIC, \ > 96: "Use MADV_POPULATE_WRITE in os::pd_pretouch_memory.") \ I would make this default true. We need a fallback mechanism if we encounter problems and we want to exclude the madvise as a possible cause. But seeing that the perf gains are real and significant, I would enable it by default. src/hotspot/os/linux/os_linux.cpp line 2972: > 2970: ", %d) failed; error='%s' (errno=%d)", > 2971: p2i(first), len, MADV_POPULATE_WRITE, > 2972: os::strerror(err), err); What other things can go wrong here beside missing kernel support? Unconditional log output (with log_warning) is tricky. Many tools parse the JVM output and are thrown off by unexpected content. That's why we restrict log_warning to the small band of "stuff that can go wrong at a customer but it is so severe we really need to tell the customer right now". Stuff that should never go wrong should be assert()ed, or possibly guarantee()'d. Stuff that can go wrong but is not as severe, should be warned about at a lower level. In this case, output may get flooded with warnings if we continue running the VM and repeat the pretouch attempts with other areas. src/hotspot/os/linux/os_linux.cpp line 4402: > 4400: > 4401: // Check the availability of MADV_POPULATE_WRITE. > 4402: FLAG_SET_DEFAULT(UseMadvPopulateWrite, (::madvise(0, 0, MADV_POPULATE_WRITE) == 0)); Can we delay this to the first attempt? Switch it off if the first attempt returns EINVAL? Every system call saved at startup is good. ------------- Changes requested by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/15781#pullrequestreview-1843071110 PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1465987597 PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1465998092 PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1466001305 From aboldtch at openjdk.org Thu Jan 25 08:34:33 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 08:34:33 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> Message-ID: On Wed, 24 Jan 2024 12:30:54 GMT, Roman Kennke wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 18 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Add more expressive stub continuation names >> - Remove outdated anonymous owner fix in stub >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Remove C2HandleAnonOMOwnerStub definitions on x86. >> - Add MFENCE comment >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - ... and 8 more: https://git.openjdk.org/jdk/compare/f3b43e70...bc214b8d > > src/hotspot/cpu/x86/x86_64.ad line 12438: > >> 12436: format %{ "fastlock $object,$box\t! 
kills $box,$tmp,$scr" %} >> 12437: ins_encode %{ >> 12438: __ fast_lock_lightweight($object$$Register, $box$$Register, $tmp$$Register, $scr$$Register, r15_thread); > > It is slightly confusing that the names here don't match the naming in fast_lightweight_lock/unlock. You might want to fix that. I'll fix that. Went down a rabbit hole trying to figure out adlc and register allocation. I do not know why they specify `rbx` for `box`. Is it because they want to use `USE_KILL` or are they using `USE_KILL` because they specify `rbx` for `box`. It feels like this specification could be improved. The only requirement is that one tmp register is `rax`. But I will leave that to another RFE. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466008860 From aboldtch at openjdk.org Thu Jan 25 08:47:32 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 08:47:32 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> Message-ID: On Tue, 23 Jan 2024 18:59:21 GMT, Coleen Phillimore wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 18 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Add more expressive stub continuation names >> - Remove outdated anonymous owner fix in stub >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Remove C2HandleAnonOMOwnerStub definitions on x86. 
>> - Add MFENCE comment >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - ... and 8 more: https://git.openjdk.org/jdk/compare/7c1570b4...bc214b8d > > src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 77: > >> 75: >> 76: int C2FastUnlockLightweightStub::max_size() const { >> 77: return 128; > > Is this still 128? This is just used to preallocate the buffer when emitting stubs. Unused space gets truncated / used by the next stubs emission. (If I recall correctly the buffer is grown with at least 4KB at a time if offset() + next_stub->max_size() > buffer_end.) I remember it being somewhere around ~100 bytes depending on ASSERT. So 128 seemed like a good enough number to ensure that the stub could always be emitted. But maybe there is value in being more precise so that (assembler) changes which change (grow) the code emission size are captured early. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466024145 From aboldtch at openjdk.org Thu Jan 25 09:16:43 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 09:16:43 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v13] In-Reply-To: References: Message-ID: <3Dl7fDPJjUJbVpsU6F72Tt7iDXToeR7uAUxHeVgiX9o=.32dd65b1-3665-4227-9409-681de07665c0@github.com> > Implements the x86 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. 
> > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The x86 C2 port also has some extra oddities. > > The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. > > The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. > > The contended unlock was also moved to the code stub. 
Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: - Update variable names in ad files - Preload markWord unconditionally ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16607/files - new: https://git.openjdk.org/jdk/pull/16607/files/bc214b8d..6de1d69b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=11-12 Stats: 26 lines in 4 files changed: 4 ins; 4 del; 18 mod Patch: https://git.openjdk.org/jdk/pull/16607.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16607/head:pull/16607 PR: https://git.openjdk.org/jdk/pull/16607 From shade at openjdk.org Thu Jan 25 09:30:28 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 Jan 2024 09:30:28 GMT Subject: RFR: 8323717: Introduce test keyword for tests that need external dependencies In-Reply-To: <-quimoVBzziosvz8NKyrp0fop7T9ZRDg4SJo1wif6aw=.77948f6f-0728-4e57-ae98-5c5472f435ee@github.com> References: <-quimoVBzziosvz8NKyrp0fop7T9ZRDg4SJo1wif6aw=.77948f6f-0728-4e57-ae98-5c5472f435ee@github.com> Message-ID: On Wed, 24 Jan 2024 21:28:29 GMT, Leonid Mesnik wrote: >> Some jtreg tests require resolvable external dependencies. This resolution is delegated to JIB, which is not used in vanilla OpenJDK testing. It would be convenient to add a keyword that marks tests that require these external dependencies, so that we could exclude those tests from runs. This would allow us to: a) run all tests in hotspot:tier4, which now excludes `applications/` specifically; b) make all tests runs (#17422) cleaner on many environments. >> >> I provisionally call this flag `external-dep`, but I am open for other suggestions. >> >> Note that some tests that pull `@Artifact`-s provide special paths that do limited testing anyway. However, there are tests which cannot run without external dependencies at all. 
These include at least `applications/jcstress` and `applications/scimark` tests. >> >> Ironically, I cannot run the jcstress test generator because the dependencies are lacking here. I regenerated those test using a self-built jcstress 0.16 bundle. >> >> Additional testing: >> - [x] `make test TEST=applications/` fails >> - [x] `JTREG_KEYWORDS=!external-dep make test TEST=applications/` passes, skipping most of the tests > > Marked as reviewed by lmesnik (Reviewer). @lmesnik, you good with the keyword name? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17421#issuecomment-1909740587 From epeter at openjdk.org Thu Jan 25 09:51:59 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 25 Jan 2024 09:51:59 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v25] In-Reply-To: References: Message-ID: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks (with implicit no-safepoint-verifiers) at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Testing** > Testing: tier1-3 and stress. 
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: moving code for Coleen ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/ff581b05..96af505c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=24 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=23-24 Stats: 62 lines in 5 files changed: 34 ins; 24 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From tschatzl at openjdk.org Thu Jan 25 10:40:43 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 25 Jan 2024 10:40:43 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 06:25:20 GMT, Erik Österlund wrote: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler.
> > I have tested the changes from tier1-7, and run through full aurora performance tests. Something I found while looking a bit at the code. src/hotspot/share/code/nmethod.cpp line 1231: > 1229: assert(cb != nullptr, "destination not in CodeBlob?"); > 1230: nmethod* nm = cb->as_nmethod_or_null(); > 1231: if( nm != nullptr ) { Maybe fix this while in the area similar to other places. src/hotspot/share/code/nmethod.cpp line 1470: > 1468: > 1469: purge_ic_callsites(); > 1470: (Github does not allow me to attach this comment to the correct place): At the start of this method, there is some comment about // Already unlinked. It can be invoked twice because concurrent code cache // unloading might need to restart when inline cache cleaning fails due to // running out of ICStubs, which can only be refilled at safepoints This comment and the whole mechanism to prevent this may be outdated since there are no ICStubs and the associated safepoints any more; maybe it is worth keeping the flag to provide an assert though? I did not check the code flow yet, just going from the comment. ------------- Changes requested by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17495#pullrequestreview-1843339869 PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1466151049 PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1466150329 From tschatzl at openjdk.org Thu Jan 25 10:40:45 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 25 Jan 2024 10:40:45 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: <-2iWeTY5A58iDzkQT_p7pEQk4T0uUPbI6ykAA5AEnWs=.beac9674-7d48-4c59-9dbe-a74ed44e0322@github.com> On Thu, 25 Jan 2024 10:20:47 GMT, Thomas Schatzl wrote: >> ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. 
Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. >> >> The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. >> >> With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. >> >> I have tested the changes from tier1-7, and run through full aurora performance tests. > > src/hotspot/share/code/nmethod.cpp line 1470: > >> 1468: >> 1469: purge_ic_callsites(); >> 1470: > > (Github does not allow me to attach this comment to the correct place): > At the start of this method, there is some comment about > > // Already unlinked. It can be invoked twice because concurrent code cache > // unloading might need to restart when inline cache cleaning fails due to > // running out of ICStubs, which can only be refilled at safepoints > > This comment and the whole mechanism to prevent this may be outdated since there are no ICStubs and the associated safepoints any more; maybe it is worth keeping the flag to provide an assert though? > I did not check the code flow yet, just going from the comment. I think the flag is still required, just the comment needs to be fixed then. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1466171782 From aboldtch at openjdk.org Thu Jan 25 10:44:39 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 10:44:39 GMT Subject: RFR: 8324492: Remove Atomic support for OopHandle [v2] In-Reply-To: References: Message-ID: On Thu, 25 Jan 2024 05:51:37 GMT, Kim Barrett wrote: >> Please review this change to the lazy initialization of the MemoryManager >> object and the associated MemoryPool objects. >> >> They previously used an atomic access to the respective OopHandle member >> holding the associated Java object as the is-initialized sentinel, testing >> whether the handle was empty or had an associated OopStorage entry. When >> empty, initialization was performed using a lock to prevent races. >> >> Now they use a separate atomic is-initialized flag as the sentinel. >> >> As a result, the support for atomic access to an OopHandle's underlying handle >> (via a translator) is no longer needed and is removed. >> >> While there, I moved the allocation of the associated OopStorage entries out >> from under the Management_lock. >> >> Testing: mach5 tier1 >> >> A couple of notes for reviewers. >> >> Once initialized with a Java object recorded in the associated OopHandle, the >> OopHandle and the value recorded therein are never changed. >> >> The old is-initialized check makes use of OopHandle::resolve returning null if >> either the handle is empty (has no OopStorage entry yet) or the OopStorage >> entry contains null. The latter never happens in this case. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > aboldtch review Marked as reviewed by aboldtch (Reviewer).
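The lazy-initialization scheme described in the quoted PR text (check a separate atomic is-initialized flag first, take a lock only on the slow path) might look roughly like the sketch below. This is a standalone illustration in standard C++, not the HotSpot code: `LazyHolder`, `get`, `init_count`, and the `int` payload are hypothetical stand-ins for the MemoryManager object, its OopHandle, and the lock used during initialization.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for an object whose payload is created lazily.
class LazyHolder {
  std::atomic<bool> _initialized{false};  // separate is-initialized sentinel
  std::mutex _lock;                       // stand-in for the initialization lock
  int _value = 0;                         // stand-in for the handle's payload
  int _init_count = 0;                    // counts slow-path initializations

 public:
  int get() {
    // Fast path: the acquire load pairs with the release store below, so a
    // reader that observes 'true' also observes the fully written payload.
    if (!_initialized.load(std::memory_order_acquire)) {
      std::lock_guard<std::mutex> guard(_lock);
      // Re-check under the lock: another thread may have won the race.
      if (!_initialized.load(std::memory_order_relaxed)) {
        _value = 42;
        ++_init_count;
        _initialized.store(true, std::memory_order_release);
      }
    }
    return _value;
  }

  // Only meaningful when no other thread is calling get() concurrently.
  int init_count() const { return _init_count; }
};
```

Because the payload is written once and never changed afterwards, readers on the fast path need nothing beyond the single acquire load of the flag.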
------------- PR Review: https://git.openjdk.org/jdk/pull/17533#pullrequestreview-1843384287 From jsjolen at openjdk.org Thu Jan 25 10:58:45 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 25 Jan 2024 10:58:45 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v29] In-Reply-To: <3MHoDMkp-_AHrKb6z9fEMjP1RbiHleGBguNxXKu9_kw=.22fbb7ba-8fc7-4494-b52d-6ae7936a1eda@github.com> References: <3MHoDMkp-_AHrKb6z9fEMjP1RbiHleGBguNxXKu9_kw=.22fbb7ba-8fc7-4494-b52d-6ae7936a1eda@github.com> Message-ID: On Thu, 25 Jan 2024 08:25:20 GMT, Thomas Stuefe wrote: >> Liming Liu has updated the pull request incrementally with two additional commits since the last revision: >> >> - Use TestThreadGroup >> - Set it as default before parsing > > src/hotspot/os/linux/os_linux.cpp line 4402: > >> 4400: >> 4401: // Check the availability of MADV_POPULATE_WRITE. >> 4402: FLAG_SET_DEFAULT(UseMadvPopulateWrite, (::madvise(0, 0, MADV_POPULATE_WRITE) == 0)); > > Can we delay this to the first attempt? Switch it off if the first attempt returns EINVAL? Every system call saved at startup is good. Is that possible? Won't that ~clobber~ replace the user-supplied parameter? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1466193861 From sroy at openjdk.org Thu Jan 25 11:06:42 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Thu, 25 Jan 2024 11:06:42 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v9] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Wed, 24 Jan 2024 07:30:27 GMT, Thomas Stuefe wrote: > For me the unresolved question is still: > > * do we want an unconditional load of *.a for a given *.so (have yet to see any documentation for this a-file duality) Yes.
The documentation link - https://www.ibm.com/docs/en/aix/7.3?topic=memory-shared-objects-run-time-linking The text **In dynamic mode, input files specified with the -l flag may end in .so, as well as in .a. That is, a reference to -lfoo is satisfied by the first libfoo.so or libfoo.a found in any of the directories being searched. Dynamic mode is in effect by default unless the -bstatic option is used.** https://www.ibm.com/docs/en/aix/7.3?topic=l-ld-command Archive files are composite objects, which usually contain import files and object files, including shared objects. If an archive file contains another archive file or a member whose type is not recognized, the ld command issues a warning and ignores the unrecognized member. If an object file contained in an archive file has the F_LOADONLY bit set in the XCOFF header, the ld command ignores the member. This bit is usually used to designate old versions of shared objects that remain in the archive file to allow existing applications to load and run. New applications link with the new version of the shared object, that is, another member of the archive. > * if we do, do we want that to be bidirectional? Someone specifies *.a, do we want to attempt to load *.so? > Considering the different scenarios, loading .a after .so failure should suffice. I got a chance to look at the right file in OpenJ9-omr ,which has a native code which does an attempt to load archive files after trying to load .so files. This code was always there and it explains why the issue did not occur in Semeru, which is derived from this repository. > When in doubt, we should just mimic what OpenJ9 is doing on AIX. But I would like a clear documentation as a comment in os_aix.cpp explaining the logic and referencing the relevant OpenJ9 files. > Any example comment you can refer ? I mean i just mention the file name in OpenJ9 and explain the logic ? 
Let me know for any further clarifications ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1909927553 From aph at openjdk.org Thu Jan 25 11:07:39 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 25 Jan 2024 11:07:39 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v9] In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 16:06:49 GMT, Axel Boldt-Christmas wrote: >> Implements the aarch64 port of JDK-8319796. >> >> There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. >> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. >> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word. 
> > Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: > > - Preloads markWord unconditionally > - Revert "Add preload_mark to MacroAssembler::lightweight_lock" > > This reverts commit 8950f503aa5dba0e203613bd9737ea0d50388ca3. src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 277: > 275: orr(mark, mark, markWord::unlocked_value); > 276: eor(t, mark, markWord::unlocked_value); > 277: // Acquire to satisfy the JMM. This comment is borderline unnecessary, IMO. Acquiring a lock implies an acquire barrier, releasing a lock implies a release barrier. Roach motel semantics imply that no more barriers are required or helpful. Anyone who is programming at the level needed to understand this code should already know the basic facts. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1466205070 From stuefe at openjdk.org Thu Jan 25 11:11:44 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 25 Jan 2024 11:11:44 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v29] In-Reply-To: References: <3MHoDMkp-_AHrKb6z9fEMjP1RbiHleGBguNxXKu9_kw=.22fbb7ba-8fc7-4494-b52d-6ae7936a1eda@github.com> Message-ID: On Thu, 25 Jan 2024 10:55:24 GMT, Johan Sjölen wrote: >> src/hotspot/os/linux/os_linux.cpp line 4402: >> >>> 4400: >>> 4401: // Check the availability of MADV_POPULATE_WRITE. >>> 4402: FLAG_SET_DEFAULT(UseMadvPopulateWrite, (::madvise(0, 0, MADV_POPULATE_WRITE) == 0)); >> >> Can we delay this to the first attempt? Switch it off if the first attempt returns EINVAL? Every system call saved at startup is good. > > Is that possible? Won't that ~clobber~ replace the user-supplied parameter? Yes, you are right. We should only try madvise if the switch is enabled. And we should do it here, not on demand as I suggested earlier, because changing flags after init is unexpected (e.g. PrintFlagsFinal).
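The conclusion of this exchange (probe the kernel once during startup, and only when the switch is enabled) could be sketched as follows. This is a hedged illustration rather than the actual os_linux.cpp change: `resolve_populate_write` is a hypothetical helper name, and the fallback definition of `MADV_POPULATE_WRITE` assumes the value 23 from the Linux 5.14+ uapi headers for toolchains whose headers predate it.

```cpp
#include <cassert>
#include <sys/mman.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23  // assumed value from Linux 5.14+ uapi headers
#endif

// Hypothetical helper: returns the effective setting of a
// UseMadvPopulateWrite-like switch after probing the kernel once at startup.
bool resolve_populate_write(bool user_enabled) {
  if (!user_enabled) {
    return false;  // switch is off: skip the system call entirely
  }
  // A zero-length madvise touches no memory; on Linux it fails with EINVAL
  // when the kernel does not know the advice value.
  return ::madvise(nullptr, 0, MADV_POPULATE_WRITE) == 0;
}
```

Resolving the value once at startup keeps the reported flag stable afterwards, which is the concern raised about changing flags after initialization.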
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1466209932 From rkennke at openjdk.org Thu Jan 25 11:24:38 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 25 Jan 2024 11:24:38 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> Message-ID: On Thu, 25 Jan 2024 08:32:05 GMT, Axel Boldt-Christmas wrote: > I'll fix that. Went down a rabbit hole trying to figure out adlc and register allocation. I do not know why they specify `rbx` for `box`. Is it because they want to use `USE_KILL` or are they using `USE_KILL` because they specify `rbx` for `box`. It feels like this specification could be improved. The only requirement is that one tmp register is `rax`. But I will leave that to another RFE. I don't know this, either. This might well be historical. I seem to remember that a much earlier version of that code shared code with interpreter version and used fixed registers there, but I am not sure. If it's not needed for the new code then don't do it. I don't think it has anything to do with USE_KILL, though. USE_KILL only means that it's using the input value of the register *and* overwrites (kills) it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466224835 From aboldtch at openjdk.org Thu Jan 25 11:50:48 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 11:50:48 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v10] In-Reply-To: References: Message-ID: <0QSuwvim92k8hKdnbAY4AjaJC4DQGK5hRnTYJqSLGWM=.db600a00-bad6-41a7-b2ee-ee6754364434@github.com> > Implements the aarch64 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. 
> > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word. 
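The "check the lock stack first" step in the description above can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions: `LockStackSketch`, its fixed capacity, and the use of plain pointers for objects are hypothetical, and the mark word transition and the runtime slow path are omitted entirely.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified per-thread lock stack; objects are plain pointers.
class LockStackSketch {
  static constexpr std::size_t kCapacity = 8;
  const void* _entries[kCapacity] = {};
  std::size_t _top = 0;  // number of entries in use

 public:
  bool is_full() const { return _top == kCapacity; }
  std::size_t depth() const { return _top; }

  // First (non-recursive) acquisition; a real fast path would also have to
  // transition the mark word, which is left out here.
  bool push(const void* obj) {
    if (is_full()) return false;
    _entries[_top++] = obj;
    return true;
  }

  // Recursive enter: succeeds only if obj is already on top of the stack.
  bool try_recursive_enter(const void* obj) {
    if (_top == 0 || is_full() || _entries[_top - 1] != obj) return false;
    _entries[_top++] = obj;
    return true;
  }

  // Recursive exit: succeeds only if the two topmost entries are both obj,
  // so popping one still leaves the object locked by this thread.
  bool try_recursive_exit(const void* obj) {
    if (_top < 2 || _entries[_top - 1] != obj || _entries[_top - 2] != obj) {
      return false;
    }
    --_top;
    return true;
  }
};
```

When either recursive check fails, the caller would fall through to the mark word path, matching the order described in the PR text.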
Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: Drop memory order comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16608/files - new: https://git.openjdk.org/jdk/pull/16608/files/7d2584e8..e4d5dcd7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=08-09 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16608.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16608/head:pull/16608 PR: https://git.openjdk.org/jdk/pull/16608 From aboldtch at openjdk.org Thu Jan 25 11:50:49 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 11:50:49 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v9] In-Reply-To: References: Message-ID: On Thu, 25 Jan 2024 11:04:35 GMT, Andrew Haley wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: >> >> - Preloads markWord unconditionally >> - Revert "Add preload_mark to MacroAssembler::lightweight_lock" >> >> This reverts commit 8950f503aa5dba0e203613bd9737ea0d50388ca3. > > src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 277: > >> 275: orr(mark, mark, markWord::unlocked_value); >> 276: eor(t, mark, markWord::unlocked_value); >> 277: // Acquire to satisfy the JMM. > > This comment is borderline unnecessary, IMO. Acquiring a lock implies an acquire barrier, releasing a lock implies a release barrier. Roach motel semantics imply that no more barriers are required or helpful. Anyone who is programming at the level needed to understand this code should already know the basic facts. Fair. The comment was more to signal `Only Acquire` and `Only Release` as it differs from how it used to be implemented, where both enter and exit had explicit acquire+release.
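The "only acquire on lock, only release on unlock" point can be shown with a minimal test-and-set lock in standard C++. This is a generic illustration of the memory-ordering idea, not the aarch64 assembly under review; `SpinLockSketch` is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>

// Minimal test-and-set spin lock, for illustration only.
class SpinLockSketch {
  std::atomic<bool> _locked{false};

 public:
  bool try_lock() {
    // Acquire only: accesses after the lock cannot move above its acquisition.
    return !_locked.exchange(true, std::memory_order_acquire);
  }

  void lock() {
    while (!try_lock()) {
      // spin until the current holder releases the lock
    }
  }

  void unlock() {
    // Release only: accesses before the unlock cannot move below it.
    _locked.store(false, std::memory_order_release);
  }
};
```

Accesses outside the critical section may still move into it, but nothing inside may move out, which is the "roach motel" behaviour mentioned in the review comment.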
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1466251522 From eosterlund at openjdk.org Thu Jan 25 12:27:50 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 25 Jan 2024 12:27:50 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v2] In-Reply-To: References: Message-ID: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests.
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Remove inaccurate comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17495/files - new: https://git.openjdk.org/jdk/pull/17495/files/cc98cce9..82134e63 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17495&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17495&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17495.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17495/head:pull/17495 PR: https://git.openjdk.org/jdk/pull/17495 From eosterlund at openjdk.org Thu Jan 25 12:27:52 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 25 Jan 2024 12:27:52 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v2] In-Reply-To: <-2iWeTY5A58iDzkQT_p7pEQk4T0uUPbI6ykAA5AEnWs=.beac9674-7d48-4c59-9dbe-a74ed44e0322@github.com> References: <-2iWeTY5A58iDzkQT_p7pEQk4T0uUPbI6ykAA5AEnWs=.beac9674-7d48-4c59-9dbe-a74ed44e0322@github.com> Message-ID: On Thu, 25 Jan 2024 10:38:10 GMT, Thomas Schatzl wrote: >> src/hotspot/share/code/nmethod.cpp line 1470: >> >>> 1468: >>> 1469: purge_ic_callsites(); >>> 1470: >> >> (Github does not allow me to attach this comment to the correct place): >> At the start of this method, there is some comment about >> >> // Already unlinked. It can be invoked twice because concurrent code cache >> // unloading might need to restart when inline cache cleaning fails due to >> // running out of ICStubs, which can only be refilled at safepoints >> >> This comment and the whole mechanism to prevent this may be outdated since there are no ICStubs and the associated safepoints any more; maybe it is worth keeping the flag to provide an assert though? >> I did not check the code flow yet, just going from the comment. > > I think the flag is still required, just the comment needs to be fixed then. Good point!
I actually think we might not need it any more. I added the _is_unlinked field because of ICStubs causing inline cache cleaning from a concurrent GC to fail because going to the cleaned state requires an ICStub and we are out of memory for it, requiring the GC to request a safepoint and then restart the code cache unloading. If there is still a separate reason why we might call unlink twice on the same nmethod, I don't know why that is. Maybe something new in G1? Having said that, I don't mind keeping the guard to make it more robust. What do you think? Either way, I'll remove the inaccurate comment. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1466294600 From eosterlund at openjdk.org Thu Jan 25 12:36:38 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 25 Jan 2024 12:36:38 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v2] In-Reply-To: References: Message-ID: On Thu, 25 Jan 2024 10:21:16 GMT, Thomas Schatzl wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove inaccurate comment > > src/hotspot/share/code/nmethod.cpp line 1231: > >> 1229: assert(cb != nullptr, "destination not in CodeBlob?"); >> 1230: nmethod* nm = cb->as_nmethod_or_null(); >> 1231: if( nm != nullptr ) { > > Maybe fix this while in the area similar to other places. Sorry, I'm not sure I understand what you would like me to change here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1466307016 From dchuyko at openjdk.org Thu Jan 25 12:38:48 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 25 Jan 2024 12:38:48 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v24] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2).
The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it taking the new directives stack into account. Re-compilation helps to prevent hot methods from execution in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization.
There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 42 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Deopt osr, cleanups - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - ... 
and 32 more: https://git.openjdk.org/jdk/compare/7a798d3c...0c8f11bc ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=23 Stats: 381 lines in 15 files changed: 348 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From tschatzl at openjdk.org Thu Jan 25 12:44:39 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 25 Jan 2024 12:44:39 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v2] In-Reply-To: References: Message-ID: On Thu, 25 Jan 2024 12:33:17 GMT, Erik Österlund wrote: >> src/hotspot/share/code/nmethod.cpp line 1231: >> >>> 1229: assert(cb != nullptr, "destination not in CodeBlob?"); >>> 1230: nmethod* nm = cb->as_nmethod_or_null(); >>> 1231: if( nm != nullptr ) { >> >> Maybe fix this while in the area similar to other places. > > Sorry, I'm not sure I understand what you would like me to change here. Suggestion: if (nm != nullptr ) { Sorry. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1466315808 From mcimadamore at openjdk.org Thu Jan 25 12:55:36 2024 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Thu, 25 Jan 2024 12:55:36 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v6] In-Reply-To: References: <9SikKzxs8M1JdTLvTB6JTozvpCw2CSziF2koHw0ELAQ=.fb2e7696-6fa6-4a2b-87b8-ec57d4fef05c@github.com> Message-ID: On Thu, 25 Jan 2024 07:41:27 GMT, Roland Westrelin wrote: > > > Naive question: the right way to use this would almost invariably be like this: > > > ``` > > > if (isCompileConstant(foo) && fooHasCertainStaticProperties(foo)) { > > > // fast-path > > > } > > > // slow path > > > ``` > > > > > > Right? > > > > > > Yes, I think so.
> > But then whatever is in the fast path and `fooHasCertainStaticProperties` are never profiled because never executed by the interpreter or c1. So `fooHasCertainStaticProperties` will likely not be inlined and c2 will do a poor (or rather not as good as you'd like) job of compiling whatever is in the fast path. I suppose perhaps it is implied that `fooHasCertainStaticProperties` should have `@ForceInline` ? But yes, there seems to be several assumptions in how this logic is supposed to be used, and at the moment, it seems to me more of a footgun than something actually useful (but I admit my ignorance on the subject). ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1910140980 From aph at openjdk.org Thu Jan 25 13:03:38 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 25 Jan 2024 13:03:38 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v9] In-Reply-To: References: Message-ID: <2CCn2joXWLL48BkJVx3tJgDrbqc7gAF5N41O6GFqZTI=.173c71d1-9720-49f0-8e6b-bbf5e0e7db59@github.com> On Thu, 25 Jan 2024 11:45:58 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 277: >> >>> 275: orr(mark, mark, markWord::unlocked_value); >>> 276: eor(t, mark, markWord::unlocked_value); >>> 277: // Acquire to satisfy the JMM. >> >> This comment is borderline unnecessary, IMO. Acquiring a lock implies an acquire barrier, releasing a lock implies a release barrier. Roach motel semantics imply that no more barriers are required or helpful. Anyone who is programming at the level needed to understand this code should already know the basic facts. > > Fair. The comment was more to signal `Only Acquire` and `Only Release` as it differs from how it used to be implement where both enter and exit had explicit acquire+release. LOL! I think I originally wrote the old implementation. Mind you, it was a long time ago, and I know better now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1466342009 From rkennke at openjdk.org Thu Jan 25 13:16:38 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 25 Jan 2024 13:16:38 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v10] In-Reply-To: <0QSuwvim92k8hKdnbAY4AjaJC4DQGK5hRnTYJqSLGWM=.db600a00-bad6-41a7-b2ee-ee6754364434@github.com> References: <0QSuwvim92k8hKdnbAY4AjaJC4DQGK5hRnTYJqSLGWM=.db600a00-bad6-41a7-b2ee-ee6754364434@github.com> Message-ID: On Thu, 25 Jan 2024 11:50:48 GMT, Axel Boldt-Christmas wrote: >> Implements the aarch64 port of JDK-8319796. >> >> There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. >> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. >> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. 
Both lock and unlock use a load/store exclusive register pair to transition the mark word.
>
> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision:
>
> Drop memory order comments

I think it's good now. I only have a comment in aarch64.ad, please decide for yourself if you want to do anything about it. ;-)

src/hotspot/cpu/aarch64/aarch64.ad line 16477:

> 16475: predicate(LockingMode == LM_LIGHTWEIGHT);
> 16476: match(Set cr (FastLock object box));
> 16477: effect(TEMP tmp, TEMP tmp2);

I believe you should declare box as TEMP here, because that is how it is used. It probably works by accident because box is only used by us and has no meaning outside of the locking code. Also, I think there is no need to match the box register to the 2nd FastLock input. If you change that, you might need to add a match_edge() in FastLockNode and FastUnlockNode in locknode.hpp that excludes the box argument when LW locking is on. I believe the purpose of the box node/register is to track the stack-location of the stack-lock in stack-locking, and ensure that unlock is getting the same location there as the corresponding lock.
Thinking about this... this should be done as a follow-up or else it becomes too intrusive for this change.

-------------

Marked as reviewed by rkennke (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16608#pullrequestreview-1843709973
PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1466356507

From rkennke at openjdk.org Thu Jan 25 13:21:41 2024
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 25 Jan 2024 13:21:41 GMT
Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v13]
In-Reply-To: <-Na7iedqELkwRQd7vjQsAwbFTwn-xehrbOJusmoHyNo=.ff8c7fba-19d2-4a22-95d1-b7f1c8b3b8f1@github.com>
References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> <-Na7iedqELkwRQd7vjQsAwbFTwn-xehrbOJusmoHyNo=.ff8c7fba-19d2-4a22-95d1-b7f1c8b3b8f1@github.com>
Message-ID:

On Thu, 25 Jan 2024 08:22:21 GMT, Axel Boldt-Christmas wrote:

>> src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp line 144:
>>
>>> 142: lightweight_unlock(obj, disp_hdr, r15_thread, hdr, slow_case);
>>> 143: #else
>>> 144: // This relies on the implementation of lightweight_unlock knowing that it
>>
>> I wonder if it would be less brittle (fewer dependencies), if we didn't pass thread as register into lightweight_unlock() and kept the thread-loading and register-shuffling in that method? Same (perhaps) for lightweight_lock().
>
> The only annoying thing is that the generate native wrapper x86_32 path has a dedicated thread register. Have to either signal this to lightweight_{unlock,lock} or just reload the thread in this path.
>
> I will see if I can find a cleaner solution.

Uh I see. This whole loading of the thread in x86_32 made me think (a while back) to not have any asm 'fast'-paths for x86_64 to begin with. IIRC, get_thread() calls into the runtime anyway, and if we do that anyway (sometimes repeatedly), we might just as well handle the whole locking there. It's annoying.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466369287

From aboldtch at openjdk.org Thu Jan 25 13:23:31 2024
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Thu, 25 Jan 2024 13:23:31 GMT
Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v10]
In-Reply-To:
References: <0QSuwvim92k8hKdnbAY4AjaJC4DQGK5hRnTYJqSLGWM=.db600a00-bad6-41a7-b2ee-ee6754364434@github.com>
Message-ID:

On Thu, 25 Jan 2024 13:08:28 GMT, Roman Kennke wrote:

>> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Drop memory order comments
>
> src/hotspot/cpu/aarch64/aarch64.ad line 16477:
>
>> 16475: predicate(LockingMode == LM_LIGHTWEIGHT);
>> 16476: match(Set cr (FastLock object box));
>> 16477: effect(TEMP tmp, TEMP tmp2);
>
> I believe you should declare box as TEMP here, because that is how it is used. It probably works by accident because box is only used by us and has no meaning outside of the locking code. Also, I think there is no need to match the box register to the 2nd FastLock input. If you change that, you might need to add a match_edge() in FastLockNode and FastUnlockNode in locknode.hpp that excludes the box argument when LW locking is on. I believe the purpose of the box node/register is to track the stack-location of the stack-lock in stack-locking, and ensure that unlock is getting the same location there as the corresponding lock.
> Thinking about this... this should be done as a follow-up or else it becomes too intrusive for this change.

That was what I started doing and came to the same conclusion.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1466369765 From rkennke at openjdk.org Thu Jan 25 13:28:38 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 25 Jan 2024 13:28:38 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v13] In-Reply-To: <3Dl7fDPJjUJbVpsU6F72Tt7iDXToeR7uAUxHeVgiX9o=.32dd65b1-3665-4227-9409-681de07665c0@github.com> References: <3Dl7fDPJjUJbVpsU6F72Tt7iDXToeR7uAUxHeVgiX9o=.32dd65b1-3665-4227-9409-681de07665c0@github.com> Message-ID: <2w64qPwLv9JSooiQp-7UKXZl7jSS0VnKTpzldUpajfg=.8e542fdc-5b48-4c24-9674-3db4142d0bc1@github.com> On Thu, 25 Jan 2024 09:16:43 GMT, Axel Boldt-Christmas wrote: >> Implements the x86 port of JDK-8319796. >> >> There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. >> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. >> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The x86 C2 port also has some extra oddities. 
>> >> The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. >> >> The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. >> >> The contended unlock was also moved to the code stub. > > Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: > > - Update variable names in ad files > - Preload markWord unconditionally A few (relatively minor) comments, still. src/hotspot/cpu/x86/x86_32.ad line 13807: > 13805: predicate(LockingMode == LM_LIGHTWEIGHT); > 13806: match(Set cr (FastLock object box)); > 13807: effect(TEMP eax_reg, TEMP tmp, USE_KILL box, TEMP thread); Consider changing USE_KILL box to TEMP box. Same overall considerations (long-term, in a follow-up) as in aarch64. src/hotspot/cpu/x86/x86_32.ad line 13820: > 13818: predicate(LockingMode == LM_LIGHTWEIGHT); > 13819: match(Set cr (FastUnlock object eax_reg)); > 13820: effect(TEMP tmp, USE_KILL eax_reg, TEMP thread); I think USE_KILL eax can also be changed to just TEMP, we're not really using an input here, right? src/hotspot/cpu/x86/x86_64.ad line 12434: > 12432: predicate(LockingMode == LM_LIGHTWEIGHT); > 12433: match(Set cr (FastLock object box)); > 12434: effect(TEMP rax_reg, TEMP tmp, USE_KILL box); Same here. src/hotspot/cpu/x86/x86_64.ad line 12446: > 12444: predicate(LockingMode == LM_LIGHTWEIGHT); > 12445: match(Set cr (FastUnlock object rax_reg)); > 12446: effect(TEMP tmp, USE_KILL rax_reg); And here. ------------- Changes requested by rkennke (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16607#pullrequestreview-1843738658 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466374300 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466375341 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466375763 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466375954 From aboldtch at openjdk.org Thu Jan 25 13:34:38 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 13:34:38 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v13] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> <-Na7iedqELkwRQd7vjQsAwbFTwn-xehrbOJusmoHyNo=.ff8c7fba-19d2-4a22-95d1-b7f1c8b3b8f1@github.com> Message-ID: On Thu, 25 Jan 2024 13:18:43 GMT, Roman Kennke wrote: >> The only annoying thing is that the generate native wrapper x86_32 path has a dedicated thread register. Have to either signal this to lightweight_{unlock,lock} or just reload the thread in this path. >> >> I will see if I can find a cleaner solution. > > Uh I see. This whole loading of the thread in x86_32 made me think (a while back) to not have any asm 'fast'-paths for x86_64 to begin with. IIRC, get_thread() calls into the runtime anyway, and if we do that anyway (sometimes repeatedly), we might just as well handle the whole locking there. It's annoying. There is no clear satisfying solution for this. Either multiple function names `lightweight_{unlock,lock}_with_thread` , using an extra bool argument to signal that the thread is loaded, or overload the type one with `Register` the other with `Register*`. 
I tried something like a00f2e9e7f9b4d1abdcd5931ff8ba62c1d2de868 and 8eaa53e5cd9d3b17d16516af599f451ac4531c8b ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466384811 From aboldtch at openjdk.org Thu Jan 25 13:40:37 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 13:40:37 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> Message-ID: On Thu, 25 Jan 2024 11:21:47 GMT, Roman Kennke wrote: >> I'll fix that. Went down a rabbit hole trying to figure out adlc and register allocation. I do not know why they specify `rbx` for `box`. Is it because they want to use `USE_KILL` or are they using `USE_KILL` because they specify `rbx` for `box`. It feels like this specification could be improved. The only requirement is that one tmp register is `rax`. But I will leave that to another RFE. > >> I'll fix that. Went down a rabbit hole trying to figure out adlc and register allocation. I do not know why they specify `rbx` for `box`. Is it because they want to use `USE_KILL` or are they using `USE_KILL` because they specify `rbx` for `box`. It feels like this specification could be improved. The only requirement is that one tmp register is `rax`. But I will leave that to another RFE. > > I don't know this, either. This might well be historical. I seem to remember that a much earlier version of that code shared code with interpreter version and used fixed registers there, but I am not sure. If it's not needed for the new code then don't do it. I don't think it has anything to do with USE_KILL, though. USE_KILL only means that it's using the input value of the register *and* overwrites (kills) it. Well if I remember correctly from my experiments with this `USE_KILL` requires a bound register. 
Will get a compilation error with the change `rbx_RegP` -> `rRegP`, saying something like `only bound registers can be killed` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466392214 From rkennke at openjdk.org Thu Jan 25 13:45:29 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 25 Jan 2024 13:45:29 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v13] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> <-Na7iedqELkwRQd7vjQsAwbFTwn-xehrbOJusmoHyNo=.ff8c7fba-19d2-4a22-95d1-b7f1c8b3b8f1@github.com> Message-ID: On Thu, 25 Jan 2024 13:31:50 GMT, Axel Boldt-Christmas wrote: >> Uh I see. This whole loading of the thread in x86_32 made me think (a while back) to not have any asm 'fast'-paths for x86_64 to begin with. IIRC, get_thread() calls into the runtime anyway, and if we do that anyway (sometimes repeatedly), we might just as well handle the whole locking there. It's annoying. > > There is no clear satisfying solution for this. Either multiple function names `lightweight_{unlock,lock}_with_thread` , using an extra bool argument to signal that the thread is loaded, or overload the type one with `Register` the other with `Register*`. > > I tried something like a00f2e9e7f9b4d1abdcd5931ff8ba62c1d2de868 and 8eaa53e5cd9d3b17d16516af599f451ac4531c8b Ah, nevermind. Leave it with extra argument for thread, then. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466397666 From rkennke at openjdk.org Thu Jan 25 13:45:31 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 25 Jan 2024 13:45:31 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> Message-ID: On Thu, 25 Jan 2024 13:37:59 GMT, Axel Boldt-Christmas wrote: >>> I'll fix that. 
Went down a rabbit hole trying to figure out adlc and register allocation. I do not know why they specify `rbx` for `box`. Is it because they want to use `USE_KILL` or are they using `USE_KILL` because they specify `rbx` for `box`. It feels like this specification could be improved. The only requirement is that one tmp register is `rax`. But I will leave that to another RFE. >> >> I don't know this, either. This might well be historical. I seem to remember that a much earlier version of that code shared code with interpreter version and used fixed registers there, but I am not sure. If it's not needed for the new code then don't do it. I don't think it has anything to do with USE_KILL, though. USE_KILL only means that it's using the input value of the register *and* overwrites (kills) it. > > Well if I remember correctly from my experiments with this `USE_KILL` requires a bound register. Will get a compilation error with the change `rbx_RegP` -> `rRegP`, saying something like `only bound registers can be killed` Yes, but also with TEMP? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466398050 From aboldtch at openjdk.org Thu Jan 25 13:49:41 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 13:49:41 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v13] In-Reply-To: <2w64qPwLv9JSooiQp-7UKXZl7jSS0VnKTpzldUpajfg=.8e542fdc-5b48-4c24-9674-3db4142d0bc1@github.com> References: <3Dl7fDPJjUJbVpsU6F72Tt7iDXToeR7uAUxHeVgiX9o=.32dd65b1-3665-4227-9409-681de07665c0@github.com> <2w64qPwLv9JSooiQp-7UKXZl7jSS0VnKTpzldUpajfg=.8e542fdc-5b48-4c24-9674-3db4142d0bc1@github.com> Message-ID: On Thu, 25 Jan 2024 13:23:04 GMT, Roman Kennke wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update variable names in ad files >> - Preload markWord unconditionally > > src/hotspot/cpu/x86/x86_32.ad line 13807: > >> 13805: predicate(LockingMode == LM_LIGHTWEIGHT); >> 13806: match(Set cr (FastLock object box)); >> 13807: effect(TEMP eax_reg, TEMP tmp, USE_KILL box, TEMP thread); > > Consider changing USE_KILL box to TEMP box. Same overall considerations (long-term, in a follow-up) as in aarch64. An input cannot be `TEMP` so must change the required input nodes for FastLockNode and FastUnlockNode. Planned this as a followup RFE, would also change the `eBXRegP` / `rbx_RegP ` constraint to `eRegP` and `rRegP`. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466402815 From rkennke at openjdk.org Thu Jan 25 13:54:38 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 25 Jan 2024 13:54:38 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v13] In-Reply-To: <3Dl7fDPJjUJbVpsU6F72Tt7iDXToeR7uAUxHeVgiX9o=.32dd65b1-3665-4227-9409-681de07665c0@github.com> References: <3Dl7fDPJjUJbVpsU6F72Tt7iDXToeR7uAUxHeVgiX9o=.32dd65b1-3665-4227-9409-681de07665c0@github.com> Message-ID: On Thu, 25 Jan 2024 09:16:43 GMT, Axel Boldt-Christmas wrote: >> Implements the x86 port of JDK-8319796. >> >> There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. >> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. >> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The x86 C2 port also has some extra oddities. 
>> >> The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. >> >> The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. >> >> The contended unlock was also moved to the code stub. > > Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: > > - Update variable names in ad files > - Preload markWord unconditionally Looks good to me! Thanks! ------------- Marked as reviewed by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16607#pullrequestreview-1843794726 From rkennke at openjdk.org Thu Jan 25 13:54:41 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 25 Jan 2024 13:54:41 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v13] In-Reply-To: References: <3Dl7fDPJjUJbVpsU6F72Tt7iDXToeR7uAUxHeVgiX9o=.32dd65b1-3665-4227-9409-681de07665c0@github.com> <2w64qPwLv9JSooiQp-7UKXZl7jSS0VnKTpzldUpajfg=.8e542fdc-5b48-4c24-9674-3db4142d0bc1@github.com> Message-ID: On Thu, 25 Jan 2024 13:46:36 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/cpu/x86/x86_32.ad line 13807: >> >>> 13805: predicate(LockingMode == LM_LIGHTWEIGHT); >>> 13806: match(Set cr (FastLock object box)); >>> 13807: effect(TEMP eax_reg, TEMP tmp, USE_KILL box, TEMP thread); >> >> Consider changing USE_KILL box to TEMP box. Same overall considerations (long-term, in a follow-up) as in aarch64. > > An input cannot be `TEMP` so must change the required input nodes for FastLockNode and FastUnlockNode. 
> > Planned this as a followup RFE, would also change the `eBXRegP` / `rbx_RegP ` constraint to `eRegP` and `rRegP`. Ok, good then. (What a mess. Looking forward to see this cleaned-up) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466408270 From aboldtch at openjdk.org Thu Jan 25 13:54:42 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 Jan 2024 13:54:42 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> Message-ID: On Thu, 25 Jan 2024 13:42:39 GMT, Roman Kennke wrote: >> Well if I remember correctly from my experiments with this `USE_KILL` requires a bound register. Will get a compilation error with the change `rbx_RegP` -> `rRegP`, saying something like `only bound registers can be killed` > > Yes, but also with TEMP? `TEMP` does not require a bound register. It will compile but crash later in C2. This is only based on observed behaviour: `TEMP` cannot be specified on an Input. Will crash in register allocation with `assert(opcnt < numopnds) failed: Accessing non-existent operand` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466407833 From qamai at openjdk.org Thu Jan 25 14:01:59 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 25 Jan 2024 14:01:59 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v7] In-Reply-To: References: Message-ID: > Hi, > > This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. 
For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is similar to `__builtin_constant_p` in GCC and clang. > > Please kindly give your opinion as well as your reviews, thanks very much. Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: change expr to val, add examples ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17527/files - new: https://git.openjdk.org/jdk/pull/17527/files/b4445e2e..84f9f7eb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17527&range=05-06 Stats: 51 lines in 1 file changed: 28 ins; 4 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/17527.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17527/head:pull/17527 PR: https://git.openjdk.org/jdk/pull/17527 From qamai at openjdk.org Thu Jan 25 14:02:00 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 25 Jan 2024 14:02:00 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v6] In-Reply-To: References: <9SikKzxs8M1JdTLvTB6JTozvpCw2CSziF2koHw0ELAQ=.fb2e7696-6fa6-4a2b-87b8-ec57d4fef05c@github.com> Message-ID: On Thu, 25 Jan 2024 12:52:21 GMT, Maurizio Cimadamore wrote: >>> > Naive question: the right way to use this would be almost invariably be like this: >>> > ``` >>> > if (isCompileConstant(foo) && fooHasCertainStaticProperties(foo)) { >>> > // fast-path >>> > } >>> > // slow path >>> > ``` >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > Right? >>> >>> Yes, I think so. >> >> But then whatever is in the fast path and `fooHasCertainStaticProperties` are never profiled because never executed by the interpreter or c1. 
So `fooHasCertainStaticProperties` will likely not be inlined and c2 will do a poor (or rather not as good as you'd like) job of compiling whatever is in the fast path. > >> > > Naive question: the right way to use this would be almost invariably be like this: >> > > ``` >> > > if (isCompileConstant(foo) && fooHasCertainStaticProperties(foo)) { >> > > // fast-path >> > > } >> > > // slow path >> > > ``` >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > Right? >> > >> > >> > Yes, I think so. >> >> But then whatever is in the fast path and `fooHasCertainStaticProperties` are never profiled because never executed by the interpreter or c1. So `fooHasCertainStaticProperties` will likely not be inlined and c2 will do a poor (or rather not as good as you'd like) job of compiling whatever is in the fast path. > > I suppose perhaps it is implied that `fooHasCertainStaticProperties` should have `@ForceInline` ? But yes, there seems to be several assumptions in how this logic is supposed to be used, and at the moment, it seems to me more of a footgun than something actually useful (but I admit my ignorance on the subject). @mcimadamore Yes this is hard to use apart from the simple cases. Considering we have already used this technique in the `MethodHandle` implementation, I think there are valid use cases. 
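The guard pattern under discussion can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `isCompileConstant` here is a hypothetical stand-in that always returns false (a plain Java body cannot know what the JIT folded; only an intrinsic could answer true), and `isPowerOfTwo` and `floorRemainder` are made-up names playing the role of `fooHasCertainStaticProperties` and its caller.

```java
public class ConstantGuardSketch {
    // Hypothetical stand-in for the proposed query. Assumption: outside of
    // JIT-compiled code the answer is always false.
    static boolean isCompileConstant(int value) {
        return false;
    }

    // Plays the role of the "static property" check in the quoted pattern.
    static boolean isPowerOfTwo(int value) {
        return value > 0 && (value & (value - 1)) == 0;
    }

    static int floorRemainder(int x, int divisor) {
        if (isCompileConstant(divisor) && isPowerOfTwo(divisor)) {
            // Fast path: must compute exactly what the slow path computes,
            // because which path runs depends on the compiler, not the program.
            return x & (divisor - 1);
        }
        // Slow path: always correct, taken whenever the guard is false.
        return Math.floorMod(x, divisor);
    }

    public static void main(String[] args) {
        System.out.println(floorRemainder(13, 8)); // prints 5
        System.out.println(floorRemainder(-1, 8)); // prints 7
    }
}
```

With the stub, the fast path never executes, which mirrors the profiling concern raised above: code behind the guard is not exercised by the interpreter or c1, so the two paths have to be kept equivalent by construction rather than by observation.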
------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1910271891 From qamai at openjdk.org Thu Jan 25 14:02:01 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 25 Jan 2024 14:02:01 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v5] In-Reply-To: References: <_pAlUJwzkoFkCnQW_IQK-zUkUMMjq6KjZoDldS34CyA=.984549da-d6de-4977-a87f-18a33d58824d@github.com> <5AWq0nDx_AQPwnEp1cMisZ6ytn2ieq9FHDwDQp5A4QQ=.5043ac3e-04bf-4fd8-a680-448f392e5cb1@github.com> <9iDFu8I4w_i1Uso5q7oEi0Le1JvgDNgNyuSZlmKQiuE=.5739d448-fc73-4bcf-bec8-26b3a1b75d21@github.com> <-msFouQp2kpWPf6LTKgbDAeLPUkfET6wVesLbAz-6T4=.54ca377c-2e49-4229-a060-daa34485eead@github.com> Message-ID: On Thu, 25 Jan 2024 05:06:12 GMT, David Holmes wrote: >> I agree. All values are produced by evaluating expressions. In this case we want to query whether a value produced by the compiler evaluating its expression is a constant value (inputs to the expression are constants and the expression had no material side-effects). Meaning if the method returns true then we could use that knowledge in subsequent expressions that may also produce constants or some specific behavior. > >> the method compilation has the expression in its original form > > So the JIT analyses the bytecode used to place the result on the call stack, before the call, and from that determines if the expression were a constant? This kind of self-analysis is not something I was aware of. I see, changed `expr` to `val`. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17527#discussion_r1466418470 From mcimadamore at openjdk.org Thu Jan 25 14:51:36 2024 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Thu, 25 Jan 2024 14:51:36 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v6] In-Reply-To: References: <9SikKzxs8M1JdTLvTB6JTozvpCw2CSziF2koHw0ELAQ=.fb2e7696-6fa6-4a2b-87b8-ec57d4fef05c@github.com> Message-ID: On Thu, 25 Jan 2024 12:52:21 GMT, Maurizio Cimadamore wrote: >>> > Naive question: the right way to use this would be almost invariably be like this: >>> > ``` >>> > if (isCompileConstant(foo) && fooHasCertainStaticProperties(foo)) { >>> > // fast-path >>> > } >>> > // slow path >>> > ``` >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > Right? >>> >>> Yes, I think so. >> >> But then whatever is in the fast path and `fooHasCertainStaticProperties` are never profiled because never executed by the interpreter or c1. So `fooHasCertainStaticProperties` will likely not be inlined and c2 will do a poor (or rather not as good as you'd like) job of compiling whatever is in the fast path. > >> > > Naive question: the right way to use this would be almost invariably be like this: >> > > ``` >> > > if (isCompileConstant(foo) && fooHasCertainStaticProperties(foo)) { >> > > // fast-path >> > > } >> > > // slow path >> > > ``` >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > Right? >> > >> > >> > Yes, I think so. >> >> But then whatever is in the fast path and `fooHasCertainStaticProperties` are never profiled because never executed by the interpreter or c1. So `fooHasCertainStaticProperties` will likely not be inlined and c2 will do a poor (or rather not as good as you'd like) job of compiling whatever is in the fast path. > > I suppose perhaps it is implied that `fooHasCertainStaticProperties` should have `@ForceInline` ? 
But yes, there seems to be several assumptions in how this logic is supposed to be used, and at the moment, it seems to me more of a footgun than something actually useful (but I admit my ignorance on the subject). > @mcimadamore Yes this is hard to use apart from the simple cases. Considering we have already used this technique in the `MethodHandle` implementation, I think there are valid use cases. I don't 100% buy the `MethodHandleImpl` analogy. In that case the check is not simply used to save a branch, but to spare spinning of a completely new lambda form. That is a very heavy operation. What I'm trying to say is that I'm not too sure how robust of a mechanism this is in the context of micro(nano?)-optimizations (such as the one you are considering). ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1910359911 From eosterlund at openjdk.org Thu Jan 25 14:53:07 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 25 Jan 2024 14:53:07 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v3] In-Reply-To: References: Message-ID: <4Bqh63jS6WGdDtL3wqDZBBJkvH0TiY5vgd5mI_CQrIU=.21db910c-b979-4f9b-8749-fc99653cc670@github.com> > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. 
This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state.
>
> With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler.
>
> I have tested the changes from tier1-7, and run through full aurora performance tests.

Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:

Whitespace fix

Co-authored-by: Thomas Schatzl <59967451+tschatzl at users.noreply.github.com>

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/17495/files
- new: https://git.openjdk.org/jdk/pull/17495/files/82134e63..140a8a1e

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=17495&range=02
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=17495&range=01-02

Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/17495.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/17495/head:pull/17495

PR: https://git.openjdk.org/jdk/pull/17495

From coleenp at openjdk.org Thu Jan 25 15:09:33 2024
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Thu, 25 Jan 2024 15:09:33 GMT
Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v25]
In-Reply-To:
References:
Message-ID:

On Thu, 25 Jan 2024 09:51:59 GMT, Emanuel Peter wrote:

>> As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken.
>> >> I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. >> >> I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks (with implicit no-safepoint-verifiers) at the call-site of those places. >> >> I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. >> >> **Testing** >> Testing: tier1-3 and stress. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > moving code for Coleen Yes, code movement looks good. thank you! ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16840#pullrequestreview-1843964725 From amitkumar at openjdk.org Thu Jan 25 15:20:59 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Thu, 25 Jan 2024 15:20:59 GMT Subject: RFR: 8315762: Update subtype check profile collection on s390x following 8308869 [v2] In-Reply-To: References: Message-ID: <67cqHtFszZQRHZGhqR7LThot4PciYc1tznqzMTV348s=.1c6acb58-170d-44e8-a9ab-171be82f160a@github.com> > s390x Implementation for https://github.com/openjdk/jdk/pull/14375 > > Benchmark Result with patch: > > Benchmark (typePollution) (typePollutionNotInternalType) Mode Cnt Score Error Units > RequireNonNullCheckcastScalability.isDuplicated1 false false thrpt 20 1155.409 ± 43.844 ops/us > RequireNonNullCheckcastScalability.isDuplicated1 false true thrpt 20 726.923 ± 54.536 ops/us > RequireNonNullCheckcastScalability.isDuplicated1 true false thrpt 20 676.462 ± 23.503 ops/us > RequireNonNullCheckcastScalability.isDuplicated1 true true thrpt 20 118.650 ±
2.653 ops/us > > > Without Patch: > > Benchmark (typePollution) (typePollutionNotInternalType) Mode Cnt Score Error Units > RequireNonNullCheckcastScalability.isDuplicated1 false false thrpt 20 1101.248 ± 103.559 ops/us > RequireNonNullCheckcastScalability.isDuplicated1 false true thrpt 20 109.690 ± 3.312 ops/us > RequireNonNullCheckcastScalability.isDuplicated1 true false thrpt 20 110.790 ± 7.927 ops/us > RequireNonNullCheckcastScalability.isDuplicated1 true true thrpt 20 112.244 ± 6.889 ops/us > > > Testing : Fastdebug build + tier1 tests Amit Kumar has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge branch 'master' of https://git.openjdk.org/jdk into subtype_v0 - s390 Port ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17461/files - new: https://git.openjdk.org/jdk/pull/17461/files/6541690b..558553ee Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17461&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17461&range=00-01 Stats: 8939 lines in 359 files changed: 5568 ins; 2025 del; 1346 mod Patch: https://git.openjdk.org/jdk/pull/17461.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17461/head:pull/17461 PR: https://git.openjdk.org/jdk/pull/17461 From coleenp at openjdk.org Thu Jan 25 15:22:29 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 25 Jan 2024 15:22:29 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v13] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> <-Na7iedqELkwRQd7vjQsAwbFTwn-xehrbOJusmoHyNo=.ff8c7fba-19d2-4a22-95d1-b7f1c8b3b8f1@github.com> Message-ID: On Thu, 25 Jan 2024 13:42:20 GMT, Roman Kennke wrote: >> There is no clear satisfying solution for this.
Either multiple function names `lightweight_{unlock,lock}_with_thread` , using an extra bool argument to signal that the thread is loaded, or overload the type one with `Register` the other with `Register*`. >> >> I tried something like a00f2e9e7f9b4d1abdcd5931ff8ba62c1d2de868 and 8eaa53e5cd9d3b17d16516af599f451ac4531c8b > > Ah, nevermind. Leave it with extra argument for thread, then. I sort of liked the extra thread parameter so that the callers know that !LP64 needs get_thread() and not the lightweight_{un}lock. It's unfortunately inconsistent and requires an extra register, but nice that you can call it 'thread', at least until you overwrite it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466530655 From coleenp at openjdk.org Thu Jan 25 15:30:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 25 Jan 2024 15:30:34 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v10] In-Reply-To: <0QSuwvim92k8hKdnbAY4AjaJC4DQGK5hRnTYJqSLGWM=.db600a00-bad6-41a7-b2ee-ee6754364434@github.com> References: <0QSuwvim92k8hKdnbAY4AjaJC4DQGK5hRnTYJqSLGWM=.db600a00-bad6-41a7-b2ee-ee6754364434@github.com> Message-ID: On Thu, 25 Jan 2024 11:50:48 GMT, Axel Boldt-Christmas wrote: >> Implements the aarch64 port of JDK-8319796. >> >> There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. >> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. >> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. 
This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Drop memory order comments src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 278: > 276: eor(t, mark, markWord::unlocked_value); > 277: cmpxchg(/*addr*/ obj, /*expected*/ mark, /*new*/ t, Assembler::xword, > 278: /*acquire*/ true, /*release*/ false, /*weak*/ false, noreg); Sorry now I understand why the comment. I thought a CAS is a CAS but it's describing the other cmpxchg parameters which I glossed right over. Maybe it is a useful comment after all. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1466541852 From coleenp at openjdk.org Thu Jan 25 15:30:36 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 25 Jan 2024 15:30:36 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v9] In-Reply-To: References: Message-ID: On Thu, 25 Jan 2024 08:01:13 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 455: >> >>> 453: // The owner may be anonymous and we removed the last obj entry in >>> 454: // the lock-stack. This loses the information about the owner. 
>>> 455: // Write the thread to the owner field so the runtime knows the owner. >> >> Is this necessary here also? Previous checks and slow path code in the runtime has already set the owner, if I understand correctly. > > After popping the last oop of the lock stack we do the `tbnz(mark, exact_log2(markWord::monitor_value), inflated);` check. If this happens the owner will be anonymous. > > Other solutions would be either: > 1. Push the oop back and jump to the runtime. (Would make C2 anonymous owner agnostic). > 2. Fix the owner only in this control flow, not in every inflated slow path exit. > > The first seems alright as well. It is more like what x86 evolved into doing (where it elides this specific check). > Both solutions make the inflated unlock cleaner and remove a branch (can branch directly to the slow path). > The second does seem to create a more complex entry to the inflated unlock, so it does not seem worth it. I didn't see that this isn't like the x86 code that pushes the object back on the lock stack. This seems fine and the comment is helpful. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1466536709 From shade at openjdk.org Thu Jan 25 15:36:39 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 Jan 2024 15:36:39 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v6] In-Reply-To: References: <9SikKzxs8M1JdTLvTB6JTozvpCw2CSziF2koHw0ELAQ=.fb2e7696-6fa6-4a2b-87b8-ec57d4fef05c@github.com> Message-ID: On Thu, 25 Jan 2024 14:48:16 GMT, Maurizio Cimadamore wrote: > I don't 100% buy the `MethodHandleImpl` analogy. In that case the check is not simply used to save a branch, but to spare spinning of a completely new lambda form. Doing this to save a few instructions would likely not be worth the hassle outside the _really performance critical paths_, but even then it might be useful for hot JDK code.
On larger examples, you can avoid memory accesses, allocations, etc. by coding up the constant-foldable path that you know compiler would not be able to extract when propagating constants through the generic code. For example, giving quantitative substance to my previous example: diff --git a/src/java.base/share/classes/java/lang/Integer.java b/src/java.base/share/classes/java/lang/Integer.java index 1c5b3c414ba..d50748c369e 100644 --- a/src/java.base/share/classes/java/lang/Integer.java +++ b/src/java.base/share/classes/java/lang/Integer.java @@ -28,4 +28,5 @@ import jdk.internal.misc.CDS; import jdk.internal.misc.VM; +import jdk.internal.vm.ConstantSupport; import jdk.internal.vm.annotation.ForceInline; import jdk.internal.vm.annotation.IntrinsicCandidate; @@ -416,4 +417,7 @@ private static void formatUnsignedIntUTF16(int val, int shift, byte[] buf, int l } + @Stable + static final String[] TO_STRINGS = { "-1", "0", "1" }; + /** * Returns a {@code String} object representing the @@ -428,4 +432,8 @@ private static void formatUnsignedIntUTF16(int val, int shift, byte[] buf, int l @IntrinsicCandidate public static String toString(int i) { + if (ConstantSupport.isCompileConstant(i) && + (i >= -1) && (i <= 1)) { + return TO_STRINGS[i + 1]; + } int size = stringSize(i); if (COMPACT_STRINGS) { diff --git a/test/micro/org/openjdk/bench/java/lang/Integers.java b/test/micro/org/openjdk/bench/java/lang/Integers.java index 43ceb5d18d2..28248593a73 100644 --- a/test/micro/org/openjdk/bench/java/lang/Integers.java +++ b/test/micro/org/openjdk/bench/java/lang/Integers.java @@ -91,4 +91,18 @@ public void decode(Blackhole bh) { } + @Benchmark + @OutputTimeUnit(TimeUnit.NANOSECONDS) + public String toStringConstYay() { + return Integer.toString(0); + } + + int v = 0; + + @Benchmark + @OutputTimeUnit(TimeUnit.NANOSECONDS) + public String toStringConstNope() { + return Integer.toString(v); + } + /** Performs toString on small values, just a couple of digits. 
*/ @Benchmark Benchmark (size) Mode Cnt Score Error Units Integers.toStringConstNope 500 avgt 15 3,599 ± 0,034 ns/op Integers.toStringConstNope:gc.alloc.rate.norm 500 avgt 15 48,000 ± 0,001 B/op Integers.toStringConstNope:gc.time 500 avgt 15 223,000 ms Integers.toStringConstYay 500 avgt 15 0,568 ± 0,046 ns/op Integers.toStringConstYay:gc.alloc.rate.norm 500 avgt 15 ? 10?? B/op Think about it as simplifying/avoiding the need for full compiler intrinsics. I could, in principle, do this by intrinsifying `Integer.toString` completely, check the same `isCon`, and then either construct the access to some String constant, or arrange the call to actual toString slow path. That would not be as simple as doing the similar thing in plain Java, with just a little of compiler support in form of `ConstantSupport`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1910449450 From epeter at openjdk.org Thu Jan 25 15:52:50 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 25 Jan 2024 15:52:50 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v4] In-Reply-To: <6CTbUBPmPjgvR_Rk6lQbJhvCuOgEqAVIHUPr-QLCx1c=.a470f8d8-51f2-4d3f-b080-cc30a7c8e70b@github.com> References: <6CTbUBPmPjgvR_Rk6lQbJhvCuOgEqAVIHUPr-QLCx1c=.a470f8d8-51f2-4d3f-b080-cc30a7c8e70b@github.com> Message-ID: <2cFsAlJYOqbSzOkzjg8zXGhmkobNA8waVzUkc7j_1hw=.3e37080d-447b-45aa-a7a1-b0acd328732a@github.com> On Fri, 1 Dec 2023 08:46:33 GMT, Tom Rodriguez wrote: >> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: >> >> - manual merge with master after JDK-8267532 >> - more locking, still fails tho - WIP >> - adding more verification and more locking, WIP >> - add locks for jvmci calls to allocate_bci_to_data >> - 8306767 > > Sounds reasonable to me. Thanks @tkrodriguez @fisk @rwestrel @coleenp for all your help, conversations and suggestions!
One more potentially highly intermittent bug fixed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16840#issuecomment-1910481519 From epeter at openjdk.org Thu Jan 25 15:52:52 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 25 Jan 2024 15:52:52 GMT Subject: Integrated: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe In-Reply-To: References: Message-ID: <9Z46x14U6rAfYXcE8WI6U-70Q3CCQWa752T4UBlnOjs=.4edee5b0-2d96-40a7-8f35-62eac1993191@github.com> On Tue, 28 Nov 2023 06:23:29 GMT, Emanuel Peter wrote: > As explained in a [comment below](https://github.com/openjdk/jdk/pull/16840#issuecomment-1833529561), we have to ensure that reading/writing/cleaning the extra data all needs to be guarded by the `extra_data_lock`, and that no safepoint should happen while holding that lock, so that the lock is not broken. > > I introduced `check_extra_data_locked`, where I check that we hold the lock, and if we are a java thread (only those ever safepoint), that we currently are in a `NoSafepointVerifier` scope, hence we verify that no safepoint will be taken. > > I placed `check_extra_data_locked` in all the places where we access the extra data, and then placed locks (with implicit no-safepoint-verifiers) at the call-site of those places. > > I also needed to change the rank of `extra_data_lock` to `nosafepoint` and set the `Mutex::_no_safepoint_check_flag` when taking the lock. Otherwise I could not take the lock from a VM thread. > > **Testing** > Testing: tier1-3 and stress. This pull request has now been integrated. 
Changeset: 746a0868 Author: Emanuel Peter URL: https://git.openjdk.org/jdk/commit/746a08686bfad629fe045a762ed2fbb209763f6b Stats: 177 lines in 13 files changed: 125 ins; 24 del; 28 mod 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe Reviewed-by: eosterlund, roland, coleenp, never ------------- PR: https://git.openjdk.org/jdk/pull/16840 From lmesnik at openjdk.org Thu Jan 25 17:00:38 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 25 Jan 2024 17:00:38 GMT Subject: RFR: 8323717: Introduce test keyword for tests that need external dependencies In-Reply-To: <-quimoVBzziosvz8NKyrp0fop7T9ZRDg4SJo1wif6aw=.77948f6f-0728-4e57-ae98-5c5472f435ee@github.com> References: <-quimoVBzziosvz8NKyrp0fop7T9ZRDg4SJo1wif6aw=.77948f6f-0728-4e57-ae98-5c5472f435ee@github.com> Message-ID: On Wed, 24 Jan 2024 21:28:29 GMT, Leonid Mesnik wrote: >> Some jtreg tests require resolvable external dependencies. This resolution is delegated to JIB, which is not used in vanilla OpenJDK testing. It would be convenient to add a keyword that marks tests that require these external dependencies, so that we could exclude those tests from runs. This would allow us to: a) run all tests in hotspot:tier4, which now excludes `applications/` specifically; b) make all tests runs (#17422) cleaner on many environments. >> >> I provisionally call this flag `external-dep`, but I am open for other suggestions. >> >> Note that some tests that pull `@Artifact`-s provide special paths that do limited testing anyway. However, there are tests which cannot run without external dependencies at all. These include at least `applications/jcstress` and `applications/scimark` tests. >> >> Ironically, I cannot run the jcstress test generator because the dependencies are lacking here. I regenerated those test using a self-built jcstress 0.16 bundle. 
>> >> Additional testing: >> - [x] `make test TEST=applications/` fails >> - [x] `JTREG_KEYWORDS=!external-dep make test TEST=applications/` passes, skipping most of the tests > > Marked as reviewed by lmesnik (Reviewer). > @lmesnik, you good with the keyword name? Yes, I'm fine. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17421#issuecomment-1910613276 From shade at openjdk.org Thu Jan 25 18:05:39 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 Jan 2024 18:05:39 GMT Subject: RFR: 8323717: Introduce test keyword for tests that need external dependencies In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 10:48:23 GMT, Aleksey Shipilev wrote: > Some jtreg tests require resolvable external dependencies. This resolution is delegated to JIB, which is not used in vanilla OpenJDK testing. It would be convenient to add a keyword that marks tests that require these external dependencies, so that we could exclude those tests from runs. This would allow us to: a) run all tests in hotspot:tier4, which now excludes `applications/` specifically; b) make all tests runs (#17422) cleaner on many environments. > > I provisionally call this flag `external-dep`, but I am open for other suggestions. > > Note that some tests that pull `@Artifact`-s provide special paths that do limited testing anyway. However, there are tests which cannot run without external dependencies at all. These include at least `applications/jcstress` and `applications/scimark` tests. > > Ironically, I cannot run the jcstress test generator because the dependencies are lacking here. I regenerated those test using a self-built jcstress 0.16 bundle. > > Additional testing: > - [x] `make test TEST=applications/` fails > - [x] `JTREG_KEYWORDS=!external-dep make test TEST=applications/` passes, skipping most of the tests Awesome, thanks. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17421#issuecomment-1910723514 From shade at openjdk.org Thu Jan 25 18:05:40 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 Jan 2024 18:05:40 GMT Subject: Integrated: 8323717: Introduce test keyword for tests that need external dependencies In-Reply-To: References: Message-ID: <7-XWGMcg4_0LkPgaUwPBolYkQe0Wg33bsDvNec2zdRo=.796bd2b0-bdd2-4e35-af38-d0ea9578ec43@github.com> On Mon, 15 Jan 2024 10:48:23 GMT, Aleksey Shipilev wrote: > Some jtreg tests require resolvable external dependencies. This resolution is delegated to JIB, which is not used in vanilla OpenJDK testing. It would be convenient to add a keyword that marks tests that require these external dependencies, so that we could exclude those tests from runs. This would allow us to: a) run all tests in hotspot:tier4, which now excludes `applications/` specifically; b) make all tests runs (#17422) cleaner on many environments. > > I provisionally call this flag `external-dep`, but I am open for other suggestions. > > Note that some tests that pull `@Artifact`-s provide special paths that do limited testing anyway. However, there are tests which cannot run without external dependencies at all. These include at least `applications/jcstress` and `applications/scimark` tests. > > Ironically, I cannot run the jcstress test generator because the dependencies are lacking here. I regenerated those test using a self-built jcstress 0.16 bundle. > > Additional testing: > - [x] `make test TEST=applications/` fails > - [x] `JTREG_KEYWORDS=!external-dep make test TEST=applications/` passes, skipping most of the tests This pull request has now been integrated. 
Changeset: 12b89cd2 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/12b89cd2eeb5c2c43a2ce425c96fc4f718e30514 Stats: 62 lines in 32 files changed: 32 ins; 0 del; 30 mod 8323717: Introduce test keyword for tests that need external dependencies Reviewed-by: dholmes, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/17421 From kbarrett at openjdk.org Thu Jan 25 18:39:00 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 25 Jan 2024 18:39:00 GMT Subject: RFR: 8324492: Remove Atomic support for OopHandle [v3] In-Reply-To: References: Message-ID: > Please review this change to the lazy initialization of the MemoryManager > object and the associated MemoryPool objects. > > They previously used an atomic access to the respective OopHandle member > holding the associated Java object as the is-initialized sentinel, testing > whether the handle was empty or had an associated OopStorage entry. When > empty, initialization was performed using a lock to prevent races. > > Now they use a separate atomic is-initialized flag as the sentinel. > > As a result, the support for atomic access to an OopHandle's underlying handle > (via a translator) is no longer needed and is removed. > > While there, I moved the allocation of the associated OopStorage entries out > from under the Management_lock. > > Testing: mach5 tier1 > > A couple of notes for reviewers. > > Once initialized with a Java object recorded in the associated OopHandle, the > OopHandle and the value recorded therein is never changed. > > The old is-initialized check makes use of OopHandle::resolve returning null if > either the handle is empty (has no OopStorage entry yet) or the OopStorage > entry contains null. The latter never happens in this case. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains five additional commits since the last revision: - Merge branch 'master' into avoid-atomic-ptr-raw - aboldtch review - remove unused OopHandle translator - MemoryPool doesn't use atomics on OopHandle - MemoryManager doesn't use atomics on OopHandle ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17533/files - new: https://git.openjdk.org/jdk/pull/17533/files/e9b455aa..21b968e1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17533&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17533&range=01-02 Stats: 2305 lines in 141 files changed: 1612 ins; 437 del; 256 mod Patch: https://git.openjdk.org/jdk/pull/17533.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17533/head:pull/17533 PR: https://git.openjdk.org/jdk/pull/17533 From kbarrett at openjdk.org Thu Jan 25 18:39:00 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 25 Jan 2024 18:39:00 GMT Subject: RFR: 8324492: Remove Atomic support for OopHandle [v3] In-Reply-To: References: Message-ID: On Thu, 25 Jan 2024 10:41:26 GMT, Axel Boldt-Christmas wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Merge branch 'master' into avoid-atomic-ptr-raw >> - aboldtch review >> - remove unused OopHandle translator >> - MemoryPool doesn't use atomics on OopHandle >> - MemoryManager doesn't use atomics on OopHandle > > Marked as reviewed by aboldtch (Reviewer). Thanks for reviews, @xmas92 and @coleenp . 
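[Editor's sketch] The lazy-initialization scheme described in the quoted PR text (a separate atomic is-initialized flag, with a lock taken only on the first call) can be illustrated roughly as follows. This is a minimal standalone C++ sketch, not the HotSpot code: `LazyHolder`, `ManagementObject` and `init_count` are hypothetical names, `std::mutex` merely stands in for `Management_lock`, and a plain pointer stands in for the `OopHandle`.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the lazily created Java-side object.
struct ManagementObject { int id; };

class LazyHolder {
  std::atomic<bool> _initialized{false};  // separate is-initialized flag
  std::mutex _lock;                       // stands in for Management_lock
  ManagementObject* _obj = nullptr;       // written once, never changed afterwards
  int _init_count = 0;                    // bookkeeping for the example only

 public:
  ~LazyHolder() { delete _obj; }

  ManagementObject* get() {
    // Fast path: the acquire load pairs with the release store below,
    // so a reader that sees 'true' also sees the fully written _obj.
    if (!_initialized.load(std::memory_order_acquire)) {
      std::lock_guard<std::mutex> guard(_lock);
      // Re-check under the lock: another thread may have won the race.
      if (!_initialized.load(std::memory_order_relaxed)) {
        _obj = new ManagementObject{42};
        ++_init_count;
        _initialized.store(true, std::memory_order_release);
      }
    }
    return _obj;
  }

  int init_count() const { return _init_count; }
};
```

The point of the separate flag is that the handle itself is never read atomically: it is only written before the flag is published and only read after the flag has been observed as set.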
------------- PR Comment: https://git.openjdk.org/jdk/pull/17533#issuecomment-1910769663 From kbarrett at openjdk.org Thu Jan 25 18:39:00 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 25 Jan 2024 18:39:00 GMT Subject: Integrated: 8324492: Remove Atomic support for OopHandle In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 10:52:06 GMT, Kim Barrett wrote: > Please review this change to the lazy initialization of the MemoryManager > object and the associated MemoryPool objects. > > They previously used an atomic access to the respective OopHandle member > holding the associated Java object as the is-initialized sentinel, testing > whether the handle was empty or had an associated OopStorage entry. When > empty, initialization was performed using a lock to prevent races. > > Now they use a separate atomic is-initialized flag as the sentinel. > > As a result, the support for atomic access to an OopHandle's underlying handle > (via a translator) is no longer needed and is removed. > > While there, I moved the allocation of the associated OopStorage entries out > from under the Management_lock. > > Testing: mach5 tier1 > > A couple of notes for reviewers. > > Once initialized with a Java object recorded in the associated OopHandle, the > OopHandle and the value recorded therein is never changed. > > The old is-initialized check makes use of OopHandle::resolve returning null if > either the handle is empty (has no OopStorage entry yet) or the OopStorage > entry contains null. The latter never happens in this case. This pull request has now been integrated.
Changeset: 39b756a0 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/39b756a0d163d60d1b69fbc9bf6e8235080c3721 Stats: 101 lines in 5 files changed: 24 ins; 14 del; 63 mod 8324492: Remove Atomic support for OopHandle Reviewed-by: aboldtch, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/17533 From kevinw at openjdk.org Thu Jan 25 21:38:59 2024 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 25 Jan 2024 21:38:59 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned Message-ID: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> JavaThread's _monitor_chunks member is temporary storage used by deoptimization. When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. 
------------- Commit messages: - (C) - Check for null to avoid handshake or safepoint check - 8314225: SIGSEGV in JavaThread::is_lock_owned Changes: https://git.openjdk.org/jdk/pull/17566/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17566&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8314225 Stats: 38 lines in 3 files changed: 33 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/17566.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17566/head:pull/17566 PR: https://git.openjdk.org/jdk/pull/17566 From kevinw at openjdk.org Thu Jan 25 21:38:59 2024 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 25 Jan 2024 21:38:59 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned In-Reply-To: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: <4auubouBLHomqlbzdEVEHe8HdUhWYOnyP0pEuFgtOaA=.8e74a31e-5482-4e54-90df-8ab9f9d77d7a@github.com> On Thu, 25 Jan 2024 11:04:03 GMT, Kevin Walls wrote: > JavaThread's _monitor_chunks member is temporary storage used by deoptimization. > When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. > > There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. There are only a couple of callers to monitor_chunks() which are not deoptimization itself. My current change for this is for the non-deopt users to call a new monitor_chunks_safe() method, which actually asserts that it is returning nullptr. 
If at a safepoint, it should be nullptr as deoptimization is not running. If not at a safepoint, it handshakes the target thread to retrieve the value. This lets deoptimization complete, and the value should be nullptr. This change is to build confidence that _monitor_chunks is always null when observed outside of deoptimization. If that proves OK, the non-deopt related callers of monitor_chunks could be removed in future. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1909937973 From heidinga at openjdk.org Thu Jan 25 21:38:59 2024 From: heidinga at openjdk.org (Dan Heidinga) Date: Thu, 25 Jan 2024 21:38:59 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned In-Reply-To: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: On Thu, 25 Jan 2024 11:04:03 GMT, Kevin Walls wrote: > JavaThread's _monitor_chunks member is temporary storage used by deoptimization. > When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. > > There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. src/hotspot/share/runtime/javaThread.cpp line 1043: > 1041: chunks = _monitor_chunks; // will be null, as deopt frees when finished > 1042: } else { > 1043: ReadMonitorChunksHandshake rmch; How expensive is the Handshake versus the direct read? Is it worth optimistically reading using `monitor_chunks()` and only attempt the handshake if it returns non-null? 
Or is there an API to probe if a thread is in a deopt that we can wrap around the handshake? If the handshake is cheap enough, then this isn't worth looking at further. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17566#discussion_r1466401816 From kevinw at openjdk.org Thu Jan 25 21:38:59 2024 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 25 Jan 2024 21:38:59 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned In-Reply-To: References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: <8-fnVkNBc6Vyy2z2o9iam88k2DSpjQ_v56C6mNXIfDE=.aa2feb5c-e350-448a-8055-9b37c0668fe2@github.com> On Thu, 25 Jan 2024 13:45:48 GMT, Dan Heidinga wrote: >> JavaThread's _monitor_chunks member is temporary storage used by deoptimization. >> When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. >> >> There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. > > src/hotspot/share/runtime/javaThread.cpp line 1043: > >> 1041: chunks = _monitor_chunks; // will be null, as deopt frees when finished >> 1042: } else { >> 1043: ReadMonitorChunksHandshake rmch; > > How expensive is the Handshake versus the direct read? Is it worth optimistically reading using `monitor_chunks()` and only attempt the handshake if it returns non-null? > > Or is there an API to probe if a thread is in a deopt that we can wrap around the handshake? > > If the handshake is cheap enough, then this isn't worth looking at further. 
Actually yes I was thinking about checking the value for non-null first, I will try that out. No handshake has to be faster... 8-) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17566#discussion_r1466446130 From kevinw at openjdk.org Thu Jan 25 21:39:00 2024 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 25 Jan 2024 21:39:00 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned In-Reply-To: <8-fnVkNBc6Vyy2z2o9iam88k2DSpjQ_v56C6mNXIfDE=.aa2feb5c-e350-448a-8055-9b37c0668fe2@github.com> References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> <8-fnVkNBc6Vyy2z2o9iam88k2DSpjQ_v56C6mNXIfDE=.aa2feb5c-e350-448a-8055-9b37c0668fe2@github.com> Message-ID: On Thu, 25 Jan 2024 14:20:08 GMT, Kevin Walls wrote: >> src/hotspot/share/runtime/javaThread.cpp line 1043: >> >>> 1041: chunks = _monitor_chunks; // will be null, as deopt frees when finished >>> 1042: } else { >>> 1043: ReadMonitorChunksHandshake rmch; >> >> How expensive is the Handshake versus the direct read? Is it worth optimistically reading using `monitor_chunks()` and only attempt the handshake if it returns non-null? >> >> Or is there an API to probe if a thread is in a deopt that we can wrap around the handshake? >> >> If the handshake is cheap enough, then this isn't worth looking at further. > > Actually yes I was thinking about checking the value for non-null first, I will try that out. No handshake has to be faster... 8-) Updated - if _monitor_chunks is null, avoid the handshake, and the check if we are at a safepoint. 
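For readers following the thread, here is a minimal sketch of the "check for null first, only synchronize if non-null" control flow described in this update. It uses standard C++ only. `TargetThread`, `Chunk`, `read_after_sync` and the `sync_reads` counter are hypothetical stand-ins for illustration, not HotSpot's `JavaThread`, `MonitorChunk` or handshake API. The stand-in for the handshake simply pretends the target finished its work and cleared the field. Whether the unsynchronized fast path is actually safe is a separate question that this sketch does not settle.

```cpp
#include <atomic>

// Hypothetical stand-in for a chunk list node.
struct Chunk { Chunk* next = nullptr; };

// Hypothetical stand-in for the observed thread.
struct TargetThread {
  std::atomic<Chunk*> monitor_chunks{nullptr};
  int sync_reads = 0;  // counts how often the expensive path was taken

  // Stand-in for the handshake: by the time it completes, the target is
  // assumed to have finished and cleared the field.
  Chunk* read_after_sync() {
    ++sync_reads;
    monitor_chunks.store(nullptr, std::memory_order_release);
    return monitor_chunks.load(std::memory_order_acquire);
  }

  // Fast path: a null read needs no synchronization at all.
  // Slow path: only taken when a non-null value was observed.
  Chunk* monitor_chunks_safe() {
    if (monitor_chunks.load(std::memory_order_acquire) == nullptr) {
      return nullptr;
    }
    return read_after_sync();
  }
};
```

In this model the expensive path runs only when the first read sees a non-null value, which is the saving the update is after.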
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17566#discussion_r1466983462 From dlong at openjdk.org Thu Jan 25 22:12:26 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 25 Jan 2024 22:12:26 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned In-Reply-To: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: On Thu, 25 Jan 2024 11:04:03 GMT, Kevin Walls wrote: > JavaThread's _monitor_chunks member is temporary storage used by deoptimization. > When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. > > There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. src/hotspot/share/runtime/javaThread.cpp line 1038: > 1036: // A ThreadLocalHandshake will mean deopt is complete. > 1037: MonitorChunk* JavaThread::monitor_chunks_safe() const { > 1038: MonitorChunk* chunks = _monitor_chunks; There's still a race here, right? The target thread isn't necessarily the same as the current thread, so it could deoptimize immediately after reading `_monitor_chunks`. I think for correctness it would always need to handshake. But then what's the point, if we always return null? Then this just turns into a strange synchronization mechanism. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17566#discussion_r1467015808 From dlong at openjdk.org Thu Jan 25 22:28:34 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 25 Jan 2024 22:28:34 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned In-Reply-To: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: On Thu, 25 Jan 2024 11:04:03 GMT, Kevin Walls wrote: > JavaThread's _monitor_chunks member is temporary storage used by deoptimization. > When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. > > There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. src/hotspot/share/runtime/javaThread.cpp line 1004: > 1002: > 1003: // Consider removing: chunk is always null. > 1004: for (MonitorChunk* chunk = monitor_chunks_safe(); chunk != nullptr; chunk = chunk->next()) { Unfortunately, this is now going to give the wrong answer if a deopt is in progress. I think we need a better solution. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17566#discussion_r1467027532 From dcubed at openjdk.org Thu Jan 25 22:42:41 2024 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Thu, 25 Jan 2024 22:42:41 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v13] In-Reply-To: <-4blCtX3cNv-LQxfj3uisZ1CFD83mVNadvFYiX8UFik=.08d36172-cd61-4fc3-bebe-fa345b50d78a@github.com> References: <-4blCtX3cNv-LQxfj3uisZ1CFD83mVNadvFYiX8UFik=.08d36172-cd61-4fc3-bebe-fa345b50d78a@github.com> Message-ID: <5oOncmWIUEnwPhmOI1AdX3bsCxJPZY99qZFaWyT1hsM=.88cc63dd-a2c5-40bb-859b-520b3e47c83d@github.com> On Tue, 23 Jan 2024 16:14:53 GMT, Axel Boldt-Christmas wrote: >> Implements the runtime part of JDK-8319796. >> The different CPU implementations are/will be created as dependent pull requests. >> >> This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enter. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two top most entires on the lock stack to determine what to do in a monitor exit. >> >> A high level overview: >> * Locking is still performed on the mark word >> * Unlocked (0b01) <=> Locked (0b00) >> * Monitor enter on Obj with mark word Unlocked (0b01) is the same >> * Transition Obj's mark word Unlocked (0b01) => Locked (0b00) >> * Push Obj onto the lock stack >> * Success >> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack >> * If top entry is Obj >> * Push Obj on the lock stack >> * Success >> * If top entry is not Obj >> * Inflate and call ObjectMonitor::enter >> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack >> * If just the top entry is Obj >> * Transition Obj's mark word Locked (0b00) => Unlocked (0b01) >> * Pop the entry >> * Success >> * If both entries are Obj >> * Pop the top entry >> * Success >> * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit >> * If the monitor has been inflated for object Obj which is owned by 
the current thread >> * All corresponding entries for Obj is removed from the lock stack >> * The monitor recursions is set to the number of removed entries - 1 >> * The owner is changed from anonymous to the thread >> * The regular ObjectMonitor::action is called. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Fix miss in is_recursive improvement I last reviewed v08 of this PR. This review is for v12 of the PR. Don't forget to make a pass to update the appropriate copyright years. Again, I only have nits or questions that need some clarifications. src/hotspot/share/runtime/lockStack.hpp line 101: > 99: > 100: // Try recursive enter. > 101: // Precondition: This lock-stack must no be full. Nit typo: s/must no be/must not be/ src/hotspot/share/runtime/lockStack.inline.hpp line 50: > 48: > 49: inline bool LockStack::is_full() const { > 50: return to_index(_top) == CAPACITY; Would it be too paranoid to use `>= CAPACITY`? Or to add an assert that the index is not greater than capacity? src/hotspot/share/runtime/lockStack.inline.hpp line 92: > 90: // lock-stack with a length of at least 2. > 91: > 92: assert(contains(o), "entries must exist"); Perhaps: s/entries must exist/at least one entry must exist/ src/hotspot/share/runtime/lockStack.inline.hpp line 103: > 101: } > 102: if (_base[i] == o) { > 103: // o can only occur in one consecutive run on the lock-stack. I'm not sure that the claim on L103 is always true. If we have a lock stack like this: _base[end - 1] = o1; _base[end - 2] = o2; _base[end - 3] = o1; _base[end - 4] = o1; When our `o == o1` we don't have a recursive run on the top-most part of the lock stack, but we do have one that's lower down. L103 isn't correct in this case, but that doesn't matter because we actually care about whether the top most run is recursive. I think L103 can be deleted and the rest of the comment is okay. 
src/hotspot/share/runtime/lockStack.inline.hpp line 149: > 147: > 148: int end = to_index(_top); > 149: if (end <= 1 || _base[end - 1] != o || _base[end - 2] != o) { nit extra space: s/  _base[end - 2]/ _base[end - 2]/ src/hotspot/share/runtime/objectMonitor.inline.hpp line 106: > 104: > 105: inline void ObjectMonitor::set_recursions(size_t recursions) { > 106: assert(_recursions == 0, "must be"); Why have the `recursions` parameter if the passed value must always be zero? Update: It looks like you might be trying to detect some out of sync count coming from the removal of the object from the lock stack. You expect it to always be a count value of 1 removal and if more than 1 is removed you want to assert(). src/hotspot/share/runtime/synchronizer.cpp line 566: > 564: markWord mark = obj()->mark_acquire(); > 565: while (mark.is_neutral()) { > 566: // Retry until a lock state change has been observed. cas_set_mark() may collide with non lock bits modifications. nit extra space: s/.  cas_set_mark()/. cas_set_mark()/ src/hotspot/share/runtime/synchronizer.cpp line 642: > 640: } else { > 641: while (mark.is_fast_locked()) { > 642: // Retry until a lock state change has been observed. cas_set_mark() may collide with non lock bits modifications. nit extra space: s/.  cas_set_mark()/. cas_set_mark()/ ------------- Marked as reviewed by dcubed (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16606#pullrequestreview-1844752233 PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467005692 PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467008430 PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467011290 PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467019715 PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467021608 PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467024242 PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467028121 PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467029683 From dlong at openjdk.org Thu Jan 25 22:44:35 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 25 Jan 2024 22:44:35 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned In-Reply-To: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: <4v0nymHSknoRukr12-29tvErGdmX1O-IipynvQRDmC4=.c5141038-70f8-4218-af9c-f9b57b03d536@github.com> On Thu, 25 Jan 2024 11:04:03 GMT, Kevin Walls wrote: > JavaThread's _monitor_chunks member is temporary storage used by deoptimization. > When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. > > There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. 
src/hotspot/share/jfr/leakprofiler/checkpoint/rootResolver.cpp line 253: > 251: // Traverse the monitor chunks > 252: // Consider removing, chunk will always be null. > 253: MonitorChunk* chunk = jt->monitor_chunks_safe(); This seems like a good candidate for removal, to be replaced by an assert. If we can walk the stack below, then either "jt" is the current thread, or we are at a safepoint. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17566#discussion_r1467039360 From dcubed at openjdk.org Thu Jan 25 22:51:37 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 25 Jan 2024 22:51:37 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v13] In-Reply-To: <-4blCtX3cNv-LQxfj3uisZ1CFD83mVNadvFYiX8UFik=.08d36172-cd61-4fc3-bebe-fa345b50d78a@github.com> References: <-4blCtX3cNv-LQxfj3uisZ1CFD83mVNadvFYiX8UFik=.08d36172-cd61-4fc3-bebe-fa345b50d78a@github.com> Message-ID: <731NqDu5tfSWZWic86kaLx0_1Z-Az_PZDXsVdlQ0psU=.ab0123f6-ccc1-4ef7-98ad-73aaff6c88be@github.com> On Tue, 23 Jan 2024 16:14:53 GMT, Axel Boldt-Christmas wrote: >> Implements the runtime part of JDK-8319796. >> The different CPU implementations are/will be created as dependent pull requests. >> >> This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enter. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two top most entires on the lock stack to determine what to do in a monitor exit. 
>> >> A high level overview: >> * Locking is still performed on the mark word >> * Unlocked (0b01) <=> Locked (0b00) >> * Monitor enter on Obj with mark word Unlocked (0b01) is the same >> * Transition Obj's mark word Unlocked (0b01) => Locked (0b00) >> * Push Obj onto the lock stack >> * Success >> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack >> * If top entry is Obj >> * Push Obj on the lock stack >> * Success >> * If top entry is not Obj >> * Inflate and call ObjectMonitor::enter >> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack >> * If just the top entry is Obj >> * Transition Obj's mark word Locked (0b00) => Unlocked (0b01) >> * Pop the entry >> * Success >> * If both entries are Obj >> * Pop the top entry >> * Success >> * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit >> * If the monitor has been inflated for object Obj which is owned by the current thread >> * All corresponding entries for Obj is removed from the lock stack >> * The monitor recursions is set to the number of removed entries - 1 >> * The owner is changed from anonymous to the thread >> * The regular ObjectMonitor::action is called. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Fix miss in is_recursive improvement This change is looking really good. This time I did my crawl thru using the webrev which is likely why I found some things that I missed in previous reviews. Sorry about that. I'm going to start doing crawl thru reviews of the X64 and aarch64 patches also. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16606#issuecomment-1911120611 From duke at openjdk.org Fri Jan 26 02:53:33 2024 From: duke at openjdk.org (Liming Liu) Date: Fri, 26 Jan 2024 02:53:33 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v29] In-Reply-To: <3MHoDMkp-_AHrKb6z9fEMjP1RbiHleGBguNxXKu9_kw=.22fbb7ba-8fc7-4494-b52d-6ae7936a1eda@github.com> References: <3MHoDMkp-_AHrKb6z9fEMjP1RbiHleGBguNxXKu9_kw=.22fbb7ba-8fc7-4494-b52d-6ae7936a1eda@github.com> Message-ID: On Thu, 25 Jan 2024 08:22:26 GMT, Thomas Stuefe wrote: >> Liming Liu has updated the pull request incrementally with two additional commits since the last revision: >> >> - Use TestThreadGroup >> - Set it as default before parsing > > src/hotspot/os/linux/os_linux.cpp line 2972: > >> 2970: ", %d) failed; error='%s' (errno=%d)", >> 2971: p2i(first), len, MADV_POPULATE_WRITE, >> 2972: os::strerror(err), err); > > What other things can go wrong here beside missing kernel support? > > Unconditional log output (with log_warning) is tricky. Many tools parse the JVM output and are thrown off by unexpected content. That's why we restrict log_warning to the small band of "stuff that can go wrong at a customer but it is so severe we really need to tell the customer right now". > > Stuff that should never go wrong should be assert()ed, or possibly guarantee()'d. > > Stuff that can go wrong but is not as severe, should be warned about at a lower level. > > In this case, output may get flooded with warnings if we continue running the VM and repeat the pretouch attempts with other areas. There are some cases shown in madvise_populate at linux/mm/madvise.c. But I'm not sure how to trigger them. So I would decrease the log level here. 
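As an illustration of the approach under review, a minimal Linux-only sketch of pretouching with `madvise(MADV_POPULATE_WRITE)` and falling back to touching each page when the call fails (for example on a kernel without support). `pretouch` is a hypothetical helper, not the code in `os_linux.cpp`. The fallback uses the GCC/Clang `__atomic_fetch_add` builtin to add zero, and the `#define` supplies the advice value for builds against older headers. Where a real VM would log the failure, and at which level, is the question discussed above and is left out here.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23  // Linux uapi value, for builds against older headers
#endif

// Hypothetical helper: returns true if the kernel populated the range,
// false if we fell back to touching one byte per page.
// 'start' is assumed to be page-aligned.
bool pretouch(void* start, size_t len) {
  if (madvise(start, len, MADV_POPULATE_WRITE) == 0) {
    return true;
  }
  // Fallback: an atomic add of 0 faults each page in for writing without
  // changing contents that another thread may already have stored.
  const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  char* p = static_cast<char*>(start);
  for (size_t off = 0; off < len; off += page) {
    __atomic_fetch_add(p + off, 0, __ATOMIC_RELAXED);
  }
  return false;
}
```

Either path leaves the memory contents unchanged, so the caller does not need to know which one ran.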
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1467176397 From duke at openjdk.org Fri Jan 26 03:07:02 2024 From: duke at openjdk.org (Liming Liu) Date: Fri, 26 Jan 2024 03:07:02 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v30] In-Reply-To: References: Message-ID: <9d5UWcRhfhgqpUkvy2dv77bATgCKYFjxNTDreBfk4MI=.5682e46d-b448-4936-8e98-14549669d3dc@github.com> > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
Kernel-XX:-TransparentHugePages-XX:+TransparentHugePages
UnpatchedPatchedUnpatchedPatched
4.1811.3011.300.250.25
5.130.220.223.423.42
6.10.270.333.540.33
Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Make it true by default and use a lower log level when fail ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/55946581..3ac920fd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=29 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=28-29 Stats: 5 lines in 2 files changed: 0 ins; 1 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From duke at openjdk.org Fri Jan 26 05:43:00 2024 From: duke at openjdk.org (kuaiwei) Date: Fri, 26 Jan 2024 05:43:00 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier [v4] In-Reply-To: References: Message-ID: > Details is https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071921.html. > Using a combined dmb.ish for release barrier will introduce a heavy storeload barrier. Use "dmb.ishst+dmb.ishld" pair instead, we can gain performance improvement on N1 and N2 architecture. The benchmark is test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java > Run with ParallelGC to minimalize impact of gc barrier. > > make test TEST="micro:org.openjdk.bench.vm.compiler.FinalFieldInitialize" MICRO="VM_OPTIONS=-XX:+UseParallelGC" > ... > FinalFieldInitialize.testAllocWithFinal thrpt 9 1411.601 ? 6.546 ops/s > > Without the patch > > FinalFieldInitialize.testAllocWithFinal thrpt 9 1214.575 ? 
14.217 ops/s kuaiwei has updated the pull request incrementally with one additional commit since the last revision: change AlwaysMergeDMB diagnostic option ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17511/files - new: https://git.openjdk.org/jdk/pull/17511/files/056c1859..b842951d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17511&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17511&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17511.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17511/head:pull/17511 PR: https://git.openjdk.org/jdk/pull/17511 From mbaesken at openjdk.org Fri Jan 26 07:58:48 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 26 Jan 2024 07:58:48 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR Message-ID: Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. PhysicalMemory could be enhanced or a new event added. There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. 
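To make the proposal concrete, a minimal sketch of reading total and free swap space on Linux via `sysinfo(2)`. `SwapInfo` and `read_swap` are hypothetical names for illustration; this is not the code in the pull request, and other platforms would need their own retrieval.

```cpp
#include <sys/sysinfo.h>
#include <cstdint>

// Hypothetical result type: sizes in bytes, ok == false if the call failed.
struct SwapInfo {
  uint64_t total_bytes;
  uint64_t free_bytes;
  bool ok;
};

// Hypothetical helper: query the kernel once and scale by mem_unit.
SwapInfo read_swap() {
  struct sysinfo si;
  if (sysinfo(&si) != 0) {
    return {0, 0, false};
  }
  const uint64_t unit = si.mem_unit;
  return {static_cast<uint64_t>(si.totalswap) * unit,
          static_cast<uint64_t>(si.freeswap) * unit,
          true};
}
```

On a machine without swap both values are simply zero, which an event consumer would need to tell apart from a failed query.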
------------- Commit messages: - JDK-8324287 Changes: https://git.openjdk.org/jdk/pull/17581/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324287 Stats: 161 lines in 11 files changed: 160 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17581.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17581/head:pull/17581 PR: https://git.openjdk.org/jdk/pull/17581 From fyang at openjdk.org Fri Jan 26 08:10:37 2024 From: fyang at openjdk.org (Fei Yang) Date: Fri, 26 Jan 2024 08:10:37 GMT Subject: RFR: 8324186: AARCH64: Use "dmb.ishst+dmb.ishld" for release barrier [v4] In-Reply-To: References: Message-ID: On Fri, 26 Jan 2024 05:43:00 GMT, kuaiwei wrote: >> Details is https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071921.html. >> Using a combined dmb.ish for release barrier will introduce a heavy storeload barrier. Use "dmb.ishst+dmb.ishld" pair instead, we can gain performance improvement on N1 and N2 architecture. The benchmark is test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java >> Run with ParallelGC to minimalize impact of gc barrier. >> >> make test TEST="micro:org.openjdk.bench.vm.compiler.FinalFieldInitialize" MICRO="VM_OPTIONS=-XX:+UseParallelGC" >> ... >> FinalFieldInitialize.testAllocWithFinal thrpt 9 1411.601 ± 6.546 ops/s >> >> Without the patch >> >> FinalFieldInitialize.testAllocWithFinal thrpt 9 1214.575 ± 14.217 ops/s > > kuaiwei has updated the pull request incrementally with one additional commit since the last revision: > > change AlwaysMergeDMB diagnostic option LGTM. Thanks. ------------- Marked as reviewed by fyang (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/17511#pullrequestreview-1845236858 From stuefe at openjdk.org Fri Jan 26 08:23:40 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 26 Jan 2024 08:23:40 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v9] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: <35wH1ZJVECbArnyk5QmOW1xK66UqhL2478hxCxvIY8E=.7db8c7d3-6979-49ee-9ebd-a6f1b2a65531@github.com> On Thu, 25 Jan 2024 11:04:03 GMT, Suchismith Roy wrote: > > For me the unresolved question is still: > > > > * do we want an unconditional load of *.a for a given *.so (have yet to see any documentation for this a-file duality) > > Yes. The documentation link - https://www.ibm.com/docs/en/aix/7.3?topic=memory-shared-objects-run-time-linking The text **In dynamic mode, input files specified with the -l flag may end in .so, as well as in .a. That is, a reference to -lfoo is satisfied by the first libfoo.so or libfoo.a found in any of the directories being searched. Dynamic mode is in effect by default unless the -bstatic option is used.** > > https://www.ibm.com/docs/en/aix/7.3?topic=l-ld-command > > Archive files are composite objects, which usually contain import files and object files, including shared objects. If an archive file contains another archive file or a member whose type is not recognized, the ld command issues a warning and ignores the unrecognized member. If an object file contained in an archive file has the F_LOADONLY bit set in the XCOFF header, the ld command ignores the member. This bit is usually used to designate old versions of shared objects that remain in the archive file to allow existing applications to load and run. New applications link with the new version of the shared object, that is, another member of the archive. Excellent, thank you. > > > * if we do, do we want that to be bidirectional? 
Someone specifies *.a, do we want to attempt to load *.so? > > Considering the different scenarios, loading .a after .so failure should suffice. I got a chance to look at the right file in OpenJ9-omr ,which has a native code which does an attempt to load archive files after trying to load .so files. This code was always there and it explains why the issue did not occur in Semeru, which is derived from this repository. Okay. We don't have to be better than J9 then. If they do it, we should too. So, for the following input, we do: "library.so" -> load "library.so", then "library.a" "library" -> load "library.so", then "library.a" ? "library.a" -> only load "library.a" ? (*) > > > When in doubt, we should just mimic what OpenJ9 is doing on AIX. But I would like a clear documentation as a comment in os_aix.cpp explaining the logic and referencing the relevant OpenJ9 files. > > Any example comment you can refer ? I mean i just mention the file name in OpenJ9 and explain the logic ? Let me know for any further clarifications Just reference the excerpts you mentioned above, then describe your intended logic. Example: "When loading .so, upon failure we attempt to load . When loading a library given without extension, ..." Explaining the logic makes it easy to see for the casual code reader what your intent is, that you have thought of all cases (*), and makes it possible to check the coding against your intent. 
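One possible reading of the three cases listed above, sketched as a pure function that returns the names to try in order. `load_candidates` and `has_suffix` are hypothetical helpers for illustration only. The cases marked with a question mark in the mail are open questions, so the choices made for them here are assumptions, and the actual loading call is left out.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: true if 's' ends with 'suffix'.
static bool has_suffix(const std::string& s, const std::string& suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: names to attempt, in order, for a requested library.
//   "library.so" -> "library.so", then "library.a"
//   "library"    -> "library.so", then "library.a"   (assumed)
//   "library.a"  -> "library.a" only                 (assumed)
std::vector<std::string> load_candidates(const std::string& name) {
  if (has_suffix(name, ".a")) {
    return {name};
  }
  if (has_suffix(name, ".so")) {
    const std::string stem = name.substr(0, name.size() - 3);
    return {name, stem + ".a"};
  }
  return {name + ".so", name + ".a"};
}
```

Writing the mapping down this way makes it easy to compare against the comment that is being requested for os_aix.cpp.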
------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1911652995 From stuefe at openjdk.org Fri Jan 26 08:24:42 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 26 Jan 2024 08:24:42 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v30] In-Reply-To: <9d5UWcRhfhgqpUkvy2dv77bATgCKYFjxNTDreBfk4MI=.5682e46d-b448-4936-8e98-14549669d3dc@github.com> References: <9d5UWcRhfhgqpUkvy2dv77bATgCKYFjxNTDreBfk4MI=.5682e46d-b448-4936-8e98-14549669d3dc@github.com> Message-ID: <9j02nKW8CUJBwHLNwjm6TNAQ46UvenuWz2PUV1tbXcw=.984b047d-1add-40ab-aa54-630ffe0d307a@github.com> On Fri, 26 Jan 2024 03:07:02 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). >> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: >> >> >> >> >> >> >> >> >> >> >> >>
Kernel-XX:-TransparentHugePages-XX:+TransparentHugePages
UnpatchedPatchedUnpatchedPatched
4.1811.3011.300.250.25
5.130.220.223.423.42
6.10.270.333.540.33
> > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Make it true by default and use a lower log level when fail Looks good to me (but you need at least two reviewers here). Thanks for your perseverance. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15781#pullrequestreview-1845254252 From aboldtch at openjdk.org Fri Jan 26 08:56:54 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 26 Jan 2024 08:56:54 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v14] In-Reply-To: References: Message-ID: > Implements the runtime part of JDK-8319796. > The different CPU implementations are/will be created as dependent pull requests. > > This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enter. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two top most entires on the lock stack to determine what to do in a monitor exit. 
>
> A high level overview:
> * Locking is still performed on the mark word
>   * Unlocked (0b01) <=> Locked (0b00)
> * Monitor enter on Obj with mark word Unlocked (0b01) is the same
>   * Transition Obj's mark word Unlocked (0b01) => Locked (0b00)
>   * Push Obj onto the lock stack
>   * Success
> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack
>   * If top entry is Obj
>     * Push Obj on the lock stack
>     * Success
>   * If top entry is not Obj
>     * Inflate and call ObjectMonitor::enter
> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack
>   * If just the top entry is Obj
>     * Transition Obj's mark word Locked (0b00) => Unlocked (0b01)
>     * Pop the entry
>     * Success
>   * If both entries are Obj
>     * Pop the top entry
>     * Success
>   * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit
> * If the monitor has been inflated for object Obj which is owned by the current thread
>   * All corresponding entries for Obj are removed from the lock stack
>   * The monitor's recursions count is set to the number of removed entries - 1
>   * The owner is changed from anonymous to the thread
>   * The regular ObjectMonitor::action is called.
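The lock-stack part of the overview above can be sketched as a small model. This is a simplified illustration under stated assumptions, not HotSpot's `LockStack`: `LockStackModel`, its fixed capacity of 8 and the `remove` helper are hypothetical, and the mark word transitions and monitor inflation are left out. It only shows that a recursive enter looks at the top entry, a recursive exit looks at the two top entries, and inflation removes every entry for the object.

```cpp
#include <cassert>

// Hypothetical, simplified model of a per-thread lock stack.
class LockStackModel {
  static constexpr int CAPACITY = 8;  // arbitrary size for the sketch
  const void* _base[CAPACITY];
  int _top = 0;

 public:
  int size() const { return _top; }
  bool is_full() const { return _top == CAPACITY; }

  // First (non-recursive) enter: the caller has locked the mark word.
  void push(const void* o) {
    assert(!is_full());
    _base[_top++] = o;
  }

  // Recursive enter succeeds only if o is already the top entry.
  // A false result corresponds to "inflate and call ObjectMonitor::enter".
  bool try_recursive_enter(const void* o) {
    if (_top == 0 || _base[_top - 1] != o || is_full()) {
      return false;
    }
    _base[_top++] = o;
    return true;
  }

  // Recursive exit succeeds only if both top entries are o.
  bool try_recursive_exit(const void* o) {
    if (_top <= 1 || _base[_top - 1] != o || _base[_top - 2] != o) {
      return false;
    }
    --_top;
    return true;
  }

  // On inflation: drop every entry for o and report how many were removed.
  // The overview sets the monitor's recursions to this count minus one.
  int remove(const void* o) {
    int kept = 0;
    for (int i = 0; i < _top; i++) {
      if (_base[i] != o) {
        _base[kept++] = _base[i];
      }
    }
    const int removed = _top - kept;
    _top = kept;
    return removed;
  }
};
```

With entries `a, a, b` on the stack, a further enter on `a` fails the top-entry check, and removing `a` yields a count of 2, so recursions would become 1.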
Axel Boldt-Christmas has updated the pull request incrementally with three additional commits since the last revision: - Add verify calls - Assert valid lock stack offset - Typos, wording and whitespace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16606/files - new: https://git.openjdk.org/jdk/pull/16606/files/ae2bfca3..1084381e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16606&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16606&range=12-13 Stats: 16 lines in 3 files changed: 11 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/16606.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16606/head:pull/16606 PR: https://git.openjdk.org/jdk/pull/16606 From aboldtch at openjdk.org Fri Jan 26 08:56:54 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 26 Jan 2024 08:56:54 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v13] In-Reply-To: <5oOncmWIUEnwPhmOI1AdX3bsCxJPZY99qZFaWyT1hsM=.88cc63dd-a2c5-40bb-859b-520b3e47c83d@github.com> References: <-4blCtX3cNv-LQxfj3uisZ1CFD83mVNadvFYiX8UFik=.08d36172-cd61-4fc3-bebe-fa345b50d78a@github.com> <5oOncmWIUEnwPhmOI1AdX3bsCxJPZY99qZFaWyT1hsM=.88cc63dd-a2c5-40bb-859b-520b3e47c83d@github.com> Message-ID: On Thu, 25 Jan 2024 22:01:16 GMT, Daniel D. Daugherty wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix miss in is_recursive improvement > > src/hotspot/share/runtime/lockStack.inline.hpp line 50: > >> 48: >> 49: inline bool LockStack::is_full() const { >> 50: return to_index(_top) == CAPACITY; > > Would it be too paranoid to use `>= CAPACITY`? > Or to add an assert that the index is not greater than capacity? Added an assert to `to_index` that the offset is valid, also added verify calls to the `_recursive` functions. 
> src/hotspot/share/runtime/lockStack.inline.hpp line 103:
>
>> 101: }
>> 102: if (_base[i] == o) {
>> 103: // o can only occur in one consecutive run on the lock-stack.
>
> I'm not sure that the claim on L103 is always true. If we have a lock stack like this:
>
>
> _base[end - 1] = o1;
> _base[end - 2] = o2;
> _base[end - 3] = o1;
> _base[end - 4] = o1;
>
>
> When our `o == o1` we don't have a recursive run on the top-most part of
> the lock stack, but we do have one that's lower down. L103 isn't correct in
> this case, but that doesn't matter because we actually care about whether
> the top most run is recursive. I think L103 can be deleted and the rest of
> the comment is okay.

The algorithm always inflates interleaved recursive locking. That lock stack would not occur. The last enter on `o1` would see that `o1` is fast locked, but `try_recursive_enter` would fail (`o1` not top of lock stack) and inflate `o1`, which removes all `o1` entries from the lock-stack.

> src/hotspot/share/runtime/objectMonitor.inline.hpp line 106:
>
>> 104:
>> 105: inline void ObjectMonitor::set_recursions(size_t recursions) {
>> 106: assert(_recursions == 0, "must be");
>
> Why have the `recursions` parameter if the passed value must always be zero?
>
> Update: It looks like you might be trying to detect some out of sync count coming
> from the removal of the object from the lock stack. You expect it to always be a
> count value of 1 removal and if more than 1 is removed you want to assert().

The assert is on the pre-value of `ObjectMonitor::_recursions`. The parameter can have any value. Maybe the assert is out of place.

From the recursive lightweight locking perspective `ObjectMonitor::set_recursions` is part of the ObjectMonitor's initialisation. Some other thread has created the ObjectMonitor for the locking thread, but filled in the placeholder values: an anonymous owner and recursions set to 0.
The locking thread will at some point notice this and finish initialising the ObjectMonitor by setting the _owner and _recursions fields. There are also assumptions in the C2 code that the _recursions field, when inflated by a non-owning thread, is set to 0.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467370202
PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467370523
PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467370846

From aboldtch at openjdk.org Fri Jan 26 09:21:00 2024
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Fri, 26 Jan 2024 09:21:00 GMT
Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v15]
In-Reply-To:
References:
Message-ID:

> Implements the runtime part of JDK-8319796.
> The different CPU implementations are/will be created as dependent pull requests.
>
> This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enter. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two topmost entries on the lock stack to determine what to do in a monitor exit.
>
> A high level overview:
> * Locking is still performed on the mark word
>   * Unlocked (0b01) <=> Locked (0b00)
> * Monitor enter on Obj with mark word Unlocked (0b01) is the same
>   * Transition Obj's mark word Unlocked (0b01) => Locked (0b00)
>   * Push Obj onto the lock stack
>   * Success
> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack
>   * If top entry is Obj
>     * Push Obj on the lock stack
>     * Success
>   * If top entry is not Obj
>     * Inflate and call ObjectMonitor::enter
> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack
>   * If just the top entry is Obj
>     * Transition Obj's mark word Locked (0b00) => Unlocked (0b01)
>     * Pop the entry
>     * Success
>   * If both entries are Obj
>     * Pop the top entry
>     * Success
>   * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit
> * If the monitor has been inflated for object Obj which is owned by the current thread
>   * All corresponding entries for Obj are removed from the lock stack
>   * The monitor recursions count is set to the number of removed entries - 1
>   * The owner is changed from anonymous to the thread
>   * The regular ObjectMonitor::action is called.

Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 50 commits:

 - Update copyright
 - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319797
 - Add verify calls
 - Assert valid lock stack offset
 - Typos, wording and whitespace
 - Fix miss in is_recursive improvement
 - Added comment about the rational behind full lock stack inflation. May need rewording
 - Add logging when lock stack capacity is exceeded.
 - Remove inaccurate comment
 - Correct nomenclature balanced vs structured.
 - ...
and 40 more: https://git.openjdk.org/jdk/compare/c313d451...8df7f441 ------------- Changes: https://git.openjdk.org/jdk/pull/16606/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16606&range=14 Stats: 866 lines in 13 files changed: 817 ins; 7 del; 42 mod Patch: https://git.openjdk.org/jdk/pull/16606.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16606/head:pull/16606 PR: https://git.openjdk.org/jdk/pull/16606 From duke at openjdk.org Fri Jan 26 09:22:47 2024 From: duke at openjdk.org (Liming Liu) Date: Fri, 26 Jan 2024 09:22:47 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v30] In-Reply-To: <9d5UWcRhfhgqpUkvy2dv77bATgCKYFjxNTDreBfk4MI=.5682e46d-b448-4936-8e98-14549669d3dc@github.com> References: <9d5UWcRhfhgqpUkvy2dv77bATgCKYFjxNTDreBfk4MI=.5682e46d-b448-4936-8e98-14549669d3dc@github.com> Message-ID: On Fri, 26 Jan 2024 03:07:02 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). >> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: >> >> >> >> >> >> >> >> >> >> >> >>
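>> | Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
>> |--------|-------|-------|------|------|
>> | 4.18   | 11.30 | 11.30 | 0.25 | 0.25 |
>> | 5.13   | 0.22  | 0.22  | 3.42 | 3.42 |
>> | 6.1    | 0.27  | 0.33  | 3.54 | 0.33 |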
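
The pretouch approach described in this PR can be sketched roughly as follows. This is a hedged, Linux-only illustration and not the patch itself: the function name `pretouch_range` is hypothetical, the fallback `#define` assumes the Linux UAPI value for `MADV_POPULATE_WRITE`, and the page-by-page fallback is a plain read and write-back that merely stands in for HotSpot's atomic-add-0 loop (it is not safe against concurrent writers).

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Older libc headers may not define the constant; 23 is the value used by
// the Linux UAPI headers (assumption: this sketch targets Linux only).
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

// Hypothetical helper: pretouch [start, start + bytes).
// Returns true if the kernel populated the range via madvise, false if the
// page-by-page fallback was used (for example on kernels older than 5.14).
bool pretouch_range(void* start, std::size_t bytes) {
  assert(bytes > 0);
  if (madvise(start, bytes, MADV_POPULATE_WRITE) == 0) {
    return true;
  }
  // Fallback: touch one byte per page without changing its value.
  const long page = sysconf(_SC_PAGESIZE);
  assert(page > 0);
  volatile char* p = static_cast<volatile char*>(start);
  for (std::size_t off = 0; off < bytes; off += static_cast<std::size_t>(page)) {
    p[off] = p[off];
  }
  return false;
}
```

Either way the range ends up backed by memory; the return value only tells the caller which path was taken, which is the kind of information the PR logs at a lower level when the call fails.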
> > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Make it true by default and use a lower log level when fail This PR hasn't been set to require two reviewers yet. Should I set it now? ------------- PR Comment: https://git.openjdk.org/jdk/pull/15781#issuecomment-1911726221 From aboldtch at openjdk.org Fri Jan 26 09:22:48 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 26 Jan 2024 09:22:48 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v14] In-Reply-To: References: Message-ID: <-ARL6dSZSoFCjrdSJpaqyAQcuoq0Rc9vKbUYkE-XeD4=.0fcd9ea4-0d4b-4b8f-8cae-20d41a8a3867@github.com> > Implements the x86 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The x86 C2 port also has some extra oddities. 
> > The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. > > The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. > > The contended unlock was also moved to the code stub. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Update variable names in ad files - Preload markWord unconditionally - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Add more expressive stub continuation names - Remove outdated anonymous owner fix in stub - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Remove C2HandleAnonOMOwnerStub definitions on x86. - Add MFENCE comment - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - ... 
and 11 more: https://git.openjdk.org/jdk/compare/7efd4d21...4d37c4b7 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16607/files - new: https://git.openjdk.org/jdk/pull/16607/files/6de1d69b..4d37c4b7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=12-13 Stats: 2574 lines in 165 files changed: 1801 ins; 427 del; 346 mod Patch: https://git.openjdk.org/jdk/pull/16607.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16607/head:pull/16607 PR: https://git.openjdk.org/jdk/pull/16607 From aboldtch at openjdk.org Fri Jan 26 09:24:00 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 26 Jan 2024 09:24:00 GMT Subject: RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v11] In-Reply-To: References: Message-ID: > Implements the aarch64 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). 
Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 17 additional commits since the last revision: - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 - Drop memory order comments - Preloads markWord unconditionally - Revert "Add preload_mark to MacroAssembler::lightweight_lock" This reverts commit 8950f503aa5dba0e203613bd9737ea0d50388ca3. - Add preload_mark to MacroAssembler::lightweight_lock - Rename box to t1 - Remove third tmp from fast_lock - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319801 - Switch to CAS over LXSX - Fix missing $ - ... 
and 7 more: https://git.openjdk.org/jdk/compare/17554dbf...d0a02754 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16608/files - new: https://git.openjdk.org/jdk/pull/16608/files/e4d5dcd7..d0a02754 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16608&range=09-10 Stats: 2574 lines in 165 files changed: 1801 ins; 427 del; 346 mod Patch: https://git.openjdk.org/jdk/pull/16608.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16608/head:pull/16608 PR: https://git.openjdk.org/jdk/pull/16608 From mdoerr at openjdk.org Fri Jan 26 09:30:33 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 26 Jan 2024 09:30:33 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v9] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: <4nhlEYb7mye1qtUk_wptqHTED6Lw7jsgA48sTpWWbXs=.cd2aef1d-9f7f-499e-807f-ed294aee0f40@github.com> On Tue, 16 Jan 2024 08:36:49 GMT, Suchismith Roy wrote: >> J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. >> After searching for ibm_16_am.so ,the jvm agent throws and error as dll_load fails.It fails to identify the shared library ibm_16_am.a shared archive file on AIX. >> Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. > > Suchismith Roy has updated the pull request incrementally with three additional commits since the last revision: > > - Update porting_aix.cpp > - Update porting_aix.cpp > - Update os_aix.cpp I think a comment like this would be ideal: Search order: "library.so" -> load "library.so", then "library.a" "library" -> load "library.so", then "library.a" ? "library.a" -> only load "library.a" ? (If that's correct.) I prefer that over a very verbose version. + References to Spec and J9 of course. 
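
The search order proposed in the comment above could be expressed as a small helper that maps a requested name to the file names to try, in order. This is a sketch only: `library_candidates` and `has_suffix` are hypothetical names, not functions from the patch, and the rows for a bare `library` and for `library.a` follow the entries that the comment itself marks with a question mark, so they are assumptions rather than confirmed behavior.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of HotSpot.
static bool has_suffix(const std::string& s, const std::string& suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns the file names to try, in order, for a requested library name.
std::vector<std::string> library_candidates(const std::string& name) {
  assert(!name.empty());
  if (has_suffix(name, ".a")) {
    return {name};  // "library.a" -> only "library.a"
  }
  if (has_suffix(name, ".so")) {
    const std::string stem = name.substr(0, name.size() - 3);
    return {name, stem + ".a"};  // "library.so" -> "library.so", then "library.a"
  }
  return {name + ".so", name + ".a"};  // "library" -> "library.so", then "library.a"
}
```

A caller would try each candidate until one loads, which keeps the fallback to the `.a` archive in one place.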
------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1911737129 From jsjolen at openjdk.org Fri Jan 26 09:33:46 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 26 Jan 2024 09:33:46 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v30] In-Reply-To: <9d5UWcRhfhgqpUkvy2dv77bATgCKYFjxNTDreBfk4MI=.5682e46d-b448-4936-8e98-14549669d3dc@github.com> References: <9d5UWcRhfhgqpUkvy2dv77bATgCKYFjxNTDreBfk4MI=.5682e46d-b448-4936-8e98-14549669d3dc@github.com> Message-ID: On Fri, 26 Jan 2024 03:07:02 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). >> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: >> >> >> >> >> >> >> >> >> >> >> >>
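>> | Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
>> |--------|-------|-------|------|------|
>> | 4.18   | 11.30 | 11.30 | 0.25 | 0.25 |
>> | 5.13   | 0.22  | 0.22  | 3.42 | 3.42 |
>> | 6.1    | 0.27  | 0.33  | 3.54 | 0.33 |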
> > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Make it true by default and use a lower log level when fail Approved! @limingliu-ampere, you don't need to change it. HotSpot reviews always require 2 reviewers but we never set the option. Thank you for your hard work. ------------- Marked as reviewed by jsjolen (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15781#pullrequestreview-1845360471 From bulasevich at openjdk.org Fri Jan 26 10:04:47 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Fri, 26 Jan 2024 10:04:47 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v5] In-Reply-To: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: > These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. > > Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. 
Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: minor update ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17244/files - new: https://git.openjdk.org/jdk/pull/17244/files/9bd9da95..ea39f93d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=03-04 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/17244.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17244/head:pull/17244 PR: https://git.openjdk.org/jdk/pull/17244 From bulasevich at openjdk.org Fri Jan 26 10:04:51 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Fri, 26 Jan 2024 10:04:51 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v2] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: On Thu, 18 Jan 2024 15:02:11 GMT, Evgeny Astigeevich wrote: >> Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> cleanup & test udpdate > > src/hotspot/share/code/codeCache.cpp line 181: > >> 179: GrowableArray* CodeCache::_allocable_heaps = new(mtCode) GrowableArray (static_cast(CodeBlobType::All), mtCode); >> 180: >> 181: void CodeCache::report_cache_minimal_size_error(const char *codeheap, size_t size, size_t required_size) { > > I suggest to have a function: > > static void check_min_size(... 
code_heap, size_t min_required_size) { > if (code_heap.enabled && code_heap.size >= min_required_size) > return; > > log_debug(codecache)("Code heap (%s) size " SIZE_FORMAT " below required minimal size " SIZE_FORMAT, > code_heap.name, code_heap.size, min_required_size); > > err_msg title("Not enough space in %s to run VM", code_heap.name); > err_msg message(SIZE_FORMAT "K < " SIZE_FORMAT "K", code_heap.size / K, min_required_size / K); > vm_exit_during_initialization(title, message); > } ok > src/hotspot/share/code/codeCache.cpp line 185: > >> 183: codeheap, (long long) size, (long long) required_size); >> 184: err_msg title("Not enough space in %s to run VM", codeheap); >> 185: err_msg message(SIZE_FORMAT "K < " SIZE_FORMAT "K", size, required_size); > > Missed `/ K` thanks! > src/hotspot/share/code/codeCache.cpp line 232: > >> 230: // segment size ever if it was set explicitly. >> 231: non_profiled.size += profiled.size; >> 232: // Profiled segment is not available, forcibly set size to 0 > > Profiled code heap is not available, ... ok ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1467444024 PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1467443324 PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1467443079 From bulasevich at openjdk.org Fri Jan 26 10:04:52 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Fri, 26 Jan 2024 10:04:52 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v4] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> <_vL8VcUX5WtIwspeQ6xYZZb_Bwe1U9J9Agxk4kb6oaU=.9c6f2cd6-6964-4d82-83c3-1ca5a8039ea7@github.com> Message-ID: On Mon, 22 Jan 2024 16:10:28 GMT, Evgeny Astigeevich wrote: >> Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. 
The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> apply suggestions > > src/hotspot/share/code/codeCache.cpp line 186: > >> 184: } else { >> 185: log_debug(codecache)("CodeCache minimum size fail for %s %lld vs %lld", >> 186: codeheap, (long long) size, (long long) required_size); > > log_debug(codecache)("Code heap (%s) size " SIZE_FORMAT " below required minimal size " SIZE_FORMAT, > code_heap, size, required_size); ok ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1467444381 From bulasevich at openjdk.org Fri Jan 26 10:04:53 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Fri, 26 Jan 2024 10:04:53 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v2] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: On Tue, 16 Jan 2024 18:36:22 GMT, Evgeny Astigeevich wrote: >> src/hotspot/share/code/codeCache.hpp line 469: >> >>> 467: typedef CodeBlobIterator AllCodeBlobsIterator; >>> 468: >>> 469: struct CodeCacheSegment { >> >> Would this create a confusion? >> `CodeHeap` consists of blocks and names them `segments`. See `src/hotspot/share/memory/heap.hpp` and `src/hotspot/share/memory/heap.cpp`. >> There is `CodeCache::allocated_segments()` which returns the total number of segments (memory blocks) code heaps use. Also there is `CodeCacheSegmentSize` which defines the size of a memory block for a code heap. > > Maybe `CodeHeapInfo` would be better? 
ok ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1467442944 From tschatzl at openjdk.org Fri Jan 26 10:41:27 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 26 Jan 2024 10:41:27 GMT Subject: RFR: 8324301: Obsolete MaxGCMinorPauseMillis In-Reply-To: <8_luDLuWf0--KpMbX-OD8RS0lnaiYYHbbyJ9FSCZJ2g=.ac949af5-4cac-4e8b-a44d-7a9eaa9803fe@github.com> References: <8_luDLuWf0--KpMbX-OD8RS0lnaiYYHbbyJ9FSCZJ2g=.ac949af5-4cac-4e8b-a44d-7a9eaa9803fe@github.com> Message-ID: On Mon, 22 Jan 2024 11:33:14 GMT, Albert Mingkun Yang wrote: > Simple obsoleting a deprecated jvm flag. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17517#pullrequestreview-1845476173 From ayang at openjdk.org Fri Jan 26 13:08:44 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 26 Jan 2024 13:08:44 GMT Subject: RFR: 8324301: Obsolete MaxGCMinorPauseMillis In-Reply-To: <8_luDLuWf0--KpMbX-OD8RS0lnaiYYHbbyJ9FSCZJ2g=.ac949af5-4cac-4e8b-a44d-7a9eaa9803fe@github.com> References: <8_luDLuWf0--KpMbX-OD8RS0lnaiYYHbbyJ9FSCZJ2g=.ac949af5-4cac-4e8b-a44d-7a9eaa9803fe@github.com> Message-ID: On Mon, 22 Jan 2024 11:33:14 GMT, Albert Mingkun Yang wrote: > Simple obsoleting a deprecated jvm flag. Thanks for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17517#issuecomment-1912037372 From ayang at openjdk.org Fri Jan 26 13:08:45 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 26 Jan 2024 13:08:45 GMT Subject: Integrated: 8324301: Obsolete MaxGCMinorPauseMillis In-Reply-To: <8_luDLuWf0--KpMbX-OD8RS0lnaiYYHbbyJ9FSCZJ2g=.ac949af5-4cac-4e8b-a44d-7a9eaa9803fe@github.com> References: <8_luDLuWf0--KpMbX-OD8RS0lnaiYYHbbyJ9FSCZJ2g=.ac949af5-4cac-4e8b-a44d-7a9eaa9803fe@github.com> Message-ID: On Mon, 22 Jan 2024 11:33:14 GMT, Albert Mingkun Yang wrote: > Simple obsoleting a deprecated jvm flag. This pull request has now been integrated. 
Changeset: 32ddcf50 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/32ddcf504c1f67e3d4bb0a6e8c9a523f4898dc74 Stats: 19 lines in 6 files changed: 1 ins; 17 del; 1 mod 8324301: Obsolete MaxGCMinorPauseMillis Reviewed-by: kbarrett, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/17517 From mbaesken at openjdk.org Fri Jan 26 14:43:50 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 26 Jan 2024 14:43:50 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v2] In-Reply-To: References: Message-ID: > Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. > > Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. > PhysicalMemory could be enhanced or a new event added. > > There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and > Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: remove comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17581/files - new: https://git.openjdk.org/jdk/pull/17581/files/466aed6c..824f3bc5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17581.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17581/head:pull/17581 PR: https://git.openjdk.org/jdk/pull/17581 From aboldtch at openjdk.org Fri Jan 26 16:02:42 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 26 Jan 2024 16:02:42 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v3] In-Reply-To: <4Bqh63jS6WGdDtL3wqDZBBJkvH0TiY5vgd5mI_CQrIU=.21db910c-b979-4f9b-8749-fc99653cc670@github.com> References: <4Bqh63jS6WGdDtL3wqDZBBJkvH0TiY5vgd5mI_CQrIU=.21db910c-b979-4f9b-8749-fc99653cc670@github.com> Message-ID: <8QPWOBabh_N8EE6mL15EmWC91lc_gqLS3uszsJ4gW4Y=.6fe4fed8-18cc-41ae-bd5c-6a81534d7a4f@github.com> On Thu, 25 Jan 2024 14:53:07 GMT, Erik ?sterlund wrote: >> ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. >> >> The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. 
This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state.
>>
>> With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler.
>>
>> I have tested the changes from tier1-7, and run through full aurora performance tests.
>
> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
>
> Whitespace fix
>
> Co-authored-by: Thomas Schatzl <59967451+tschatzl at users.noreply.github.com>

Just did an initial read through of the PR. Just added some cleanup suggestions. Also noticed something I thought looked wrong in the ARM32 port.

I also went through and tried to find the handful of places in the codebase where the term `ICHolder` (or its derivatives) was still used. Put them in a separate branch to not clutter this PR. Would be nice to take this all the way and not have stale comments or naming lurking about. (Also nuked the `DECC` copy-paste-typo)

Comment cleanups: f1bb02ea472eb314c93d80b830c59bd03e280116

All platforms use `data` as a register alias for the `CompiledICData*` register in the `ic_check`. But c2i and itable stubs still use `holder`. Maybe go all the way here? 5422ed32def491bd1e145959b7f3c49c88cfc50e

Also for PPC and s390 I think the code is easier to understand if the global inline cache register aliases these platforms have are used. But maybe that is just me. 39c0a7ede5187cba52d6fcf48c0852213c48c899

As for the implementation I could not see anything wrong (except the ARM32 port). But I'll leave it to people with more expertise in this area.

src/hotspot/cpu/arm/compiledIC_arm.cpp line 107:

> 105: address stub = find_stub();
> 106: guarantee(stub != nullptr, "stub not found");
> 107:

The other platforms removed the trace logging here.
If the ARM porters still want this in, at least update it to log the correct class name. `s/CompiledDirectStaticCall/CompiledDirectCall/`

src/hotspot/cpu/arm/sharedRuntime_arm.cpp line 631:

> 629:
> 630: __ ic_check(1 /* end_alignment */);
> 631: __ ldr(Rmethod, Address(receiver_klass, CompiledICData::speculated_method_offset()));

Maybe I am missing something here but this looks very wrong. The speculated `Klass*` gets loaded into `R4` (which `receiver_klass` aliases) in `ic_check`, so this load would result in loading an `InstanceKlass*` C++ vtable pointer. `Ricklass` (`R8` alias) contains the `CompiledICData*`. I would think the correct diff would be

- const Register receiver_klass = R4;
-
- __ load_klass(receiver_klass, receiver);
- __ ldr(holder_klass, Address(Ricklass, CompiledICHolder::holder_klass_offset()));
- __ ldr(Rmethod, Address(Ricklass, CompiledICHolder::holder_metadata_offset()));
- __ cmp(receiver_klass, holder_klass);
+ __ ic_check(1 /* end_alignment */);
+ __ ldr(Rmethod, Address(Ricklass, CompiledICData::speculated_method_offset()));

The fact that you say ARM32 tests are passing makes me doubt my understanding of the inline cache.

src/hotspot/share/code/compiledIC.cpp line 195:

> 193: c_ic->verify();
> 194: return c_ic;
> 195: }

Purely a style thing, but the `CompiledIC* CompiledIC_X(...)` functions could be rewritten in terms of each other. That made them all fit very well even on my small laptop screen.
```c++
CompiledIC* CompiledIC_before(CompiledMethod* nm, address return_addr) {
  address call_site = nativeCall_before(return_addr)->instruction_address();
  return CompiledIC_at(nm, call_site);
}

CompiledIC* CompiledIC_at(CompiledMethod* nm, address call_site) {
  RelocIterator iter(nm, call_site, call_site + 1);
  iter.next();
  return CompiledIC_at(&iter);
}

CompiledIC* CompiledIC_at(Relocation* call_reloc) {
  address call_site = call_reloc->addr();
  CompiledMethod* cm = CodeCache::find_blob(call_reloc->addr())->as_compiled_method();
  return CompiledIC_at(cm, call_site);
}

CompiledIC* CompiledIC_at(RelocIterator* reloc_iter) {
  CompiledIC* c_ic = new CompiledIC(reloc_iter);
  c_ic->verify();
  return c_ic;
}
```

src/hotspot/share/code/compiledIC.cpp line 571:

> 569: return true;
> 570: } else if (cb->is_vtable_blob()) {
> 571: return VtableStubs::is_icholder_entry(entry);

`VtableStubs::is_icholder_entry` is no longer used. Should be removed as well.

src/hotspot/share/code/compiledMethod.hpp line 381:

> 379: void run_nmethod_entry_barrier();
> 380:
> 381: // Verify and count cached icholder relocations.

The comment belonged to the removed method.

src/hotspot/share/oops/compiledICHolder.hpp line 44:

> 42:
> 43:
> 44: class CompiledICHolder : public CHeapObj {

Still has a forward declaration in `src/hotspot/share/oops/oopsHierarchy.hpp`

src/hotspot/share/runtime/vmStructs.cpp line 215:

> 213: volatile_nonstatic_field(ArrayKlass, _lower_dimension, ArrayKlass*) \
> 214: nonstatic_field(CompiledICHolder, _holder_metadata, Metadata*) \
> 215: nonstatic_field(CompiledICHolder, _holder_klass, Klass*) \

Making the serviceability agent aware of CompiledICData seems like a RFE. However CompiledICHolder has a mirror in the SA.
Should probably nuke `src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/CompiledICHolder.java` ------------- PR Review: https://git.openjdk.org/jdk/pull/17495#pullrequestreview-1845953963 PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1467778057 PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1467786778 PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1467796274 PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1467803589 PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1467797143 PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1467804557 PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1467806821 From ayang at openjdk.org Fri Jan 26 16:26:42 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 26 Jan 2024 16:26:42 GMT Subject: RFR: 8324771: Obsolete RAMFraction related flags Message-ID: Simple obsoleting four related deprecated jvm flags. ------------- Commit messages: - obsolete Changes: https://git.openjdk.org/jdk/pull/17592/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17592&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324771 Stats: 50 lines in 6 files changed: 4 ins; 45 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17592.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17592/head:pull/17592 PR: https://git.openjdk.org/jdk/pull/17592 From dcubed at openjdk.org Fri Jan 26 16:41:40 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 26 Jan 2024 16:41:40 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v15] In-Reply-To: References: Message-ID: <290Jv71Rn6o1i8RiWuiZFwdm0hf-PI6rcxCL_Qe8inw=.d04ed751-93de-41cb-81e9-f6ec332560ea@github.com> On Fri, 26 Jan 2024 09:21:00 GMT, Axel Boldt-Christmas wrote: >> Implements the runtime part of JDK-8319796. 
>> The different CPU implementations are/will be created as dependent pull requests.
>> 
>> This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enter. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two topmost entries on the lock stack to determine what to do in a monitor exit.
>> 
>> A high level overview:
>> * Locking is still performed on the mark word
>>   * Unlocked (0b01) <=> Locked (0b00)
>> * Monitor enter on Obj with mark word Unlocked (0b01) is the same
>>   * Transition Obj's mark word Unlocked (0b01) => Locked (0b00)
>>   * Push Obj onto the lock stack
>>   * Success
>> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack
>>   * If top entry is Obj
>>     * Push Obj on the lock stack
>>     * Success
>>   * If top entry is not Obj
>>     * Inflate and call ObjectMonitor::enter
>> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack
>>   * If just the top entry is Obj
>>     * Transition Obj's mark word Locked (0b00) => Unlocked (0b01)
>>     * Pop the entry
>>     * Success
>>   * If both entries are Obj
>>     * Pop the top entry
>>     * Success
>>   * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit
>> * If the monitor has been inflated for object Obj which is owned by the current thread
>>   * All corresponding entries for Obj are removed from the lock stack
>>   * The monitor's recursion count is set to the number of removed entries - 1
>>   * The owner is changed from anonymous to the thread
>>   * The regular ObjectMonitor::action is called.
> 
> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 50 commits: > > - Update copyright > - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319797 > - Add verify calls > - Assert valid lock stack offset > - Typos, wording and whitespace > - Fix miss in is_recursive improvement > - Added comment about the rational behind full lock stack inflation. May need rewording > - Add logging when lock stack capacity is exceeded. > - Remove inaccurate comment > - Correct nomenclature balanced vs structured. > - ... and 40 more: https://git.openjdk.org/jdk/compare/c313d451...8df7f441 Thanks for making the updates. One minor typo. src/hotspot/share/runtime/lockStack.inline.hpp line 43: > 41: assert(is_aligned(offset, oopSize), "Bad alignment: %u", offset); > 42: assert((offset <= end_offset()), "lockstack overflow: offset %d end_offset %d", offset, end_offset()); > 43: assert((offset >= start_offset()), "lockstack underflow: offset %d end_offset %d", offset, start_offset()); nit typo: s/end_offset/start_offset/ ------------- Marked as reviewed by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16606#pullrequestreview-1846102159 PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467872849 From dcubed at openjdk.org Fri Jan 26 16:41:42 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 26 Jan 2024 16:41:42 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v13] In-Reply-To: References: <-4blCtX3cNv-LQxfj3uisZ1CFD83mVNadvFYiX8UFik=.08d36172-cd61-4fc3-bebe-fa345b50d78a@github.com> <5oOncmWIUEnwPhmOI1AdX3bsCxJPZY99qZFaWyT1hsM=.88cc63dd-a2c5-40bb-859b-520b3e47c83d@github.com> Message-ID: On Fri, 26 Jan 2024 08:51:31 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/share/runtime/lockStack.inline.hpp line 103: >> >>> 101: } >>> 102: if (_base[i] == o) { >>> 103: // o can only occur in one consecutive run on the lock-stack. >> >> I'm not sure that the claim on L103 is always true. 
If we have a lock stack like this: >> >> >> _base[end - 1] = o1; >> _base[end - 2] = o2; >> _base[end - 3] = o1; >> _base[end - 4] = o1; >> >> >> When our `o == o1` we don't have a recursive run on the top-most part of >> the lock stack, but we do have one that's lower down. L103 isn't correct in >> this case, but that doesn't matter because we actually care about whether >> the top most run is recursive. I think L103 can be deleted and the rest of >> the comment is okay. > > The algorithm always inflates interleaved recursive locking. That lock stack would not occur. > The last enter on `o1` would see that `o1` is fast locked, but `try_recursive_enter` would fail (`o1` not top of lock stack) and inflate `o1` which removes all `o1` entiers of the lock-stack. Thanks for the explanation (and reminder about interleaved recursive locking). >> src/hotspot/share/runtime/objectMonitor.inline.hpp line 106: >> >>> 104: >>> 105: inline void ObjectMonitor::set_recursions(size_t recursions) { >>> 106: assert(_recursions == 0, "must be"); >> >> Why have the `recursions` parameter if the passed value must always be zero? >> >> Update: It looks like you might be trying to detect some out of sync count coming >> from the removal of the object from the lock stack. You expect it to always be a >> count value of 1 removal and if more than 1 is removed you want to assert(). > > The assert is on the pre-value of `ObjectMonitor::_recursions`. The parameter can have any value. > Maybe the assert is out of place. > From the recursive lightweight's perspective `ObjectMonitor::set_recursions` is part of the ObjectMonitors initialisation. Some other thread has created the ObjectMonitor for the locking thread, but filled in the placeholder values anonymous owner and set recursions to 0. The locking thread will at some point notice this and finish initialising the ObjectMonitor by setting the _owner and _recursions fields. 
There are also assumptions in the C2 code that the _recursions field when inflated by a non owning thread is set to 0.

My mistake. I misread that the assertion was against `_recursions` and not `recursions`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467875582
PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1467876531

From lucy at openjdk.org Fri Jan 26 16:45:26 2024
From: lucy at openjdk.org (Lutz Schmidt)
Date: Fri, 26 Jan 2024 16:45:26 GMT
Subject: RFR: 8315762: Update subtype check profile collection on s390x following 8308869 [v2]
In-Reply-To: <67cqHtFszZQRHZGhqR7LThot4PciYc1tznqzMTV348s=.1c6acb58-170d-44e8-a9ab-171be82f160a@github.com>
References: <67cqHtFszZQRHZGhqR7LThot4PciYc1tznqzMTV348s=.1c6acb58-170d-44e8-a9ab-171be82f160a@github.com>
Message-ID:

On Thu, 25 Jan 2024 15:20:59 GMT, Amit Kumar wrote:

>> s390x Implementation for https://github.com/openjdk/jdk/pull/14375
>> 
>> Benchmark Result with patch:
>> 
>> Benchmark                                         (typePollution)  (typePollutionNotInternalType)   Mode  Cnt     Score     Error   Units
>> RequireNonNullCheckcastScalability.isDuplicated1            false                           false  thrpt   20  1155.409 ±  43.844  ops/us
>> RequireNonNullCheckcastScalability.isDuplicated1            false                            true  thrpt   20   726.923 ±  54.536  ops/us
>> RequireNonNullCheckcastScalability.isDuplicated1             true                           false  thrpt   20   676.462 ±  23.503  ops/us
>> RequireNonNullCheckcastScalability.isDuplicated1             true                            true  thrpt   20   118.650 ±   2.653  ops/us
>> 
>> 
>> Without Patch:
>> 
>> Benchmark                                         (typePollution)  (typePollutionNotInternalType)   Mode  Cnt     Score     Error   Units
>> RequireNonNullCheckcastScalability.isDuplicated1            false                           false  thrpt   20  1101.248 ± 103.559  ops/us
>> RequireNonNullCheckcastScalability.isDuplicated1            false                            true  thrpt   20   109.690 ±   3.312  ops/us
>> RequireNonNullCheckcastScalability.isDuplicated1             true                           false  thrpt   20   110.790 ±   7.927  ops/us
>> RequireNonNullCheckcastScalability.isDuplicated1             true                            true  thrpt   20   112.244 ±   6.889  ops/us
>> 
>> 
>> Testing : Fastdebug build + tier1 tests
> 
> Amit Kumar has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision:
> 
>  - Merge branch 'master' of https://git.openjdk.org/jdk into subtype_v0
>  - s390 Port

LGTM.
Please ensure proper testing, as Martin requested. Running tests with -XX:TierStopAtLevel=3 (effectively turning off C2) might be a good idea.

-------------

Marked as reviewed by lucy (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/17461#pullrequestreview-1846114386

From duke at openjdk.org Fri Jan 26 16:45:49 2024
From: duke at openjdk.org (Liming Liu)
Date: Fri, 26 Jan 2024 16:45:49 GMT
Subject: Integrated: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly
In-Reply-To:
References:
Message-ID:

On Mon, 18 Sep 2023 07:37:26 GMT, Liming Liu wrote:

> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14).
> 
> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` in the table below, and found that the patch improves the time when the system call is supported and does not hurt when it is not:
> 
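> | Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
> |--------|------|------|------|------|
> | 4.18   | 11.30 | 11.30 | 0.25 | 0.25 |
> | 5.13   | 0.22  | 0.22  | 3.42 | 3.42 |
> | 6.1    | 0.27  | 0.33  | 3.54 | 0.33 |
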
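The approach in the quoted summary (ask the kernel to populate the range with `madvise` and `MADV_POPULATE_WRITE`, and fall back when that is not available) might look roughly like the sketch below. This is not the code from the changeset: `pretouch_range` and `pretouch_demo` are hypothetical names, the fallback is a plain per-page store rather than HotSpot's atomic add of zero (so it is only safe while no other thread writes to the range), and any `madvise` failure is treated as "not supported" for simplicity.

```c++
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// MADV_POPULATE_WRITE exists since Linux 5.14; older libc headers may not
// define it. 23 is the value used by the Linux UAPI headers.
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

// Hypothetical helper, not HotSpot's implementation.
// Pretouches [start, start + size); start must be page-aligned and writable.
// Returns true if the kernel populated the range, false if the pages were
// touched one by one instead.
bool pretouch_range(void* start, size_t size) {
  assert(start != nullptr && size > 0);
  if (madvise(start, size, MADV_POPULATE_WRITE) == 0) {
    return true;
  }
  // Assumption for this sketch: any failure means "advice not usable here".
  const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  volatile char* p = static_cast<volatile char*>(start);
  for (size_t off = 0; off < size; off += page) {
    p[off] = p[off];  // write back the same byte to force the page in
  }
  return false;
}

// Maps an anonymous region, pretouches it, and checks it is still zero-filled.
bool pretouch_demo(size_t size) {
  void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (mem == MAP_FAILED) {
    return false;
  }
  pretouch_range(mem, size);
  const char* bytes = static_cast<const char*>(mem);
  bool all_zero = true;
  for (size_t i = 0; i < size; i++) {
    all_zero = all_zero && (bytes[i] == 0);
  }
  munmap(mem, size);
  return all_zero;
}
```

Calling `pretouch_demo(1 << 20)` exercises whichever path the running kernel supports; in both cases the contents of the mapping are left unchanged.
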
This pull request has now been integrated. Changeset: a65a8952 Author: Liming Liu Committer: Thomas Stuefe URL: https://git.openjdk.org/jdk/commit/a65a89522d2f24b1767e1c74f6689a22ea32ca6a Stats: 214 lines in 10 files changed: 199 ins; 7 del; 8 mod 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly Reviewed-by: jsjolen, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/15781 From aboldtch at openjdk.org Fri Jan 26 19:01:32 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 26 Jan 2024 19:01:32 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v15] In-Reply-To: <290Jv71Rn6o1i8RiWuiZFwdm0hf-PI6rcxCL_Qe8inw=.d04ed751-93de-41cb-81e9-f6ec332560ea@github.com> References: <290Jv71Rn6o1i8RiWuiZFwdm0hf-PI6rcxCL_Qe8inw=.d04ed751-93de-41cb-81e9-f6ec332560ea@github.com> Message-ID: On Fri, 26 Jan 2024 16:35:22 GMT, Daniel D. Daugherty wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 50 commits: >> >> - Update copyright >> - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319797 >> - Add verify calls >> - Assert valid lock stack offset >> - Typos, wording and whitespace >> - Fix miss in is_recursive improvement >> - Added comment about the rational behind full lock stack inflation. May need rewording >> - Add logging when lock stack capacity is exceeded. >> - Remove inaccurate comment >> - Correct nomenclature balanced vs structured. >> - ... 
and 40 more: https://git.openjdk.org/jdk/compare/c313d451...8df7f441 > > src/hotspot/share/runtime/lockStack.inline.hpp line 43: > >> 41: assert(is_aligned(offset, oopSize), "Bad alignment: %u", offset); >> 42: assert((offset <= end_offset()), "lockstack overflow: offset %d end_offset %d", offset, end_offset()); >> 43: assert((offset >= start_offset()), "lockstack underflow: offset %d end_offset %d", offset, start_offset()); > > nit typo: s/end_offset/start_offset/ Good. That is a copy-paste-typo. So two typo fixes coming up. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16606#discussion_r1468035466 From coleenp at openjdk.org Fri Jan 26 20:13:59 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 26 Jan 2024 20:13:59 GMT Subject: RFR: 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files Message-ID: This mechanically replaces NULL with nullptr in hpp/cpp native files in test native code. This didn't attempt to change NULL in comments or strings to just null. If you run into this and it bothers you after this push, you can change it in a smaller patch. I didn't see any when it was scrolling by to make my script more complicated. Ran tier1-4 testing. 
------------- Commit messages: - 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files Changes: https://git.openjdk.org/jdk/pull/17593/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17593&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324681 Stats: 8196 lines in 750 files changed: 0 ins; 7 del; 8189 mod Patch: https://git.openjdk.org/jdk/pull/17593.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17593/head:pull/17593 PR: https://git.openjdk.org/jdk/pull/17593 From coleenp at openjdk.org Fri Jan 26 20:14:47 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 26 Jan 2024 20:14:47 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests Message-ID: If this is sufficient, here's the change for NULL to nullptr, adjusting some obvious strings that had NULL in them maybe not all. Tested with gtest, ------------- Commit messages: - Fix some strings - 8324678: Replace NULL with nullptr in HotSpot gtests Changes: https://git.openjdk.org/jdk/pull/17577/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17577&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324678 Stats: 562 lines in 74 files changed: 0 ins; 0 del; 562 mod Patch: https://git.openjdk.org/jdk/pull/17577.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17577/head:pull/17577 PR: https://git.openjdk.org/jdk/pull/17577 From coleenp at openjdk.org Fri Jan 26 20:26:17 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 26 Jan 2024 20:26:17 GMT Subject: RFR: 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files [v2] In-Reply-To: References: Message-ID: > This mechanically replaces NULL with nullptr in hpp/cpp native files in test native code. This didn't attempt to change NULL in comments or strings to just null. If you run into this and it bothers you after this push, you can change it in a smaller patch. I didn't see any when it was scrolling by to make my script more complicated. 
> > Ran tier1-4 testing. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix the comments to "null" ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17593/files - new: https://git.openjdk.org/jdk/pull/17593/files/079b8931..e15a3a0b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17593&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17593&range=00-01 Stats: 229 lines in 103 files changed: 0 ins; 0 del; 229 mod Patch: https://git.openjdk.org/jdk/pull/17593.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17593/head:pull/17593 PR: https://git.openjdk.org/jdk/pull/17593 From kbarrett at openjdk.org Fri Jan 26 20:48:35 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 26 Jan 2024 20:48:35 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests In-Reply-To: References: Message-ID: On Thu, 25 Jan 2024 21:35:29 GMT, Coleen Phillimore wrote: > If this is sufficient, here's the change for NULL to nullptr, adjusting some obvious strings that had NULL in them maybe not all. > Tested with gtest, Mostly good. A few nullptr -> null, and a couple other nits. test/hotspot/gtest/gc/shared/test_collectedHeap.cpp line 36: > 34: > 35: // Test that nullptr is not in the heap. 
> 36: ASSERT_FALSE(heap->is_in(nullptr)) << "nullptr is unexpectedly in the heap"; s/"nullptr is/"null is/ test/hotspot/gtest/gc/shared/test_oopStorage_parperf.cpp line 155: > 153: class OopStorageParIterPerf::Closure : public OopClosure { > 154: public: > 155: virtual void do_oop(oop* p) { guarantee(*p == nullptr, "expected nullptr"); } s/expected nullptr/expected null/ test/hotspot/gtest/memory/test_guardedMemory.cpp line 33: > 31: > 32: static void guarded_memory_test_check(void* p, size_t sz, void* tag) { > 33: ASSERT_TRUE(p != nullptr) << "nullptr pointer given to check"; s/nullptr pointer/null pointer/ test/hotspot/gtest/nmt/test_nmtpreinit.cpp line 87: > 85: p2 = os_realloc(os_malloc(10), 20); // realloc, growing > 86: p3 = os_realloc(os_malloc(20), 10); // realloc, shrinking > 87: p4 = os_realloc(nullptr, 10); // realloc with nullptr pointer s/nullptr pointer/null pointer/ test/hotspot/gtest/runtime/test_ThreadsListHandle.cpp line 216: > 214: // Test case: after first nested ThreadsListHandle (tlh2) has been destroyed > 215: > 216: // Verify the current thread's hazard ptr is nullptr: s/nullptr/null/ test/hotspot/gtest/runtime/test_ThreadsListHandle.cpp line 400: > 398: // Test case: after double nested ThreadsListHandle (tlh3) has been destroyed > 399: > 400: // Verify the current thread's hazard ptr is nullptr: s/nullptr/null/ test/hotspot/gtest/runtime/test_ThreadsListHandle.cpp line 448: > 446: // Test case: after first nested ThreadsListHandle (tlh2) has been destroyed > 447: > 448: // Verify the current thread's hazard ptr is nullptr: s/nullptr/null/ test/hotspot/gtest/runtime/test_ThreadsListHandle.cpp line 568: > 566: // Test case: after first back-to-back nested ThreadsListHandle (tlh2a) has been destroyed > 567: > 568: // Verify the current thread's hazard ptr is nullptr: s/nullptr/null/ test/hotspot/gtest/runtime/test_ThreadsListHandle.cpp line 646: > 644: // Test case: after second back-to-back nested ThreadsListHandle (tlh2b) has been destroyed 
> 645: > 646: // Verify the current thread's hazard ptr is nullptr: s/nullptr/null/ test/hotspot/gtest/runtime/test_classLoader.cpp line 34: > 32: bool bad_class_name = false; > 33: TempNewSymbol retval = ClassLoader::package_from_class_name(nullptr, &bad_class_name); > 34: ASSERT_TRUE(bad_class_name) << "Function did not set bad_class_name with nullptr class name"; s/nullptr/null/ test/hotspot/gtest/runtime/test_classLoader.cpp line 35: > 33: TempNewSymbol retval = ClassLoader::package_from_class_name(nullptr, &bad_class_name); > 34: ASSERT_TRUE(bad_class_name) << "Function did not set bad_class_name with nullptr class name"; > 35: ASSERT_TRUE(retval == nullptr) << "Wrong package for nullptr class name pointer"; second s/nullptr/null/ test/hotspot/gtest/runtime/test_os_linux.cpp line 203: > 201: HugeTlbfsMemory mr(p, size); > 202: // as the area around req_addr contains already existing mappings, the API should always > 203: // return nullptr (as per contract, it cannot return another address) s/nullptr/null/ test/hotspot/gtest/runtime/test_os_linux_cgroups.cpp line 66: > 64: TestCase at_mount_root = { > 65: "/sys/fs/cgroup", // mount_path > 66: nullptr, // root_path, ignored comment mis-indented test/hotspot/gtest/runtime/test_os_linux_cgroups.cpp line 72: > 70: TestCase sub_path = { > 71: "/sys/fs/cgroup", // mount_path > 72: nullptr, // root_path, ignored comment mis-indented test/hotspot/gtest/testutils.hpp line 57: > 55: // Mimicking the official ASSERT_xx and EXPECT_xx counterparts of the googletest suite. > 56: // (ASSERT|EXPECT)_NOT_NULL: check that the given pointer is not nullptr > 57: // (ASSERT|EXPECT)_NULL: check that the given pointer is nullptr both s/nullptr/null/ ------------- Changes requested by kbarrett (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17577#pullrequestreview-1846506243 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468120668 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468120079 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468122196 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468123394 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468124477 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468124724 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468124981 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468125145 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468125309 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468125813 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468125982 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468126719 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468127182 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468127265 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468128708 From coleenp at openjdk.org Fri Jan 26 21:06:00 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 26 Jan 2024 21:06:00 GMT Subject: RFR: 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files [v3] In-Reply-To: References: Message-ID: <9n3_W-gEDmcSZz8z5V_d-93x1Gy2Zl005gPEepDdIC4=.f7906413-3999-4e0d-acf3-bc7d8cc1d89b@github.com> > This mechanically replaces NULL with nullptr in hpp/cpp native files in test native code. This didn't attempt to change NULL in comments or strings to just null. If you run into this and it bothers you after this push, you can change it in a smaller patch. I didn't see any when it was scrolling by to make my script more complicated. 
> > Ran tier1-4 testing. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix nullptr only contained in strings. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17593/files - new: https://git.openjdk.org/jdk/pull/17593/files/e15a3a0b..33786c7d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17593&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17593&range=01-02 Stats: 19 lines in 3 files changed: 0 ins; 0 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/17593.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17593/head:pull/17593 PR: https://git.openjdk.org/jdk/pull/17593 From coleenp at openjdk.org Fri Jan 26 21:20:42 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 26 Jan 2024 21:20:42 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests In-Reply-To: References: Message-ID: On Fri, 26 Jan 2024 20:41:17 GMT, Kim Barrett wrote: >> If this is sufficient, here's the change for NULL to nullptr, adjusting some obvious strings that had NULL in them maybe not all. >> Tested with gtest, > > test/hotspot/gtest/runtime/test_os_linux.cpp line 203: > >> 201: HugeTlbfsMemory mr(p, size); >> 202: // as the area around req_addr contains already existing mappings, the API should always >> 203: // return nullptr (as per contract, it cannot return another address) > > s/nullptr/null/ This is technically correct. Changing the comments like this seems really pointless. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1468156011 From coleenp at openjdk.org Fri Jan 26 21:28:01 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 26 Jan 2024 21:28:01 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v2] In-Reply-To: References: Message-ID: > If this is sufficient, here's the change for NULL to nullptr, adjusting some obvious strings that had NULL in them maybe not all. 
> Tested with gtest, Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix nullptr in comments and strings to null. @kimbarrett changes. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17577/files - new: https://git.openjdk.org/jdk/pull/17577/files/fefbd3d7..d8e3d92d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17577&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17577&range=00-01 Stats: 15 lines in 8 files changed: 0 ins; 0 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/17577.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17577/head:pull/17577 PR: https://git.openjdk.org/jdk/pull/17577 From kevinw at openjdk.org Fri Jan 26 21:31:36 2024 From: kevinw at openjdk.org (Kevin Walls) Date: Fri, 26 Jan 2024 21:31:36 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned In-Reply-To: References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: On Thu, 25 Jan 2024 22:10:01 GMT, Dean Long wrote: >> JavaThread's _monitor_chunks member is temporary storage used by deoptimization. >> When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. >> >> There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. > > src/hotspot/share/runtime/javaThread.cpp line 1038: > >> 1036: // A ThreadLocalHandshake will mean deopt is complete. 
>> 1037: MonitorChunk* JavaThread::monitor_chunks_safe() const { >> 1038: MonitorChunk* chunks = _monitor_chunks; > > There's still a race here, right? The target thread isn't necessarily the same as the current thread, so it could deoptimize immediately after reading `_monitor_chunks`. I think for correctness it would always need to handshake. But then what's the point, if we always return null? Then this just turns into a strange synchronization mechanism. Right, target thread is commonly not the current thread (hence the occasional problem). At the moment yes what I have here is a strange synchronziation, waiting for a thing to be null, then returning null. It's a step towards removing the callers of monitor_chunks() from the few places that are not deoptimization itself. Only waiting really to do the assert. As you notice the call from rootResolver.cpp looks safe to remove: I see it is called from a safepoint. In this one in javaThread.cpp, yes it could deoptimize and set _monitor_chunks after we read it. If it does that... I'm checking into what can happen... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17566#discussion_r1468164254 From kevinw at openjdk.org Fri Jan 26 21:34:44 2024 From: kevinw at openjdk.org (Kevin Walls) Date: Fri, 26 Jan 2024 21:34:44 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: > JavaThread's _monitor_chunks member is temporary storage used by deoptimization. > When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. 
> > There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: ThreadsListHandle required for Handshake ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17566/files - new: https://git.openjdk.org/jdk/pull/17566/files/34d815ae..00659a8f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17566&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17566&range=00-01 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17566.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17566/head:pull/17566 PR: https://git.openjdk.org/jdk/pull/17566 From dcubed at openjdk.org Fri Jan 26 21:38:45 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 26 Jan 2024 21:38:45 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v14] In-Reply-To: <-ARL6dSZSoFCjrdSJpaqyAQcuoq0Rc9vKbUYkE-XeD4=.0fcd9ea4-0d4b-4b8f-8cae-20d41a8a3867@github.com> References: <-ARL6dSZSoFCjrdSJpaqyAQcuoq0Rc9vKbUYkE-XeD4=.0fcd9ea4-0d4b-4b8f-8cae-20d41a8a3867@github.com> Message-ID: On Fri, 26 Jan 2024 09:22:48 GMT, Axel Boldt-Christmas wrote: >> Implements the x86 port of JDK-8319796. >> >> There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. >> >> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. 
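The ordering described here (look at the lock stack first, and only fall back to the mark word when the recursive case does not apply) can be illustrated with a small single-threaded model. This is a sketch for reading along, not HotSpot code: `LockStackModel`, `Obj`, `Mark` and `Result` are made-up names, the mark word transition is a plain assignment where the real code uses a compare-and-swap, and lock stack capacity and inflation are left out.

```c++
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only; all names are illustrative, not HotSpot's.
enum class Mark { Unlocked, Locked };      // stands in for 0b01 and 0b00
enum class Result { FastPath, SlowPath };  // SlowPath = call into the runtime

struct Obj {
  Mark mark = Mark::Unlocked;
};

struct LockStackModel {
  std::vector<Obj*> entries;  // top of the lock stack is entries.back()

  Result enter(Obj* o) {
    // 1. Lock stack first: a consecutive recursive enter is just a push.
    if (!entries.empty() && entries.back() == o) {
      entries.push_back(o);
      return Result::FastPath;
    }
    // 2. Only then look at the mark word.
    if (o->mark == Mark::Unlocked) {
      o->mark = Mark::Locked;  // a CAS in real code
      entries.push_back(o);
      return Result::FastPath;
    }
    // Locked but not on top (interleaved or contended): runtime would inflate.
    return Result::SlowPath;
  }

  Result exit(Obj* o) {
    const size_t n = entries.size();
    if (n == 0 || entries[n - 1] != o) {
      return Result::SlowPath;  // unstructured exit or inflated monitor
    }
    // Recursive if the two topmost entries are both o.
    const bool recursive = (n >= 2) && (entries[n - 2] == o);
    entries.pop_back();
    if (!recursive) {
      o->mark = Mark::Unlocked;  // last exit releases the fast lock
    }
    return Result::FastPath;
  }
};
```

With two objects `a` and `b`, the sequence enter(a), enter(a), enter(b) stays on the fast path, while a further enter(a) takes the slow path because `a` is locked but no longer on top of the lock stack.
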
>> >> Only if the recursive lightweight [un]lock fails does it look at the mark word. >> >> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. >> >> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. >> >> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. >> >> The x86 C2 port also has some extra oddities. >> >> The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. >> >> The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. >> >> The contended unlock was also moved to the code stub. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 21 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Update variable names in ad files > - Preload markWord unconditionally > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Add more expressive stub continuation names > - Remove outdated anonymous owner fix in stub > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - Remove C2HandleAnonOMOwnerStub definitions on x86. > - Add MFENCE comment > - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 > - ... and 11 more: https://git.openjdk.org/jdk/compare/414e1ba3...4d37c4b7 I did a crawl thru review of the v12 version via webrev. I only have minor comments and some questions. I like how the lightweight locking lock and unlock code is separated so you don't have to deal with the older baggage. Thumbs up. src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp line 142: > 140: #else > 141: // This relies on the implementation of lightweight_unlock knowing that it > 142: // will clobber its thread when using EAX. This use of `EAX` is confusing when earlier in this function `rax` is used. src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 80: > 78: } > 79: > 80: void C2FastUnlockLightweightStub::emit(C2_MacroAssembler& masm) { I like this new stub and how clean and complete it is. I can't wait to see how it fits in with your other changes. src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 96: > 94: } > 95: > 96: { // Restore held monitor and slow path. Perhaps: Restore held monitor count and slow path. src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 128: > 126: // Recheck successor. 
> 127: __ cmpptr(Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(succ)), NULL_WORD); > 128: // Seen a successor after the release -> fence we have handed of the monitor Perhaps: // Observed a successor after the release -> fence we have handed off the monitor src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 131: > 129: __ jccb(Assembler::notEqual, fix_zf_and_unlocked); > 130: > 131: // Try to relock, if it fail the monitor has been handed over nit typo: s/it fail/it fails/ src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 141: > 139: > 140: __ bind(fix_zf_and_unlocked); > 141: __ xorl(rax, rax); Just curious: why use `xorl` here and `xorptr` on L135 above? src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 561: > 559: Metadata* method_data, > 560: bool use_rtm, bool profile_rtm) { > 561: assert(LockingMode != LM_LIGHTWEIGHT, "uses fast_lock_lightweight"); I don't understand the string. Perhaps: "not for lightweight locking" or "lightweight locking should use fast_lock_lightweight". src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 614: > 612: testptr(objReg, objReg); > 613: } else { > 614: assert(LockingMode == LM_LEGACY, ""); Please change the string to "must be". src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 759: > 757: > 758: void C2_MacroAssembler::fast_unlock(Register objReg, Register boxReg, Register tmpReg, bool use_rtm) { > 759: assert(LockingMode != LM_LIGHTWEIGHT, "uses fast_unlock_lightweight"); I don't understand the string. Perhaps: "not for lightweight locking" or "lightweight locking should use fast_unlock_lightweight". src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1027: > 1025: jccb(Assembler::zero, zf_correct); > 1026: stop("Fast Lock ZF != 1"); > 1027: #endif I love this sanity check! 
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1061: > 1059: stub = new (Compile::current()->comp_arena()) C2FastUnlockLightweightStub(obj, mark, reg_rax, thread); > 1060: Compile::current()->output()->add_stub(stub); > 1061: } So what happens if `stub` doesn't get generated? src/hotspot/cpu/x86/interp_masm_x86.cpp line 1315: > 1313: #else > 1314: // This relies on the implementation of lightweight_unlock knowing that it > 1315: // will clobber its thread when using EAX. This use of `EAX` is confusing when earlier in this function `rax` is used. src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9930: > 9928: // obj: the object to be unlocked > 9929: // reg_rax: rax > 9930: // thread: the thread, may be EAX on x86_32 This use of `EAX` is confusing when the register is really `rax` and that's what the code below mentions. src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9978: > 9976: // On x86_32 we may lose the thread. > 9977: get_thread(thread); > 9978: } In the header comment for this function we call it `EAX`. ------------- Marked as reviewed by dcubed (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16607#pullrequestreview-1844845841 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1468160246 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1467977169 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1467962052 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1467970901 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1467971483 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1467975520 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1467980222 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1467981510 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1467990253 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1467999992 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1468142901 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1468160513 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1468161120 PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1468162020 From dcubed at openjdk.org Fri Jan 26 21:38:46 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 26 Jan 2024 21:38:46 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v14] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> <-Na7iedqELkwRQd7vjQsAwbFTwn-xehrbOJusmoHyNo=.ff8c7fba-19d2-4a22-95d1-b7f1c8b3b8f1@github.com> Message-ID: <-BqDBdqRuHm7KTw5hbTZNELkHSo1e4_R-NL_2UJ50jQ=.aa545d76-f4ad-43b8-83e0-3fc386d1b277@github.com> On Thu, 25 Jan 2024 15:19:23 GMT, Coleen Phillimore wrote: >> Ah, nevermind. Leave it with extra argument for thread, then. 
> > I sort of liked the extra thread parameter so that the callers know that !LP64 needs get_thread() and not the lightweight_{un}lock. It's unfortunately inconsistent and requires an extra register, but nice that you can call it 'thread', at least until you overwrite it. I also like the thread parameter for the same reasons as @coleenp. I think making the nonsense we have to go thru for 32-bit/low-register-count platforms more obvious is better. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1467062134 From dcubed at openjdk.org Fri Jan 26 21:38:48 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 26 Jan 2024 21:38:48 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v12] In-Reply-To: References: <9CkUpwrZZgsMtG9MIM81ajl8weBVWyQR-8vFlYiYrNo=.6a30ad29-acfd-4d76-a4d5-c8ef5e7179c1@github.com> Message-ID: On Thu, 25 Jan 2024 08:44:24 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 77: >> >>> 75: >>> 76: int C2FastUnlockLightweightStub::max_size() const { >>> 77: return 128; >> >> Is this still 128? > > This is just used to preallocate the buffer when emitting stubs. Unused space gets truncated / used by the next stubs emission. (If I recall correctly the buffer is grown with at least 4KB at a time if offset() + next_stub->max_size() > buffer_end.) > > I remember it being somewhere around ~100 bytes depending on ASSERT. So 128 seemed like a good enough number to ensure that the stub could always be emitted. > > But maybe there is value in being more precise so that (assembler) changes which change (grow) the code emission size are captured early. Hmmm... It seems strange and against existing style to have the single value and not one value for `DEBUG` and a smaller value for `NOT_DEBUG`. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1467068245 From kvn at openjdk.org Fri Jan 26 22:15:49 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 26 Jan 2024 22:15:49 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant Message-ID: When we fail to reallocate a scalarized object during deoptimization we unlock all monitors in the affected frames and throw an OOM exception [JDK-6898462](https://bugs.openjdk.org/browse/JDK-6898462). The unlocking was done in the wrong order, starting from the outermost monitor, which causes this assert when we unlock the following nested monitor (the same object): it sees that it was already unlocked. The fix is to start unlocking from the most nested (innermost) monitor. I also noticed that we have the wrong order of frames for re-locking during deoptimization. We should start from the outermost frame. Inside a frame the re-locking order is correct: from the outermost monitor. Added a regression test with deeply nested locks. Ran tier1-5, Xcomp, stress testing.
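The ordering problem described above can be sketched with a toy model. This is only an illustration under stated assumptions, not HotSpot code: `ToyMonitor`, `ToyLockRecord` and the helper functions are hypothetical names, and the model assumes that only the outermost record of a nested lock on the same object really releases the monitor, while the inner (recursive) records expect it to still be held.

```cpp
#include <cassert>
#include <vector>

// Toy model only; these types are hypothetical and are not HotSpot's
// ObjectMonitor or BasicObjectLock.
struct ToyMonitor {
  bool held = false;
};

struct ToyLockRecord {
  ToyMonitor* monitor;
  bool recursive;  // true if the monitor was already held when this record was made
};

// Lock the same monitor `depth` times and return the records in
// acquisition order (outermost first, innermost last).
std::vector<ToyLockRecord> make_nested(ToyMonitor& m, int depth) {
  assert(depth > 0);
  m.held = false;
  std::vector<ToyLockRecord> records;
  for (int i = 0; i < depth; i++) {
    records.push_back(ToyLockRecord{&m, m.held});
    m.held = true;
  }
  return records;
}

// Release one record. Returns false if the monitor is no longer held,
// which stands in for the failed invariant in this model.
bool toy_exit(const ToyLockRecord& r) {
  if (!r.monitor->held) {
    return false;
  }
  if (!r.recursive) {
    r.monitor->held = false;  // only the outermost record really unlocks
  }
  return true;
}

// Walk the records in acquisition order (the problematic order).
bool unlock_outermost_first(const std::vector<ToyLockRecord>& records) {
  for (auto it = records.begin(); it != records.end(); ++it) {
    if (!toy_exit(*it)) return false;
  }
  return true;
}

// Walk the records in reverse acquisition order (the order used by the fix).
bool unlock_innermost_first(const std::vector<ToyLockRecord>& records) {
  for (auto it = records.rbegin(); it != records.rend(); ++it) {
    if (!toy_exit(*it)) return false;
  }
  return true;
}
```

In this model, releasing from the outermost record trips the "still held" check on the next nested record, which loosely mirrors the failed `is_entered` invariant, while releasing from the innermost record succeeds and leaves the monitor unlocked.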
------------- Commit messages: - 8324174: assert(m->is_entered(current)) failed: invariant Changes: https://git.openjdk.org/jdk/pull/17600/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17600&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324174 Stats: 150 lines in 2 files changed: 148 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17600.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17600/head:pull/17600 PR: https://git.openjdk.org/jdk/pull/17600 From avoitylov at openjdk.org Sat Jan 27 07:55:38 2024 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Sat, 27 Jan 2024 07:55:38 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v3] In-Reply-To: <8QPWOBabh_N8EE6mL15EmWC91lc_gqLS3uszsJ4gW4Y=.6fe4fed8-18cc-41ae-bd5c-6a81534d7a4f@github.com> References: <4Bqh63jS6WGdDtL3wqDZBBJkvH0TiY5vgd5mI_CQrIU=.21db910c-b979-4f9b-8749-fc99653cc670@github.com> <8QPWOBabh_N8EE6mL15EmWC91lc_gqLS3uszsJ4gW4Y=.6fe4fed8-18cc-41ae-bd5c-6a81534d7a4f@github.com> Message-ID: On Fri, 26 Jan 2024 15:59:35 GMT, Axel Boldt-Christmas wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Whitespace fix >> >> Co-authored-by: Thomas Schatzl <59967451+tschatzl at users.noreply.github.com> > > Just did an initial read through of the PR. Just added some cleanup suggestions. Also noticed something I thought looked wrong in the ARM32 port. > > I also went through and tried to find the handful of places in the codebase where the term `ICHolder` (or its derivatives) were still used. Put them in a separate branch to not clutter this PR. Would be nice to take this all the way and not have stale comments or naming lurking about. (Also nuked the `DECC` copy-paste-typo) > Comment cleanups: > f1bb02ea472eb314c93d80b830c59bd03e280116 > > All platforms use `data` as a register alias for the `CompileICData*` register in the `ic_check`. But c2i and itable stubs still use `holder`.
Maybe go all the way here? > 5422ed32def491bd1e145959b7f3c49c88cfc50e > > Also for PPC and s390 I think the code is easier to understand if the global inline cache register aliases these platforms have are used. But maybe that is just me. > 39c0a7ede5187cba52d6fcf48c0852213c48c899 > > As for the implementation I could not see anything wrong (except the ARM32 port). But I'll leave it to people with more expertise in this area. I'll check the ARM32 part of @xmas92's comments early next week. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1913061194 From aph at openjdk.org Sat Jan 27 11:48:25 2024 From: aph at openjdk.org (Andrew Haley) Date: Sat, 27 Jan 2024 11:48:25 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: <_rjhqI6cYZAWvwIvX76-ERmCCxL42ij_FnFFbFap30k=.be2f9c27-1a09-4e0a-8f4e-be7ccba6f597@github.com> References: <_rjhqI6cYZAWvwIvX76-ERmCCxL42ij_FnFFbFap30k=.be2f9c27-1a09-4e0a-8f4e-be7ccba6f597@github.com> Message-ID: On Wed, 17 Jan 2024 12:44:00 GMT, Fredrik Bredberg wrote: >> The Java option OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none". >> >> However, some CPUs change the default SpinPause instruction to something better if the user hasn't used the OnSpinWaitInst option. For instance if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs. >> >> This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb". >> >> Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64.
> > When I was browsing the interweb I saw that it's not uncommon to use isb instead of yield while spinning on AArch64. Before jumping on the bandwagon I created a test program to measure how long time it takes to issue a large number of instructions from several threads running in parallel. I tested nop, yield and isb on Apple's M1, M2 and M3 CPUs. The yield instruction doesn't take longer to execute than a nop instruction (in fact it takes less time than nop). However isb always takes significantly longer time to run than nop or yield on all of the above mentioned Apple CPUs. This finding combined with the fact that the JVM > today uses isb as default for Neoverse CPUs, justified the use of isb on Apple's M1-M3 CPUs. > > But I do agree with both @theRealAph and @stooart-mon, isb is not intended for this purpose. It might create a delay that is too long for spinning purposes and applications overall won't necessarily show any benefit from isb vs yield. > > Maybe the most reasonable way forward is to only change the default value of OnSpinWaitInst from "none" to "yield" and NOT change it to "isb" for Apple CPUs. > > After all, that would make us use the "correct" spinning instruction on all AArch64 CPUs (except Neoverse). @fbredber In 8320317 you said "The performance decrease seen on AArch64 based macOS can be fixed by implementing SpinPause() (see: JDK-8321371)." Please, where is the test case? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1913130279 From kevinw at openjdk.org Sat Jan 27 11:57:35 2024 From: kevinw at openjdk.org (Kevin Walls) Date: Sat, 27 Jan 2024 11:57:35 GMT Subject: RFR: 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files [v3] In-Reply-To: <9n3_W-gEDmcSZz8z5V_d-93x1Gy2Zl005gPEepDdIC4=.f7906413-3999-4e0d-acf3-bc7d8cc1d89b@github.com> References: <9n3_W-gEDmcSZz8z5V_d-93x1Gy2Zl005gPEepDdIC4=.f7906413-3999-4e0d-acf3-bc7d8cc1d89b@github.com> Message-ID: <8G5I-7bp7jVg891XRUkSWJ6DMOoOBkhfkpNPNt40Ti0=.9e8391a5-9034-4b49-b0ee-ae3a62f37c32@github.com> On Fri, 26 Jan 2024 21:06:00 GMT, Coleen Phillimore wrote: >> This mechanically replaces NULL with nullptr in hpp/cpp native files in test native code. This didn't attempt to change NULL in comments to say null because nullptr is generally the right thing for the comment to say. It does attempt to change NULL to "null" rather than "nullptr" in strings. Any changes for "nullptr" to "null" in comments can be changed in a future RFE in a smaller patch. I didn't see any when it was scrolling by to make my script more complicated. >> >> Ran tier1-4 testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix nullptr only contained in strings. test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTest/libVThreadTest.cpp line 426: > 424: } > 425: > 426: // #5: Test JVMTI GetLocalObject function with nullptr value_ptr "with null value_ptr" ? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17593#discussion_r1468449036 From kevinw at openjdk.org Sat Jan 27 12:24:34 2024 From: kevinw at openjdk.org (Kevin Walls) Date: Sat, 27 Jan 2024 12:24:34 GMT Subject: RFR: 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files [v3] In-Reply-To: <9n3_W-gEDmcSZz8z5V_d-93x1Gy2Zl005gPEepDdIC4=.f7906413-3999-4e0d-acf3-bc7d8cc1d89b@github.com> References: <9n3_W-gEDmcSZz8z5V_d-93x1Gy2Zl005gPEepDdIC4=.f7906413-3999-4e0d-acf3-bc7d8cc1d89b@github.com> Message-ID: On Fri, 26 Jan 2024 21:06:00 GMT, Coleen Phillimore wrote: >> This mechanically replaces NULL with nullptr in hpp/cpp native files in test native code. This didn't attempt to change NULL in comments to say null because nullptr is generally the right thing for the comment to say. It does attempt to change NULL to "null" rather than "nullptr" in strings. Any changes for "nullptr" to "null" in comments can be changed in a future RFE in a smaller patch. I didn't see any when it was scrolling by to make my script more complicated. >> >> Ran tier1-4 testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix nullptr only contained in strings. That is long. Hypnotic to read. One very trivial nit, use or ignore as required, but all good that I can see. ------------- Marked as reviewed by kevinw (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/17593#pullrequestreview-1847074612 From sroy at openjdk.org Sat Jan 27 17:34:51 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Sat, 27 Jan 2024 17:34:51 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v10] In-Reply-To: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: > J2SE agent does not start and throws an error when it tries to find the shared library ibm_16_am. > After searching for ibm_16_am.so, the JVM agent throws an error as dll_load fails. It fails to identify the shared library ibm_16_am.a shared archive file on AIX. > Hence we are providing a function which will additionally search for a .a file on AIX when the search for the .so file fails. Suchismith Roy has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
The pull request contains one new commit since the last revision: update comment for reveiew ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16604/files - new: https://git.openjdk.org/jdk/pull/16604/files/212f16be..cbad4f9a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=08-09 Stats: 6 lines in 1 file changed: 5 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16604/head:pull/16604 PR: https://git.openjdk.org/jdk/pull/16604 From sroy at openjdk.org Sat Jan 27 17:38:59 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Sat, 27 Jan 2024 17:38:59 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v11] In-Reply-To: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: > J2SE agent does not start and throws an error when it tries to find the shared library ibm_16_am. > After searching for ibm_16_am.so, the JVM agent throws an error as dll_load fails. It fails to identify the shared library ibm_16_am.a shared archive file on AIX. > Hence we are providing a function which will additionally search for a .a file on AIX when the search for the .so file fails.
Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: update comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16604/files - new: https://git.openjdk.org/jdk/pull/16604/files/cbad4f9a..257f5def Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=09-10 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16604/head:pull/16604 PR: https://git.openjdk.org/jdk/pull/16604 From coleenp at openjdk.org Sat Jan 27 18:24:45 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Sat, 27 Jan 2024 18:24:45 GMT Subject: RFR: 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files [v4] In-Reply-To: References: Message-ID: > This mechanically replaces NULL with nullptr in hpp/cpp native files in test native code. This didn't attempt to change NULL in comments to say null because nullptr is generally the right thing for the comment to say. It does attempt to change NULL to "null" rather than "nullptr" in strings. Any changes for "nullptr" to "null" in comments can be changed in a future RFE in a smaller patch. I didn't see any when it was scrolling by to make my script more complicated. > > Ran tier1-4 testing. 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix one nullptr in comments as found by @kevinw ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17593/files - new: https://git.openjdk.org/jdk/pull/17593/files/33786c7d..6eb051ed Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17593&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17593&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17593.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17593/head:pull/17593 PR: https://git.openjdk.org/jdk/pull/17593 From kbarrett at openjdk.org Sat Jan 27 19:00:35 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 27 Jan 2024 19:00:35 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v2] In-Reply-To: References: Message-ID: <97zFHjCjYEftxWh3doPNGF1kHhPekiigR4503Rr6u70=.8243ba33-5146-4815-90da-ef5a4c30c7b3@github.com> On Fri, 26 Jan 2024 21:28:01 GMT, Coleen Phillimore wrote: >> If this is sufficient, here's the change for NULL to nullptr, adjusting some obvious strings that had NULL in them maybe not all. >> Tested with gtest, > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix nullptr in comments and strings to null. @kimbarrett changes. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17577#pullrequestreview-1847182436 From mli at openjdk.org Sun Jan 28 09:29:26 2024 From: mli at openjdk.org (Hamlin Li) Date: Sun, 28 Jan 2024 09:29:26 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v4] In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 09:14:05 GMT, Hamlin Li wrote: >> Hi, >> Can you review this patch to implement SHA-1 intrinsic for riscv? >> Thanks! 
>>
>>
>> ## Test
>>
>> ### Functionality
>>
>> tests under `test/hotspot/jtreg/compiler/intrinsics/sha`
>> tests found via `find test/jdk -iname "*SHA1*.java"`
>>
>> ### Performance
>>
>> tested on `T-HEAD Light Lichee Pi 4A`
>>
>> JMH_PARAMS="-f 1 -wi 10 -i 20" // for every loop of jmh test
>>
>> benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`, more specifically `TESTS="MessageDigests.digest MessageDigests.getAndDigest MessageDigestBench.digest"`
>>
>>
>> // After
>> o.o.b.java.security.MessageDigests.digest            N/A  N/A      SHA-1  64     DEFAULT  avgt  20  1845.446     ± 27.052      ns/op
>> o.o.b.java.security.MessageDigests.digest            N/A  N/A      SHA-1  16384  DEFAULT  avgt  20  181455.350   ± 532.258     ns/op
>> o.o.b.java.security.MessageDigests.getAndDigest      N/A  N/A      SHA-1  64     DEFAULT  avgt  20  2447.674     ± 10.239      ns/op
>> o.o.b.java.security.MessageDigests.getAndDigest      N/A  N/A      SHA-1  16384  DEFAULT  avgt  20  182896.083   ± 1242.774    ns/op
>> o.o.b.javax.crypto.small.MessageDigestBench.digest   SHA1 1048576  N/A    N/A             avgt  20  11599227.792 ± 121442.390  ns/op
>> // Before
>> o.o.b.java.security.MessageDigests.digest            N/A  N/A      SHA-1  64     DEFAULT  avgt  20  2352.475     ± 11.198      ns/op
>> o.o.b.java.security.MessageDigests.digest            N/A  N/A      SHA-1  16384  DEFAULT  avgt  20  188495.684   ± 1467.942    ns/op
>> o.o.b.java.security.MessageDigests.getAndDigest      N/A  N/A      SHA-1  64     DEFAULT  avgt  20  2437.347     ± 6.398       ns/op
>> o.o.b.java.security.MessageDigests.getAndDigest      N/A  N/A      SHA-1  16384  DEFAULT  avgt  20  196086.570   ± 1140.998    ns/op
>> o.o.b.javax.crypto.small.MessageDigestBench.digest   SHA1 1048576  N/A    N/A             avgt  20  12362160.119 ± 38788.109   ns/op
>>
>>
>> **getAndDigest when size == 64**
>> The data is not stable for test getAndDigest when size == 64, which I think is introduced by j.s.MessageDigest.getInstance itself, which we don't touch in this patch.
>> Check more details at [1](ht...
> > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: > > - remove tp/gp > - refine code Hey, Can I get more reviews? Thanks ------------- PR Comment: https://git.openjdk.org/jdk/pull/17130#issuecomment-1913530720 From kbarrett at openjdk.org Sun Jan 28 20:28:26 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 28 Jan 2024 20:28:26 GMT Subject: RFR: 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files [v4] In-Reply-To: References: Message-ID: On Sat, 27 Jan 2024 18:24:45 GMT, Coleen Phillimore wrote: >> This mechanically replaces NULL with nullptr in hpp/cpp native files in test native code. This didn't attempt to change NULL in comments to say null because nullptr is generally the right thing for the comment to say. It does attempt to change NULL to "null" rather than "nullptr" in strings. Any changes for "nullptr" to "null" in comments can be changed in a future RFE in a smaller patch. I didn't see any when it was scrolling by to make my script more complicated. >> >> Ran tier1-4 testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix one nullptr in comments as found by @kevinw Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17593#pullrequestreview-1847625136 From duke at openjdk.org Mon Jan 29 02:14:34 2024 From: duke at openjdk.org (kuaiwei) Date: Mon, 29 Jan 2024 02:14:34 GMT Subject: RFR: 8324186: Use "dmb.ishst+dmb.ishld" for release barrier [v4] In-Reply-To: References: Message-ID: On Tue, 23 Jan 2024 09:47:20 GMT, Andrew Haley wrote: >>> > I wonder if this was tested on other vendors' hardware? I witnessed some negative impact at least on HiSilicon TSV110 running the same JMH. So I guess it might be safer to go as a vendor-specific change. 
>>> >>> I tried a number of different machines and saw regressions only on Kunpeng-920 (same CPU?) and A57 which is quite niche at this point. >> >> @nick-arm : Thanks for trying it out. Yeah, TSV110 is the core micro-arch name for Kunpeng-920. @theRealAph : I don't have access to the details of TSV110 any more, I guess it's not easy to figure out what's going on :-( > >> > I wonder if this was tested on other vendors' hardware? I witnessed some negative impact at least on HiSilicon TSV110 running the same JMH. So I guess it might be safer to go as a vendor-specific change. >> >> I tried a number of different machines and saw regressions only on Kunpeng-920 (same CPU?) and A57 which is quite niche at this point. > > Right, so it's probably a low-end, mostly-in-order thing. That makes sense because we're trading a weaker barrier for more instructions, and perhaps some cores implement barriers in a crude one-size-fits-all way. @theRealAph @RealFYang Could you help to sponsor it? Thanks ------------- PR Comment: https://git.openjdk.org/jdk/pull/17511#issuecomment-1913845465 From duke at openjdk.org Mon Jan 29 02:28:34 2024 From: duke at openjdk.org (Yude Lin) Date: Mon, 29 Jan 2024 02:28:34 GMT Subject: RFR: 8323273: AArch64: Strengthen CompressedClassPointers initialization check for base In-Reply-To: References: Message-ID: <7dAbar5X0NOWm-eDJPMsJH04RA0pPHf0NrjeyuMtc1Q=.5183e2d0-48c3-4f74-93aa-65f2f730fa6f@github.com> On Tue, 16 Jan 2024 02:41:40 GMT, Yude Lin wrote: > Summary: > Add a platform-dependent check for CompressedClassSpaceBaseAddress; > Remove the "reserve anywhere" attempt after the initial mapping attempt failed---this is rarely used and will likely fail anyway, because the accepted mapping is very restricted on aarch64; > Additional assertions after initialization. > > Passed hotspot/jtreg/:tier1 on fastdebug Ping. Can I get a review for this change? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17437#issuecomment-1913854760 From dholmes at openjdk.org Mon Jan 29 02:28:37 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 29 Jan 2024 02:28:37 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: On Fri, 26 Jan 2024 21:34:44 GMT, Kevin Walls wrote: >> JavaThread's _monitor_chunks member is temporary storage used by deoptimization. >> When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. >> >> There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > ThreadsListHandle required for Handshake I'm really not grokking this. What is the purpose of `monitor_chunks()`? If it is only non-null during de-opt then any examination of it outside of deopt is racy. Is the issue that once deopt is complete the monitors that would have been found in `monitor_chunks` will now be found elsewhere? If so then where?
------------- PR Review: https://git.openjdk.org/jdk/pull/17566#pullrequestreview-1847735764 From dholmes at openjdk.org Mon Jan 29 04:28:25 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 29 Jan 2024 04:28:25 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant In-Reply-To: References: Message-ID: On Fri, 26 Jan 2024 22:09:27 GMT, Vladimir Kozlov wrote: > The unlocking was done in incorrect order starting from outermost monitor which cause this assert when we unlock following nested monitor (the same object) - it sees that it was already unlocked. Just to clarify my understanding here, when we iterate the list of monitors, the `BasicObjectLock` is different depending on whether the object is initially locked or recursively locked - is that the case? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17600#issuecomment-1913938731 From fyang at openjdk.org Mon Jan 29 04:32:37 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 29 Jan 2024 04:32:37 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: <81AS__0K8p3smBkoctLWNUDZ4UHm6gLKdl8SLXp5RCM=.4ad183af-6754-4bb3-92e0-022314f055f3@github.com> On Wed, 24 Jan 2024 18:29:22 GMT, Erik Österlund wrote: >> ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. >> >> The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData.
This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. >> >> With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. >> >> I have tested the changes from tier1-7, and run through full aurora performance tests. > > Thank you Thomas for having a look. > > I think that fixing the issue in G1 alone takes us pretty far. If you really care about latency in your non-small app that unloads tens of thousands of classes at a time, then it seems just plain weird to sit there with Serial or Parallel complaining about latencies. > I believe this was the last major piece in task we started over 5y ago of removing runtime safepoint and latencies. Or as some might say, we finally have a runtime good enough to run ZGC ;) (and Shenandoah). Thank you @fisk for completing this milestone! > > Risc-v passes my testing. (vf2 (t1) + qemu (t1-2), ran twice, once on v3 branch and once this pr) > > @RealFYang can you please review RV changes? @robehn @fisk : I went through the RV changes and I saw small code cleanup could be done. [rv-extra-cleanup.diff.txt](https://github.com/openjdk/jdk/files/14079614/rv-extra-cleanup.diff.txt) BTW: I also performed tier1-3 and hotspot:tier4 tests on hifive unmatched board, result is good. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1913942094 From dholmes at openjdk.org Mon Jan 29 05:18:37 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 29 Jan 2024 05:18:37 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v2] In-Reply-To: References: Message-ID: On Fri, 26 Jan 2024 21:28:01 GMT, Coleen Phillimore wrote: >> If this is sufficient, here's the change for NULL to nullptr, adjusting some obvious strings that had NULL in them maybe not all. 
>> Tested with gtest, > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix nullptr in comments and strings to null. @kimbarrett changes. Looks good. Part of the cleanup with using `nullptr` is that we should never need to cast `nullptr` to another pointer type. I flagged a few cases below but certainly not all. For text/comments the general rule is that when talking about the general concept of nullness then we can say "null", whereas if referring to an actual code artifact we use "nullptr". Sometimes either will be fine e.g "returns null on error", "returns nullptr on error". Thanks. test/hotspot/gtest/gc/shared/test_bufferNodeAllocator.cpp line 65: > 63: ASSERT_EQ(0u, allocator.free_count()); > 64: nodes[i] = allocator.allocate(); > 65: ASSERT_EQ((BufferNode*)nullptr, nodes[i]->next()); We often find that once we use `nullptr` casts like this are not needed. test/hotspot/gtest/logging/test_log.cpp line 64: > 62: ResourceMark rm; > 63: FILE* fp = os::fopen(TestLogFileName, "r"); > 64: ASSERT_NE((void*)nullptr, fp); Cast should not be needed test/hotspot/gtest/metaspace/metaspaceGtestCommon.hpp line 148: > 146: > 147: ////////////////////////////////////////////////////////// > 148: // Some helpers to avoid typing out those annoying casts for nullptr Casts for nullptr should no longer be needed test/hotspot/gtest/nmt/test_nmt_buffer_overflow_detection.cpp line 112: > 110: > 111: static void test_invalid_block_address() { > 112: // very low, like the result of an overflow or of accessing a nullptr this pointer s/nullptr/null This is referring to nullness as a concept. ------------- Marked as reviewed by dholmes (Reviewer). 
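The point about casts can be illustrated with a small sketch. It does not use gtest itself; `same_value` is a hypothetical helper that only mimics the shape of a templated equality assertion (both argument types deduced independently, then compared with `==`), and `BufferNodeSketch` is an invented stand-in, not the HotSpot class. Under those assumptions, passing `nullptr` directly deduces `std::nullptr_t`, which compares against any pointer type, so the cast adds nothing. With `NULL`, which is commonly an integer constant, the deduced type would typically be an integer type, which is presumably why the casts were written in the first place.

```cpp
#include <cstddef>

// Hypothetical stand-in for a linked node; not the HotSpot BufferNode class.
struct BufferNodeSketch {
  BufferNodeSketch* next = nullptr;
};

// Mimics the shape of a templated equality assertion: each argument type is
// deduced on its own, and the two values are then compared with ==.
template <typename Expected, typename Actual>
bool same_value(const Expected& expected, const Actual& actual) {
  return expected == actual;
}

// Old style: the expected value is cast to the pointer type explicitly.
inline bool check_with_cast(const BufferNodeSketch& n) {
  return same_value((BufferNodeSketch*)nullptr, n.next);
}

// New style: Expected deduces to std::nullptr_t, which is comparable with
// any pointer type, so no cast is required.
inline bool check_without_cast(const BufferNodeSketch& n) {
  return same_value(nullptr, n.next);
}
```

Both helpers give the same answer for any node, which is the sense in which the cast is redundant once `nullptr` is used.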
PR Review: https://git.openjdk.org/jdk/pull/17577#pullrequestreview-1847827312 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1469074814 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1469077391 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1469078466 PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1469083197 From dholmes at openjdk.org Mon Jan 29 05:32:34 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 29 Jan 2024 05:32:34 GMT Subject: RFR: 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files [v4] In-Reply-To: References: Message-ID: On Sat, 27 Jan 2024 18:24:45 GMT, Coleen Phillimore wrote: >> This mechanically replaces NULL with nullptr in hpp/cpp native files in test native code. This didn't attempt to change NULL in comments to say null because nullptr is generally the right thing for the comment to say. It does attempt to change NULL to "null" rather than "nullptr" in strings. Any changes for "nullptr" to "null" in comments can be changed in a future RFE in a smaller patch. I didn't see any when it was scrolling by to make my script more complicated. >> >> Ran tier1-4 testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix one nullptr in comments as found by @kevinw Looks good. Thanks for doing the text changes as well, they are a necessary part of the cleanup. A number of files are missing copyright updates - the ones I spotted all had a single copyright year so maybe your script missed them? My browser found 21 places where nullptr is cast to something else, which should no longer be needed. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17593#pullrequestreview-1847855251 From dholmes at openjdk.org Mon Jan 29 05:45:34 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 29 Jan 2024 05:45:34 GMT Subject: RFR: 8324771: Obsolete RAMFraction related flags In-Reply-To: References: Message-ID: On Fri, 26 Jan 2024 16:18:27 GMT, Albert Mingkun Yang wrote: > Simple obsoleting four related deprecated jvm flags. Code changes look good. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17592#pullrequestreview-1847867460 From vlivanov at openjdk.org Mon Jan 29 06:40:28 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Mon, 29 Jan 2024 06:40:28 GMT Subject: RFR: 8324433: Introduce a way to determine if an expression is evaluated as a constant by the Jit compiler [v7] In-Reply-To: References: Message-ID: On Thu, 25 Jan 2024 14:01:59 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch introduces `JitCompiler::isConstantExpression` which can be used to statically determine whether an expression has been constant-folded by the Jit compiler, leading to more constant-folding opportunities. For example, it can be used in `MemorySessionImpl::checkValidStateRaw` to eliminate the lifetime check on global sessions without imposing additional branches on other non-global sessions. This is similar to `__builtin_constant_p` in GCC and clang. >> >> Please kindly give your opinion as well as your reviews, thanks very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > change expr to val, add examples I share Roland's concerns w.r.t. profiling. If there's any code guarded by `isCompileConstant(value) == true`, the only way to trigger its profiling is by deoptimizing from C2-generated code. 
I added `MHI.isCompileConstant` intrinsic as part of a point fix for a performance problem caused by Java-level code profiling/specialization happening in `java.lang.invoke`. It guards profiling logic which is pruned completely once C2 kicks in. So, absence of profiling is not a problem there. Also, there's a constraint on implementation side: the current implementation supports only parse-time folding. If a value turns into a constant later (either during parsing after the call is encountered or during post-parsing phase), it won't have any effect. So, as it is now (both on API and implementation sides) it's hard to correctly use `isCompileConstant` for more general cases. It would be helpful to see more examples illustrating possible usage scenarios. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17527#issuecomment-1914052884 From aboldtch at openjdk.org Mon Jan 29 06:42:46 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 29 Jan 2024 06:42:46 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v16] In-Reply-To: References: Message-ID: <8QYhYxEf3TYCwrrN430BZuaY4zYhK0sd2x8BTCaAojI=.96c15000-b013-41e6-8a94-5e3e6f95172c@github.com> > Implements the runtime part of JDK-8319796. > The different CPU implementations are/will be created as dependent pull requests. > > This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enter. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two topmost entries on the lock stack to determine what to do in a monitor exit.
>
> A high level overview:
> * Locking is still performed on the mark word
>   * Unlocked (0b01) <=> Locked (0b00)
> * Monitor enter on Obj with mark word Unlocked (0b01) is the same
>   * Transition Obj's mark word Unlocked (0b01) => Locked (0b00)
>   * Push Obj onto the lock stack
>   * Success
> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack
>   * If top entry is Obj
>     * Push Obj on the lock stack
>     * Success
>   * If top entry is not Obj
>     * Inflate and call ObjectMonitor::enter
> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack
>   * If just the top entry is Obj
>     * Transition Obj's mark word Locked (0b00) => Unlocked (0b01)
>     * Pop the entry
>     * Success
>   * If both entries are Obj
>     * Pop the top entry
>     * Success
>   * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit
> * If the monitor has been inflated for object Obj which is owned by the current thread
>   * All corresponding entries for Obj are removed from the lock stack
>   * The monitor's recursion count is set to the number of removed entries - 1
>   * The owner is changed from anonymous to the thread
>   * The regular ObjectMonitor::action is called.
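The enter and exit rules in the overview above can be sketched as a small single-threaded model. This is only an illustration of the decision logic as described in the PR text, not the HotSpot implementation: `ObjSketch`, `LockStackSketch`, `Mark` and `Outcome` are invented names, the mark word is modelled as a plain enum rather than a word updated with an atomic compare-and-swap, and the inflated-monitor path is reduced to a return value.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; none of these names come from HotSpot.
enum class Mark { Unlocked /* 0b01 */, Locked /* 0b00 */ };

struct ObjSketch {
  Mark mark = Mark::Unlocked;
};

enum class Outcome { FastPath, NeedsInflation };

struct LockStackSketch {
  std::vector<ObjSketch*> entries;  // top of the stack is entries.back()

  Outcome enter(ObjSketch* o) {
    if (o->mark == Mark::Unlocked) {
      o->mark = Mark::Locked;       // a real implementation would CAS the mark word
      entries.push_back(o);
      return Outcome::FastPath;
    }
    // Mark word is Locked: only a consecutive recursive enter stays fast.
    if (!entries.empty() && entries.back() == o) {
      entries.push_back(o);
      return Outcome::FastPath;
    }
    return Outcome::NeedsInflation; // would inflate and call ObjectMonitor::enter
  }

  Outcome exit(ObjSketch* o) {
    const std::size_t n = entries.size();
    if (n == 0 || entries[n - 1] != o) {
      return Outcome::NeedsInflation;  // unstructured locking
    }
    if (n >= 2 && entries[n - 2] == o) {
      entries.pop_back();           // recursive exit: the mark word stays Locked
      return Outcome::FastPath;
    }
    o->mark = Mark::Unlocked;       // last entry for o: unlock the mark word
    entries.pop_back();
    return Outcome::FastPath;
  }
};
```

In this model, entering the same object twice in a row pushes two entries and only the second exit unlocks the mark word, while an enter sequence such as A, B, A falls off the fast path on the third enter because the top entry is B.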
Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: Fix assert comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16606/files - new: https://git.openjdk.org/jdk/pull/16606/files/8df7f441..e368ea26 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16606&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16606&range=14-15 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16606.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16606/head:pull/16606 PR: https://git.openjdk.org/jdk/pull/16606 From rehn at openjdk.org Mon Jan 29 07:09:41 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 29 Jan 2024 07:09:41 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 18:29:22 GMT, Erik Österlund wrote: >> ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. >> >> The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. >> >> With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler.
>> >> I have tested the changes from tier1-7, and run through full aurora performance tests. > > Thank you Thomas for having a look. > > I think that fixing the issue in G1 alone takes us pretty far. If you really care about latency in your non-small app that unloads tens of thousands of classes at a time, then it seems just plain weird to sit there with Serial or Parallel complaining about latencies. > > I believe this was the last major piece in task we started over 5y ago of removing runtime safepoint and latencies. Or as some might say, we finally have a runtime good enough to run ZGC ;) (and Shenandoah). Thank you @fisk for completing this milestone! > > Risc-v passes my testing. (vf2 (t1) + qemu (t1-2), ran twice, once on v3 branch and once this pr) > > @RealFYang can you please review RV changes? > > @robehn @fisk : I went through the RV changes and I saw small code cleanup could be done. [rv-extra-cleanup.diff.txt](https://github.com/openjdk/jdk/files/14079614/rv-extra-cleanup.diff.txt) > > BTW: I also performed tier1-3 and hotspot:tier4 tests on hifive unmatched board, result is good. Thank you @RealFYang ! @fisk please apply [rv-extra-cleanup.diff.txt](https://github.com/openjdk/jdk/files/14079614/rv-extra-cleanup.diff.txt) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1914084130 From mbaesken at openjdk.org Mon Jan 29 08:10:35 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 29 Jan 2024 08:10:35 GMT Subject: RFR: 8324771: Obsolete RAMFraction related flags In-Reply-To: References: Message-ID: <4yt8w37hoyGedzBTRhW4XPvC0D6UwZ3mWCbG5Fm_ufs=.47e5ba60-21a7-4cbd-afe1-dbe026ef2156@github.com> On Fri, 26 Jan 2024 16:18:27 GMT, Albert Mingkun Yang wrote: > Simple obsoleting four related deprecated jvm flags. Please check the COPYRIGHT years (e.g. src/hotspot/share/gc/z/zArguments.cpp ). Otherwise looks good. ------------- Marked as reviewed by mbaesken (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17592#pullrequestreview-1848039405 From fyang at openjdk.org Mon Jan 29 08:10:36 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 29 Jan 2024 08:10:36 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v4] In-Reply-To: References: Message-ID: <_KHcFGW7NvUC6HM6YCi0D5e-n9t11OZk4GBKCiIncQ4=.83dbe8c8-09ca-4919-81f0-6b0ca066075e@github.com> On Sun, 28 Jan 2024 09:26:18 GMT, Hamlin Li wrote: > Hey, Can I get more reviews? Thanks Hi, will take another look. Can you merge with master? I see bot added merge-conflicts label. Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17130#issuecomment-1914164125 From jbechberger at openjdk.org Mon Jan 29 08:36:40 2024 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 29 Jan 2024 08:36:40 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v2] In-Reply-To: References: Message-ID: <5YcRhksL3oM7QJVb2kCKU29T6VVN3VVghhQeDfUOfUE=.8f35a44e-7b62-4b4b-bee8-c946dadd26fd@github.com> On Fri, 26 Jan 2024 14:43:50 GMT, Matthias Baesken wrote: >> Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. >> >> Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. >> PhysicalMemory could be enhanced or a new event added. >> >> There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and >> Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > remove comment Marked as reviewed by jbechberger (Committer). src/hotspot/share/jfr/metadata/metadata.xml line 931: > 929: > 930: > 931: "to the OS", sorry for the nitpicking. The PR looks good otherwise. 
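For readers unfamiliar with how total and free swap space can be obtained on Linux, here is a minimal sketch using the `sysinfo(2)` system call. It is an assumption that this resembles what the PR does on Linux; `SwapInfoSketch` and `read_swap_info` are hypothetical names, and the `-1` failure value merely follows the convention visible in the AIX snippets quoted elsewhere in this thread. It also ignores container limits, which the thread raises as an open question.

```cpp
#include <sys/sysinfo.h>
#include <cstdint>

// Hypothetical result type: sizes in bytes, or -1 when the query fails.
struct SwapInfoSketch {
  std::int64_t total;
  std::int64_t free;
};

inline SwapInfoSketch read_swap_info() {
  struct sysinfo si;
  if (sysinfo(&si) != 0) {
    return SwapInfoSketch{-1, -1};
  }
  // totalswap and freeswap are expressed in multiples of mem_unit bytes.
  const std::int64_t unit = static_cast<std::int64_t>(si.mem_unit);
  return SwapInfoSketch{static_cast<std::int64_t>(si.totalswap) * unit,
                        static_cast<std::int64_t>(si.freeswap) * unit};
}
```

On a host without swap configured both values would be expected to be zero rather than an error.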
------------- PR Review: https://git.openjdk.org/jdk/pull/17581#pullrequestreview-1848086354 PR Review Comment: https://git.openjdk.org/jdk/pull/17581#discussion_r1469234741 From mbaesken at openjdk.org Mon Jan 29 08:41:29 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 29 Jan 2024 08:41:29 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v2] In-Reply-To: References: Message-ID: On Fri, 26 Jan 2024 14:43:50 GMT, Matthias Baesken wrote: >> Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. >> >> Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. >> PhysicalMemory could be enhanced or a new event added. >> >> There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and >> Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > remove comment Thanks for the review ! One remaining question is (os_linux) "do we need to handle container envs separately ?" Will maybe ask Severin gehwolf about his opinion. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17581#issuecomment-1914209919 From mbaesken at openjdk.org Mon Jan 29 09:00:51 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 29 Jan 2024 09:00:51 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v3] In-Reply-To: References: Message-ID: > Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. > > Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. > PhysicalMemory could be enhanced or a new event added. 
> > There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and > Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Adjust description, fix some indentation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17581/files - new: https://git.openjdk.org/jdk/pull/17581/files/824f3bc5..a9114b1a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=01-02 Stats: 19 lines in 3 files changed: 4 ins; 6 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/17581.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17581/head:pull/17581 PR: https://git.openjdk.org/jdk/pull/17581 From lucy at openjdk.org Mon Jan 29 09:30:40 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Mon, 29 Jan 2024 09:30:40 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v2] In-Reply-To: References: Message-ID: <5uwBybPsGcTZi5GHG4Bu357AqGAn89-tzlRF3o8gW78=.957cde46-e3b2-4e9a-bd2e-0b6793c7326d@github.com> On Fri, 26 Jan 2024 14:43:50 GMT, Matthias Baesken wrote: >> Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. >> >> Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. >> PhysicalMemory could be enhanced or a new event added. >> >> There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and >> Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > remove comment Please use globally defined constants. 
Looks good otherwise. src/hotspot/os/aix/os_aix.cpp line 281: > 279: return -1; > 280: } > 281: return (jlong)(memory_info.pgsp_total * 4L * 1024L); please use 4 * K instead of 4L * 1024L. K is declared as size_t in globalDefinitions.hpp. src/hotspot/os/aix/os_aix.cpp line 289: > 287: return -1; > 288: } > 289: return (jlong)(memory_info.pgsp_free * 4L * 1024L); Same as above. ------------- Changes requested by lucy (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17581#pullrequestreview-1848181220 PR Review Comment: https://git.openjdk.org/jdk/pull/17581#discussion_r1469292777 PR Review Comment: https://git.openjdk.org/jdk/pull/17581#discussion_r1469291730 From duke at openjdk.org Mon Jan 29 09:36:48 2024 From: duke at openjdk.org (kuaiwei) Date: Mon, 29 Jan 2024 09:36:48 GMT Subject: Integrated: 8324186: Use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 22 Jan 2024 01:58:32 GMT, kuaiwei wrote: > Details are at https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2024-January/071921.html. > Using a combined dmb.ish for the release barrier will introduce a heavy storeload barrier. Using the "dmb.ishst+dmb.ishld" pair instead, we can gain a performance improvement on the N1 and N2 architectures. The benchmark is test/micro/org/openjdk/bench/vm/compiler/FinalFieldInitialize.java > Run with ParallelGC to minimize the impact of GC barriers. > > make test TEST="micro:org.openjdk.bench.vm.compiler.FinalFieldInitialize" MICRO="VM_OPTIONS=-XX:+UseParallelGC" > ... > FinalFieldInitialize.testAllocWithFinal thrpt 9 1411.601 ± 6.546 ops/s > > Without the patch > > FinalFieldInitialize.testAllocWithFinal thrpt 9 1214.575 ± 14.217 ops/s This pull request has now been integrated.
Changeset: 628348d3 Author: Kuai Wei Committer: Andrew Haley URL: https://git.openjdk.org/jdk/commit/628348d3e97b669ab4136b1749b8fccf373eb2a0 Stats: 108 lines in 5 files changed: 98 ins; 0 del; 10 mod 8324186: Use "dmb.ishst+dmb.ishld" for release barrier Reviewed-by: fyang, aph ------------- PR: https://git.openjdk.org/jdk/pull/17511 From tschatzl at openjdk.org Mon Jan 29 09:41:40 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 29 Jan 2024 09:41:40 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints In-Reply-To: <9GolX3m7SkG4Fs0KTN5qMRxVK47eAhPLdmlaO3oGSKc=.736c3053-abc5-45ce-bd54-85c4f70c3fc9@github.com> References: <9GolX3m7SkG4Fs0KTN5qMRxVK47eAhPLdmlaO3oGSKc=.736c3053-abc5-45ce-bd54-85c4f70c3fc9@github.com> Message-ID: On Thu, 25 Jan 2024 06:43:59 GMT, Martin Doerr wrote: > On linux, the time for "Purge Unlinked NMethods" goes down when I comment out delete ic->data(); and ignore the memory leak. (MacOS seems to be ok with it.) >Adding trace code to purge_ic_callsites shows that we often have 0 or 2 ICData instances, sometimes up to 30 ones. >It would be good to think a bit about the allocation scheme. Some ideas would be > Allocate ICData in an array per nmethod instead of individually. That should help to some degree and also improve data locality (and hence cache efficiency). Would also save iterating over the relocations when purging unlinked NMethods. It's not very complex. > Instead of freeing ICData instances, we could enqueue them and either reuse or free them during a concurrent phase. This may be a bit complicated. Not sure if it's worth it. > Allocate in Metaspace? Sorry for being unresponsive for a bit. Yes, the issue is the new `delete ic->data()`; but also the iteration over the relocinfo here is almost as expensive in my tests. So the idea to allocate ICData in a per nmethod basis (and actually some other existing C heap allocations that are also `delete`d in this phase) seems the most promising to me. 
Also came up with the other suggestions, but I think that first one seems best to me at first glance. I did not really like the second because enqueuing adds another indirection for first gathering all of them and then separately free them. Metaspace is something I do not know that well to comment on that option. I am open to moving this improvement, if it is not easy to do, into a separate CR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1914310490 From shade at openjdk.org Mon Jan 29 09:42:52 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 29 Jan 2024 09:42:52 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates [v4] In-Reply-To: References: Message-ID: > We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. > > There are a few interesting conversions along the way: > 1. `intptr_t` -> `uint32_t` (this method) > 2. `uint32_t` -> `int32_t` (argument conversion for `movl`) > 3. `int32_t` -> `uint32_t` (in `emit_int32`) > > I believe these are safe after `is_uimm32` check, but please check (sic) me on this. > > Note that x86_64 matcher already does similar thing for immediates: > > > // Long Immediate 32-bit unsigned > operand immUL32() > %{ > predicate(n->get_long() == (unsigned int) (n->get_long())); > match(ConL); > ... > %} > > instruct loadConUL32(rRegL dst, immUL32 src) > %{ > ... 
> format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} > ins_encode %{ > __ movl($dst$$Register, $src$$constant); > %} > %} > > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` > > Code sizes for `Hello World`, `-Xcomp`: > > > # Before > tier1 nmethod code size : 426208 bytes > tier2 nmethod code size : 462880 bytes > tier3 nmethod code size : 889992 bytes > tier4 nmethod code size : 1244448 bytes > > # After > tier1 nmethod code size : 425768 bytes (-0.1%) > tier2 nmethod code size : 462400 bytes (-0.1%) > tier3 nmethod code size : 882072 bytes (-0.8%) > tier4 nmethod code size : 1236448 bytes (-0.6%) Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Merge branch 'master' into JDK-8323503-x86-movptr-unsigned - Revert "Just do checked_cast" This reverts commit 3f94218b46b6b0492ffcc24404b7bb5546b3318a. 
- Just do checked_cast - Fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17343/files - new: https://git.openjdk.org/jdk/pull/17343/files/f50510ce..7d5dbcc2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17343&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17343&range=02-03 Stats: 34942 lines in 1096 files changed: 19696 ins; 11200 del; 4046 mod Patch: https://git.openjdk.org/jdk/pull/17343.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17343/head:pull/17343 PR: https://git.openjdk.org/jdk/pull/17343 From aph at openjdk.org Mon Jan 29 09:45:37 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 29 Jan 2024 09:45:37 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 22:30:38 GMT, Jiangli Zhou wrote: > Maybe we could live with symbol redefinition using #define (conditionally for static linking in OpenJDK, as Coleen suggested earlier) for now, until the tooling can support symbol localizing better. Then localizing symbols using tools like `objcopy` can be the longer term and cleaner solution, instead of using namespace. What's your thoughts on that? I suppose so, but why? Why should any of this have to work on old systems? If their binutils is broken, static linking of openjdk won't work there. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1914316881 From jkern at openjdk.org Mon Jan 29 09:51:40 2024 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 29 Jan 2024 09:51:40 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v11] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Sat, 27 Jan 2024 17:38:59 GMT, Suchismith Roy wrote: >> J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. 
>> After searching for ibm_16_am.so ,the jvm agent throws and error as dll_load fails.It fails to identify the shared library ibm_16_am.a shared archive file on AIX. >> Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > update comment src/hotspot/os/aix/os_aix.cpp line 1166: > 1164: Search order: > 1165: libfilename-> load "libfilename.so" first,then load libfilename.a,on failure. > 1166: In,OpenJ9,the libary with .so extension is loaded first and then .a extension,on failure. Hi Suchi, I'm puzzled. Your comment implies for me, that load library gets a 'base' filename without 'lib' prefix and without extension (e.g. 'name'). Then the j9 code creates the filename 'libname.so' first and on failure 'libname.a' second. What about given libname.so explicitly (e.g. libname.so)? Does j9 really use 'libname.a' as a failure fallback in this case? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16604#discussion_r1469331769 From fbredberg at openjdk.org Mon Jan 29 10:04:36 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Mon, 29 Jan 2024 10:04:36 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: <_rjhqI6cYZAWvwIvX76-ERmCCxL42ij_FnFFbFap30k=.be2f9c27-1a09-4e0a-8f4e-be7ccba6f597@github.com> Message-ID: <7_9a9sbBPkOB5WI524S8boJqrRRKeVTskICLT_USdz8=.4f80a378-bdad-477d-b8ac-fc7a42529dd0@github.com> On Sat, 27 Jan 2024 11:45:43 GMT, Andrew Haley wrote: >> When I was browsing the interweb I saw that it's not uncommon to use isb instead of yield while spinning on AArch64. Before jumping on the bandwagon I created a test program to measure how long time it takes to issue a large number of instructions from several threads running in parallel. I tested nop, yield and isb on Apple's M1, M2 and M3 CPUs. 
The yield instruction doesn't take longer to execute than a nop instruction (in fact it takes less time than nop). However isb always takes significantly longer time to run than nop or yield on all of the above mentioned Apple CPUs. This finding combined with the fact that the JVM >> today uses isb as default for Neoverse CPUs, justified the use of isb on Apple's M1-M3 CPUs. >> >> But I do agree with both @theRealAph and @stooart-mon, isb is not intended for this purpose. It might create a delay that is too long for spinning purposes and applications overall won't necessarily show any benefit from isb vs yield. >> >> Maybe the most reasonable way forward is to only change the default value of OnSpinWaitInst from "none" to "yield" and NOT change it to "isb" for Apple CPUs. >> >> After all, that would make us use the "correct" spinning instruction on all AArch64 CPUs (except Neoverse). > > @fbredber In https://bugs.openjdk.org/browse/JDK-8320317 you said "The performance decrease seen on AArch64 based macOS can be fixed by implementing SpinPause() (see: JDK-8321371)." > > Please, where is the test case? @theRealAph The DaCapo-h2 test indicated that the regression could be mitigated by implementing SpinPause(). Since there is no consensus about if ISB is a good idea or not, we have decided not to use it as default for Apple silicon and just use YIELD for all AArch64 CPUs. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1914351894 From aph at openjdk.org Mon Jan 29 10:16:37 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 29 Jan 2024 10:16:37 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: <_rjhqI6cYZAWvwIvX76-ERmCCxL42ij_FnFFbFap30k=.be2f9c27-1a09-4e0a-8f4e-be7ccba6f597@github.com> Message-ID: On Sat, 27 Jan 2024 11:45:43 GMT, Andrew Haley wrote: >> When I was browsing the interweb I saw that it's not uncommon to use isb instead of yield while spinning on AArch64. 
Before jumping on the bandwagon I created a test program to measure how long time it takes to issue a large number of instructions from several threads running in parallel. I tested nop, yield and isb on Apple's M1, M2 and M3 CPUs. The yield instruction doesn't take longer to execute than a nop instruction (in fact it takes less time than nop). However isb always takes significantly longer time to run than nop or yield on all of the above mentioned Apple CPUs. This finding combined with the fact that the JVM >> today uses isb as default for Neoverse CPUs, justified the use of isb on Apple's M1-M3 CPUs. >> >> But I do agree with both @theRealAph and @stooart-mon, isb is not intended for this purpose. It might create a delay that is too long for spinning purposes and applications overall won't necessarily show any benefit from isb vs yield. >> >> Maybe the most reasonable way forward is to only change the default value of OnSpinWaitInst from "none" to "yield" and NOT change it to "isb" for Apple CPUs. >> >> After all, that would make us use the "correct" spinning instruction on all AArch64 CPUs (except Neoverse). > > @fbredber In https://bugs.openjdk.org/browse/JDK-8320317 you said "The performance decrease seen on AArch64 based macOS can be fixed by implementing SpinPause() (see: JDK-8321371)." > > Please, where is the test case? > @theRealAph The DaCapo-h2 test indicated that the regression could be mitigated by implementing SpinPause(). > > Since there is no consensus about if ISB is a good idea or not, we have decided not to use it as default for Apple silicon and just use YIELD for all AArch64 CPUs. But there's been no consensus because (as far as I know) no-one has published the test results. With evidence we can discuss, consensus should be achievable. 
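One way to produce the kind of evidence asked for here would be a small, publishable microbenchmark. The following is only a sketch under stated assumptions: `SpinInst`, `spin_hint`, `time_spin_hints`, and the thread and iteration counts are hypothetical names and values, not the test program mentioned earlier in this thread. The inline assembly is emitted only when compiling for AArch64 with a GCC-compatible compiler; on any other target the loop body is empty, so the numbers are meaningful on AArch64 only.

```cpp
#include <chrono>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical selector for the instruction under test (not a HotSpot type).
enum class SpinInst { Nop, Yield, Isb };

// Issues exactly one instance of the selected instruction.
// Inst is a compile-time constant, so the branches fold away.
template <SpinInst Inst>
inline void spin_hint() {
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
  if (Inst == SpinInst::Nop) {
    __asm__ volatile("nop");
  } else if (Inst == SpinInst::Yield) {
    __asm__ volatile("yield");
  } else {
    __asm__ volatile("isb" ::: "memory");
  }
#endif
  // Assumption: on other targets nothing is emitted; the harness still builds.
}

// Returns the wall-clock time in nanoseconds that `threads` threads need to
// issue `iterations` instructions each. Thread start-up cost is included,
// so iteration counts should be large for a real measurement.
template <SpinInst Inst>
std::int64_t time_spin_hints(int threads, std::int64_t iterations) {
  const auto start = std::chrono::steady_clock::now();
  std::vector<std::thread> workers;
  workers.reserve(static_cast<std::size_t>(threads));
  for (int t = 0; t < threads; ++t) {
    workers.emplace_back([iterations] {
      for (std::int64_t i = 0; i < iterations; ++i) {
        spin_hint<Inst>();
      }
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}
```

A caller could compare, for example, `time_spin_hints<SpinInst::Yield>(8, 100000000)` against the `Nop` and `Isb` variants on the same machine and post the raw numbers alongside the CPU model.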
------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1914373658 From jwaters at openjdk.org Mon Jan 29 11:02:25 2024 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 29 Jan 2024 11:02:25 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v2] In-Reply-To: References: Message-ID: <8VU25vMset1F5Sb3DjBkAGp6n-H9P6wawvfF3PyPjjI=.6a01a9d6-eaea-498f-bc2b-093e090a6859@github.com> On Fri, 26 Jan 2024 21:28:01 GMT, Coleen Phillimore wrote: >> If this is sufficient, here's the change for NULL to nullptr, adjusting some obvious strings that had NULL in them maybe not all. >> Tested with gtest, > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix nullptr in comments and strings to null. @kimbarrett changes. Marked as reviewed by jwaters (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17577#pullrequestreview-1848398138 From mli at openjdk.org Mon Jan 29 11:39:01 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 29 Jan 2024 11:39:01 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v5] In-Reply-To: References: Message-ID:
> Hi,
> Can you review this patch to implement SHA-1 intrinsic for riscv?
> Thanks!
>
> ## Test
>
> ### Functionality
>
> tests under `test/hotspot/jtreg/compiler/intrinsics/sha`
> tests found via `find test/jdk -iname "*SHA1*.java"`
>
> ### Performance
>
> tested on `T-HEAD Light Lichee Pi 4A`
>
> JMH_PARAMS="-f 1 -wi 10 -i 20" // for every loop of jmh test
>
> benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`, more specifically `TESTS="MessageDigests.digest MessageDigests.getAndDigest MessageDigestBench.digest"`
>
> // After
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 20 1845.446 ± 27.052 ns/op
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 20 181455.350 ± 532.258 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 20 2447.674 ± 10.239 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 20 182896.083 ± 1242.774 ns/op
> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 20 11599227.792 ± 121442.390 ns/op
> // Before
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 20 2352.475 ± 11.198 ns/op
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 20 188495.684 ± 1467.942 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 20 2437.347 ± 6.398 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 20 196086.570 ± 1140.998 ns/op
> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 20 12362160.119 ± 38788.109 ns/op
>
> **getAndDigest when size == 64**
> The data is not stable for test getAndDigest when size == 64, which I think is introduced by j.s.MessageDigest.getInstance itself, which we don't touch in this patch.
> Check more details at [1](https://github.com/openjdk/jdk/pull/17130#issuecomment-1886805614)
>
> loop ... ...

Hamlin Li has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains six commits:
- merge master; refactor UseSHAxxx
- remove tp/gp
- refine code
- round 1 review
- Add some comments
- Initial commit

------------- Changes: https://git.openjdk.org/jdk/pull/17130/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17130&range=04 Stats: 430 lines in 5 files changed: 400 ins; 16 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/17130.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17130/head:pull/17130 PR: https://git.openjdk.org/jdk/pull/17130 From mli at openjdk.org Mon Jan 29 11:46:56 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 29 Jan 2024 11:46:56 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v6] In-Reply-To: References: Message-ID:
> Hi,
> Can you review this patch to implement SHA-1 intrinsic for riscv?
> Thanks!
>
> ## Test
>
> ### Functionality
>
> tests under `test/hotspot/jtreg/compiler/intrinsics/sha`
> tests found via `find test/jdk -iname "*SHA1*.java"`
>
> ### Performance
>
> tested on `T-HEAD Light Lichee Pi 4A`
>
> JMH_PARAMS="-f 1 -wi 10 -i 20" // for every loop of jmh test
>
> benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`, more specifically `TESTS="MessageDigests.digest MessageDigests.getAndDigest MessageDigestBench.digest"`
>
> // After
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 20 1845.446 ± 27.052 ns/op
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 20 181455.350 ± 532.258 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 20 2447.674 ± 10.239 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 20 182896.083 ± 1242.774 ns/op
> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 20 11599227.792 ± 121442.390 ns/op
> // Before
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 20 2352.475 ± 11.198 ns/op
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 20 188495.684 ± 1467.942 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 20 2437.347 ± 6.398 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 20 196086.570 ± 1140.998 ns/op
> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 20 12362160.119 ± 38788.109 ns/op
>
> **getAndDigest when size == 64**
> The data is not stable for test getAndDigest when size == 64, which I think is introduced by j.s.MessageDigest.getInstance itself, which we don't touch in this patch.
> Check more details at [1](https://github.com/openjdk/jdk/pull/17130#issuecomment-1886805614)
>
> loop ... ...

Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: Delete src/.vscode/settings.json ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17130/files - new: https://git.openjdk.org/jdk/pull/17130/files/7de9b6af..2c79ea08 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17130&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17130&range=04-05 Stats: 7 lines in 1 file changed: 0 ins; 7 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17130.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17130/head:pull/17130 PR: https://git.openjdk.org/jdk/pull/17130 From mli at openjdk.org Mon Jan 29 12:23:02 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 29 Jan 2024 12:23:02 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v7] In-Reply-To: References: Message-ID:
> Hi,
> Can you review this patch to implement SHA-1 intrinsic for riscv?
> Thanks!
>
> ## Test
>
> ### Functionality
>
> tests under `test/hotspot/jtreg/compiler/intrinsics/sha`
> tests found via `find test/jdk -iname "*SHA1*.java"`
>
> ### Performance
>
> tested on `T-HEAD Light Lichee Pi 4A`
>
> JMH_PARAMS="-f 1 -wi 10 -i 20" // for every loop of jmh test
>
> benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`, more specifically `TESTS="MessageDigests.digest MessageDigests.getAndDigest MessageDigestBench.digest"`
>
> // After
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 20 1845.446 ± 27.052 ns/op
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 20 181455.350 ± 532.258 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 20 2447.674 ± 10.239 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 20 182896.083 ± 1242.774 ns/op
> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 20 11599227.792 ± 121442.390 ns/op
> // Before
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 20 2352.475 ± 11.198 ns/op
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 20 188495.684 ± 1467.942 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 20 2437.347 ± 6.398 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 20 196086.570 ± 1140.998 ns/op
> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 20 12362160.119 ± 38788.109 ns/op
>
> **getAndDigest when size == 64**
> The data is not stable for test getAndDigest when size == 64, which I think is introduced by j.s.MessageDigest.getInstance itself, which we don't touch in this patch.
> Check more details at [1](https://github.com/openjdk/jdk/pull/17130#issuecomment-1886805614)
>
> loop ... ...
Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: revert string change ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17130/files - new: https://git.openjdk.org/jdk/pull/17130/files/2c79ea08..7224b497 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17130&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17130&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17130.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17130/head:pull/17130 PR: https://git.openjdk.org/jdk/pull/17130 From mli at openjdk.org Mon Jan 29 12:23:02 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 29 Jan 2024 12:23:02 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v4] In-Reply-To: <_KHcFGW7NvUC6HM6YCi0D5e-n9t11OZk4GBKCiIncQ4=.83dbe8c8-09ca-4919-81f0-6b0ca066075e@github.com> References: <_KHcFGW7NvUC6HM6YCi0D5e-n9t11OZk4GBKCiIncQ4=.83dbe8c8-09ca-4919-81f0-6b0ca066075e@github.com> Message-ID: On Mon, 29 Jan 2024 08:08:03 GMT, Fei Yang wrote: > > Hey, Can I get more reviews? Thanks > > Hi, will take another look. Can you merge with master? I see bot added merge-conflicts label. Thanks. Updated! Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17130#issuecomment-1914581784 From ayang at openjdk.org Mon Jan 29 12:42:46 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 29 Jan 2024 12:42:46 GMT Subject: RFR: 8324771: Obsolete RAMFraction related flags [v2] In-Reply-To: References: Message-ID: > Simple obsoleting four related deprecated jvm flags. 
Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: year ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17592/files - new: https://git.openjdk.org/jdk/pull/17592/files/c4894320..ee96cc7a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17592&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17592&range=00-01 Stats: 6 lines in 6 files changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/17592.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17592/head:pull/17592 PR: https://git.openjdk.org/jdk/pull/17592 From kbarrett at openjdk.org Mon Jan 29 12:49:33 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 29 Jan 2024 12:49:33 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v2] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 05:16:12 GMT, David Holmes wrote: > Part of the cleanup with using `nullptr` is that we should never need to cast `nullptr` to another pointer type. I flagged a few cases below but certainly not all. I think this may eventually be true, but I don't think it is currently true. I think we will need to add overloads for std::nullptr_t in some places in order to remove some casts. This is particularly true for templates, and I think all of the places you mentioned are of this kind. 
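The point about templates can be illustrated with a small sketch. The names below (`same_pointer`, `nullptr_overload_demo`) are hypothetical and are not taken from HotSpot or gtest; the block only shows the general C++ mechanism. When a function template deduces `T` from a `T*` parameter, a bare `nullptr` has type `std::nullptr_t`, which is not a pointer type, so deduction fails unless the call site casts it or an overload taking `std::nullptr_t` exists.

```cpp
#include <cstddef>

// Hypothetical comparison helper: T is deduced from both arguments,
// so each argument must already have some pointer type T*.
template <typename T>
bool same_pointer(T* expected, T* actual) {
  return expected == actual;
}

// Overload for std::nullptr_t: lets callers pass a bare nullptr without a cast.
template <typename T>
bool same_pointer(std::nullptr_t, T* actual) {
  return actual == nullptr;
}

bool nullptr_overload_demo() {
  int value = 42;
  int* non_null = &value;
  int* null_ptr = nullptr;

  // With only the first template, the cast is required:
  // same_pointer(nullptr, null_ptr) could not deduce T from std::nullptr_t.
  const bool with_cast = same_pointer(static_cast<int*>(nullptr), null_ptr);

  // With the std::nullptr_t overload, the cast can be dropped.
  const bool without_cast = same_pointer(nullptr, null_ptr);

  // A non-null pointer does not compare equal to nullptr.
  const bool mismatch = same_pointer(nullptr, non_null);

  return with_cast && without_cast && !mismatch;
}
```

Removing the second template makes the two cast-free calls fail to compile, which is the situation in which a cast on `nullptr` remains necessary.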
------------- PR Comment: https://git.openjdk.org/jdk/pull/17577#issuecomment-1914626715 From stefank at openjdk.org Mon Jan 29 12:52:30 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 29 Jan 2024 12:52:30 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v16] In-Reply-To: <8QYhYxEf3TYCwrrN430BZuaY4zYhK0sd2x8BTCaAojI=.96c15000-b013-41e6-8a94-5e3e6f95172c@github.com> References: <8QYhYxEf3TYCwrrN430BZuaY4zYhK0sd2x8BTCaAojI=.96c15000-b013-41e6-8a94-5e3e6f95172c@github.com> Message-ID: On Mon, 29 Jan 2024 06:42:46 GMT, Axel Boldt-Christmas wrote: >> Implements the runtime part of JDK-8319796. >> The different CPU implementations are/will be created as dependent pull requests. >> >> This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enter. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two top most entires on the lock stack to determine what to do in a monitor exit. 
>>
>> A high level overview:
>> * Locking is still performed on the mark word
>>   * Unlocked (0b01) <=> Locked (0b00)
>> * Monitor enter on Obj with mark word Unlocked (0b01) is the same
>>   * Transition Obj's mark word Unlocked (0b01) => Locked (0b00)
>>   * Push Obj onto the lock stack
>>   * Success
>> * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack
>>   * If top entry is Obj
>>     * Push Obj on the lock stack
>>     * Success
>>   * If top entry is not Obj
>>     * Inflate and call ObjectMonitor::enter
>> * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack
>>   * If just the top entry is Obj
>>     * Transition Obj's mark word Locked (0b00) => Unlocked (0b01)
>>     * Pop the entry
>>     * Success
>>   * If both entries are Obj
>>     * Pop the top entry
>>     * Success
>>   * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit
>> * If the monitor has been inflated for object Obj which is owned by the current thread
>>   * All corresponding entries for Obj are removed from the lock stack
>>   * The monitor recursions is set to the number of removed entries - 1
>>   * The owner is changed from anonymous to the thread
>>   * The regular ObjectMonitor::action is called.
>
> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix assert comment

Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16606#pullrequestreview-1848600707 From mbaesken at openjdk.org Mon Jan 29 13:09:52 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 29 Jan 2024 13:09:52 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v4] In-Reply-To: References: Message-ID: <2GQYwGAz_0A-KIRVevAJuI4bEzbvmjVJ_EXvYf48Xfo=.55ab5ad6-2bdd-41cf-9c85-c9f2a253730f@github.com>
> Total and free swap space should be recorded in JFR, because it is important to know e.g.
in case of memory shortages. > > Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. > PhysicalMemory could be enhanced or a new event added. > > There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and > Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: use 4 * K ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17581/files - new: https://git.openjdk.org/jdk/pull/17581/files/a9114b1a..929207eb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17581.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17581/head:pull/17581 PR: https://git.openjdk.org/jdk/pull/17581 From eosterlund at openjdk.org Mon Jan 29 13:10:50 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 29 Jan 2024 13:10:50 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v4] In-Reply-To: References: Message-ID: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. 
Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests.

Erik Österlund has updated the pull request incrementally with eight additional commits since the last revision:
- Batch allocate and free CompiledICData
- JVMCI support
- Cleanup from FYang
- Axel suggestions
- Suggestion from Axel
- Use relevant global register aliases for clarity
- Rename register aliases: holder -> data
- Platform Comment Cleanup

------------- Changes: - all: https://git.openjdk.org/jdk/pull/17495/files - new: https://git.openjdk.org/jdk/pull/17495/files/140a8a1e..54877772 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17495&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17495&range=02-03 Stats: 174 lines in 25 files changed: 25 ins; 112 del; 37 mod Patch: https://git.openjdk.org/jdk/pull/17495.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17495/head:pull/17495 PR: https://git.openjdk.org/jdk/pull/17495 From eosterlund at openjdk.org Mon Jan 29 13:25:40 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 29 Jan 2024 13:25:40 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v3] In-Reply-To: <8QPWOBabh_N8EE6mL15EmWC91lc_gqLS3uszsJ4gW4Y=.6fe4fed8-18cc-41ae-bd5c-6a81534d7a4f@github.com> References: <4Bqh63jS6WGdDtL3wqDZBBJkvH0TiY5vgd5mI_CQrIU=.21db910c-b979-4f9b-8749-fc99653cc670@github.com>
<8QPWOBabh_N8EE6mL15EmWC91lc_gqLS3uszsJ4gW4Y=.6fe4fed8-18cc-41ae-bd5c-6a81534d7a4f@github.com> Message-ID: <0qUYYua-ni7uWjsMme9Rw4jSI026yAHjOOeFhPFoPZs=.b90139a4-12e9-4118-92e9-c3a982b72b34@github.com> On Fri, 26 Jan 2024 15:59:35 GMT, Axel Boldt-Christmas wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Whitespace fix >> >> Co-authored-by: Thomas Schatzl <59967451+tschatzl at users.noreply.github.com> > > Just did an initial read through of the PR. Just added some cleanup suggestions. Also noticed something I thought looked wrong in the ARM32 port. > > I also went through and tried to find the handful of places in the codebase where the term `ICHolder` (or its derivatives) were still used. Put them in a separate branch to not clutter this PR. Would be nice to take this all the way and not have stale comments or naming lurking about. (Also nuked the `DECC` copy-paste-typo) > Comment cleanups: > f1bb02ea472eb314c93d80b830c59bd03e280116 > > All platforms use `data` as a register alias for the `CompiledICData*` register in the `ic_check`. But c2i and itable stubs still use `holder`. Maybe go all the way here? > 5422ed32def491bd1e145959b7f3c49c88cfc50e > > Also for PPC and s390 I think the code is easier to understand if the global inline cache register aliases these platforms have are used. But maybe that is just me. > 39c0a7ede5187cba52d6fcf48c0852213c48c899 > > As for the implementation I could not see anything wrong (except the ARM32 port). But I'll leave it to people with more expertise in this area. Thanks for the reviews! I applied the cleanups from @xmas92 and @RealFYang. I added the JVMCI hook for @dougxc and started bulk allocating and freeing CompiledICData to deal with the situation reported by @tschatzl. I haven't touched the ARM32 code though - waiting for @voitylov there.
------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1914687744 From mbaesken at openjdk.org Mon Jan 29 13:26:35 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 29 Jan 2024 13:26:35 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v2] In-Reply-To: <5uwBybPsGcTZi5GHG4Bu357AqGAn89-tzlRF3o8gW78=.957cde46-e3b2-4e9a-bd2e-0b6793c7326d@github.com> References: <5uwBybPsGcTZi5GHG4Bu357AqGAn89-tzlRF3o8gW78=.957cde46-e3b2-4e9a-bd2e-0b6793c7326d@github.com> Message-ID: On Mon, 29 Jan 2024 09:22:20 GMT, Lutz Schmidt wrote: >> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: >> >> remove comment
>
> src/hotspot/os/aix/os_aix.cpp line 281:
>
>> 279: return -1;
>> 280: }
>> 281: return (jlong)(memory_info.pgsp_total * 4L * 1024L);
>
> please use 4 * K instead of 4L * 1024L. K is declared as size_t in globalDefinitions.hpp.

Thanks Lucy, I replaced it. I noticed we have similar code in jdk.management/unix/native/libmanagement_ext/OperatingSystemImpl.c. Should I replace this too (probably in a separate change)? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17581#discussion_r1469587227 From dnsimon at openjdk.org Mon Jan 29 13:43:42 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 29 Jan 2024 13:43:42 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v4] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 13:10:50 GMT, Erik Österlund wrote: >> ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. >> >> The new scheme is less stateful.
Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. >> >> With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. >> >> I have tested the changes from tier1-7, and run through full aurora performance tests.
>
> Erik Österlund has updated the pull request incrementally with eight additional commits since the last revision:
>
> - Batch allocate and free CompiledICData
> - JVMCI support
> - Cleanup from FYang
> - Axel suggestions
> - Suggestion from Axel
> - Use relevant global register aliases for clarity
> - Rename register aliases: holder -> data
> - Platform Comment Cleanup

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1140:

> 1138: }
> 1139:
> 1140: void MacroAssembler::align(int modulus, int target) {

It would be nice to document what this extra `align` function does. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1469607284 From coleenp at openjdk.org Mon Jan 29 13:47:10 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 29 Jan 2024 13:47:10 GMT Subject: RFR: 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files [v5] In-Reply-To: References: Message-ID: > This mechanically replaces NULL with nullptr in hpp/cpp native files in test native code. This didn't attempt to change NULL in comments to say null because nullptr is generally the right thing for the comment to say. It does attempt to change NULL to "null" rather than "nullptr" in strings.
Any changes for "nullptr" to "null" in comments can be changed in a future RFE in a smaller patch. I didn't see any when it was scrolling by to make my script more complicated. > > Ran tier1-4 testing. Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: - Fix some casts unnecessary with nullptr - Fix copyrights ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17593/files - new: https://git.openjdk.org/jdk/pull/17593/files/6eb051ed..6ac8aa85 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17593&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17593&range=03-04 Stats: 32 lines in 27 files changed: 0 ins; 0 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/17593.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17593/head:pull/17593 PR: https://git.openjdk.org/jdk/pull/17593 From coleenp at openjdk.org Mon Jan 29 13:50:46 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 29 Jan 2024 13:50:46 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v2] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 04:58:25 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix nullptr in comments and strings to null. @kimbarrett changes. > > test/hotspot/gtest/metaspace/metaspaceGtestCommon.hpp line 148: > >> 146: >> 147: ////////////////////////////////////////////////////////// >> 148: // Some helpers to avoid typing out those annoying casts for nullptr > > Casts for nullptr should no longer be needed Like I said in the description, we can change the comments on a case-by-case basis in order to get the bulk of the change done mechanically. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1469614981 From dnsimon at openjdk.org Mon Jan 29 13:50:50 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 29 Jan 2024 13:50:50 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v4] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 13:10:50 GMT, Erik Österlund wrote: >> ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. >> >> The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. >> >> With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. >> >> I have tested the changes from tier1-7, and run through full aurora performance tests.
>
> Erik Österlund has updated the pull request incrementally with eight additional commits since the last revision:
>
> - Batch allocate and free CompiledICData
> - JVMCI support
> - Cleanup from FYang
> - Axel suggestions
> - Suggestion from Axel
> - Use relevant global register aliases for clarity
> - Rename register aliases: holder -> data
> - Platform Comment Cleanup

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1004:

> 1002:
> 1003: int MacroAssembler::ic_check_size() {
> 1004: return NativeInstruction::instruction_size * 7;

This can be 5 or 7 depending on `MacroAssembler::far_jump`, right? If it's 5, then who inserts the extra alignment at the end of the IC check? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1469616471 From coleenp at openjdk.org Mon Jan 29 13:50:47 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 29 Jan 2024 13:50:47 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v2] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 13:46:49 GMT, Coleen Phillimore wrote:

>> test/hotspot/gtest/metaspace/metaspaceGtestCommon.hpp line 148:
>>
>>> 146:
>>> 147: //////////////////////////////////////////////////////////
>>> 148: // Some helpers to avoid typing out those annoying casts for nullptr
>>
>> Casts for nullptr should no longer be needed
>
> Like I said in the description, we can change the comments on a case-by-case basis in order to get the bulk of the change done mechanically.

i.e., we should file a separate RFE for comments.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1469616729 From coleenp at openjdk.org Mon Jan 29 13:54:35 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 29 Jan 2024 13:54:35 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v2] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 13:48:12 GMT, Coleen Phillimore wrote: >> Like I said in the description, we can change the comments on a case-by-case basis in order to get the bulk of the change done mechanically. > > i.e., we should file a separate RFE for comments. And in this comment nullptr makes perfect sense. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1469623710 From tschatzl at openjdk.org Mon Jan 29 13:57:56 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 29 Jan 2024 13:57:56 GMT Subject: Integrated: 8324840: windows-x64-slowdebug does not build anymore after JDK-8317572 Message-ID:

Hi all,

after [JDK-8317572](https://bugs.openjdk.org/browse/JDK-8317572) windows-x64-slowdebug does not build any more, failing with the following error:

```
[2024-01-29T12:19:47,671Z] Creating support/native/java.desktop/libjsound/static/jsound.lib from 17 file(s)
[2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found
[2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found
[2024-01-29T12:24:59,378Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj' failed
[2024-01-29T12:24:59,378Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj] Error 1
[2024-01-29T12:24:59,378Z] make[3]: *** Waiting for unfinished jobs....
[2024-01-29T12:25:02,568Z] make/Main.gmk:261: recipe for target 'hotspot-server-static-libs' failed
[2024-01-29T12:25:02,568Z] make[2]: *** [hotspot-server-static-libs] Error 2
[2024-01-29T12:25:02,568Z] make[2]: *** Waiting for unfinished jobs....
[2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found
[2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found
[2024-01-29T12:25:03,412Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj' failed
[2024-01-29T12:25:03,412Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj] Error 1
```

This change adds the necessary includes.

Still compiling the change....

Thomas

------------- Commit messages: - fix compilation Changes: https://git.openjdk.org/jdk/pull/17613/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17613&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324840 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17613.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17613/head:pull/17613 PR: https://git.openjdk.org/jdk/pull/17613 From epeter at openjdk.org Mon Jan 29 13:57:56 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 29 Jan 2024 13:57:56 GMT Subject: Integrated: 8324840: windows-x64-slowdebug does not build anymore after JDK-8317572 In-Reply-To: References: Message-ID: <4sLEAwetEkLIqOur6zjFO6zSSNrhbip2-pGFN7UC7Lc=.a9eccc2b-c42a-47e0-9bb3-736bbdac9c1f@github.com> On Mon, 29 Jan 2024 13:12:42 GMT, Thomas Schatzl wrote:

> Hi all,
>
> after [JDK-8317572](https://bugs.openjdk.org/browse/JDK-8317572) windows-x64-slowdebug does not build any more, failing with the following error:
>
> ```
> [2024-01-29T12:19:47,671Z] Creating support/native/java.desktop/libjsound/static/jsound.lib from 17 file(s)
> [2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found
> [2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found
> [2024-01-29T12:24:59,378Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj' failed
> [2024-01-29T12:24:59,378Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj] Error 1
> [2024-01-29T12:24:59,378Z] make[3]: *** Waiting for unfinished jobs....
> [2024-01-29T12:25:02,568Z] make/Main.gmk:261: recipe for target 'hotspot-server-static-libs' failed
> [2024-01-29T12:25:02,568Z] make[2]: *** [hotspot-server-static-libs] Error 2
> [2024-01-29T12:25:02,568Z] make[2]: *** Waiting for unfinished jobs....
> [2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found
> [2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found
> [2024-01-29T12:25:03,412Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj' failed
> [2024-01-29T12:25:03,412Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj] Error 1
> ```
>
> This change adds the necessary includes.
>
> Still compiling the change....
>
> Thomas

Thanks for the fix! ------------- Marked as reviewed by epeter (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/17613#pullrequestreview-1848698677 From tschatzl at openjdk.org Mon Jan 29 13:57:56 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 29 Jan 2024 13:57:56 GMT Subject: Integrated: 8324840: windows-x64-slowdebug does not build anymore after JDK-8317572 In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 13:12:42 GMT, Thomas Schatzl wrote: > Hi all, > > after [JDK-8317572](https://bugs.openjdk.org/browse/JDK-8317572) windows-x64-slowdebug does not build any more with teh following error: > > ``` > [2024-01-29T12:19:47,671Z] Creating support/native/java.desktop/libjsound/static/jsound.lib from 17 file(s) > [2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found > [2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found > [2024-01-29T12:24:59,378Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj' failed > [2024-01-29T12:24:59,378Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj] Error 1 > [2024-01-29T12:24:59,378Z] make[3]: *** Waiting for unfinished jobs.... > [2024-01-29T12:25:02,568Z] make/Main.gmk:261: recipe for target 'hotspot-server-static-libs' failed > [2024-01-29T12:25:02,568Z] make[2]: *** [hotspot-server-static-libs] Error 2 > [2024-01-29T12:25:02,568Z] make[2]: *** Waiting for unfinished jobs.... 
> [2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found > [2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found > [2024-01-29T12:25:03,412Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj' failed > [2024-01-29T12:25:03,412Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj] Error 1 > ``` > > This change adds the necessary includes. > > Still compiling the change.... > > Thomas Thanks for your review. Since after this change the build seems to complete, I'll push this right away. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17613#issuecomment-1914743284 From tschatzl at openjdk.org Mon Jan 29 13:57:56 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 29 Jan 2024 13:57:56 GMT Subject: Integrated: 8324840: windows-x64-slowdebug does not build anymore after JDK-8317572 In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 13:12:42 GMT, Thomas Schatzl wrote: > Hi all, > > after [JDK-8317572](https://bugs.openjdk.org/browse/JDK-8317572) windows-x64-slowdebug does not build any more with teh following error: > > ``` > [2024-01-29T12:19:47,671Z] Creating support/native/java.desktop/libjsound/static/jsound.lib from 17 file(s) > [2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found > [2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found > [2024-01-29T12:24:59,378Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj' failed > [2024-01-29T12:24:59,378Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj] Error 1 > 
[2024-01-29T12:24:59,378Z] make[3]: *** Waiting for unfinished jobs.... > [2024-01-29T12:25:02,568Z] make/Main.gmk:261: recipe for target 'hotspot-server-static-libs' failed > [2024-01-29T12:25:02,568Z] make[2]: *** [hotspot-server-static-libs] Error 2 > [2024-01-29T12:25:02,568Z] make[2]: *** Waiting for unfinished jobs.... > [2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found > [2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found > [2024-01-29T12:25:03,412Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj' failed > [2024-01-29T12:25:03,412Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj] Error 1 > ``` > > This change adds the necessary includes. > > Still compiling the change.... > > Thomas This pull request has now been integrated. Changeset: fe0eec7e Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/fe0eec7e20bc4c39d6c2b58d81ffd5c0ef1fdeda Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod 8324840: windows-x64-slowdebug does not build anymore after JDK-8317572 Reviewed-by: epeter ------------- PR: https://git.openjdk.org/jdk/pull/17613 From coleenp at openjdk.org Mon Jan 29 14:02:50 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 29 Jan 2024 14:02:50 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v3] In-Reply-To: References: Message-ID: > If this is sufficient, here's the change for NULL to nullptr, adjusting some obvious strings that had NULL in them maybe not all. 
> Tested with gtest, Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: - Removed some casts unneeded by nullptr change, and fixed one comment - Removed some casts unneeded by nullptr change, and fixed one comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17577/files - new: https://git.openjdk.org/jdk/pull/17577/files/d8e3d92d..cf30de3d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17577&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17577&range=01-02 Stats: 25 lines in 20 files changed: 0 ins; 0 del; 25 mod Patch: https://git.openjdk.org/jdk/pull/17577.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17577/head:pull/17577 PR: https://git.openjdk.org/jdk/pull/17577 From jwaters at openjdk.org Mon Jan 29 14:07:35 2024 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 29 Jan 2024 14:07:35 GMT Subject: Integrated: 8324840: windows-x64-slowdebug does not build anymore after JDK-8317572 In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 13:12:42 GMT, Thomas Schatzl wrote: > Hi all, > > after [JDK-8317572](https://bugs.openjdk.org/browse/JDK-8317572) windows-x64-slowdebug does not build any more with teh following error: > > ``` > [2024-01-29T12:19:47,671Z] Creating support/native/java.desktop/libjsound/static/jsound.lib from 17 file(s) > [2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found > [2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found > [2024-01-29T12:24:59,378Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj' failed > [2024-01-29T12:24:59,378Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj] Error 1 > [2024-01-29T12:24:59,378Z] make[3]: *** Waiting for 
unfinished jobs.... > [2024-01-29T12:25:02,568Z] make/Main.gmk:261: recipe for target 'hotspot-server-static-libs' failed > [2024-01-29T12:25:02,568Z] make[2]: *** [hotspot-server-static-libs] Error 2 > [2024-01-29T12:25:02,568Z] make[2]: *** Waiting for unfinished jobs.... > [2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found > [2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found > [2024-01-29T12:25:03,412Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj' failed > [2024-01-29T12:25:03,412Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj] Error 1 > ``` > > This change adds the necessary includes. > > Still compiling the change.... > > Thomas Shouldn't this have used strtok_s? https://github.com/openjdk/jdk/blob/fe0eec7e20bc4c39d6c2b58d81ffd5c0ef1fdeda/src/hotspot/share/runtime/os.hpp#L1014 ------------- PR Comment: https://git.openjdk.org/jdk/pull/17613#issuecomment-1914761008 From coleenp at openjdk.org Mon Jan 29 14:16:48 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 29 Jan 2024 14:16:48 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v3] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 14:02:50 GMT, Coleen Phillimore wrote: >> If this is sufficient, here's the change for NULL to nullptr, adjusting some obvious strings that had NULL in them maybe not all. >> Tested with gtest, > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - Removed some casts unneeded by nullptr change, and fixed one comment > - Removed some casts unneeded by nullptr change, and fixed one comment Thanks Julian, Kim and David for the reviews. 
I fixed the null casts that you pointed out and the one comment, and fixed my copyright script (so updated the copyrights). I'll wait for GHA to integrate. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17577#issuecomment-1914778925 From aboldtch at openjdk.org Mon Jan 29 14:16:51 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 29 Jan 2024 14:16:51 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v14] In-Reply-To: References: <-ARL6dSZSoFCjrdSJpaqyAQcuoq0Rc9vKbUYkE-XeD4=.0fcd9ea4-0d4b-4b8f-8cae-20d41a8a3867@github.com> Message-ID: On Fri, 26 Jan 2024 18:08:33 GMT, Daniel D. Daugherty wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Update variable names in ad files >> - Preload markWord unconditionally >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Add more expressive stub continuation names >> - Remove outdated anonymous owner fix in stub >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Remove C2HandleAnonOMOwnerStub definitions on x86. >> - Add MFENCE comment >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - ... and 11 more: https://git.openjdk.org/jdk/compare/f944f1ec...4d37c4b7 > > src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 141: > >> 139: >> 140: __ bind(fix_zf_and_unlocked); >> 141: __ xorl(rax, rax); > > Just curious: why use `xorl` here and `xorptr` on L135 above? Legacy. `xorl` is what `fast_unlock` used to set the zero flag. `xorl` is there only to set the zero flag. `xorptr` is there to store `nullptr` in `rax` (the expected value for the CAS). 
Maybe it doesn't matter if x64 used `xorq` to set the zero flag. There are many other ways to do it as well. `fast_unlock` also uses `testl` with a 0 immediate in one place. If there is a recommended way to set the zero flag we can change it to that. I could not find a definitive answer to what is best to use. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1469652978 From epeter at openjdk.org Mon Jan 29 14:16:55 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 29 Jan 2024 14:16:55 GMT Subject: Integrated: 8324840: windows-x64-slowdebug does not build anymore after JDK-8317572 In-Reply-To: References: Message-ID: <6Jdrz-dA54FUe7yEdiQ0gMD_KTV7oUujwA1mdTEb2nA=.2bf475aa-0fb0-4938-bd3e-9c047546455a@github.com> On Mon, 29 Jan 2024 14:04:19 GMT, Julian Waters wrote: >> Hi all, >> >> after [JDK-8317572](https://bugs.openjdk.org/browse/JDK-8317572) windows-x64-slowdebug does not build any more with the following error: >> >> ``` >> [2024-01-29T12:19:47,671Z] Creating support/native/java.desktop/libjsound/static/jsound.lib from 17 file(s) >> [2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found >> [2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found >> [2024-01-29T12:24:59,378Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj' failed >> [2024-01-29T12:24:59,378Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj] Error 1 >> [2024-01-29T12:24:59,378Z] make[3]: *** Waiting for unfinished jobs.... >> [2024-01-29T12:25:02,568Z] make/Main.gmk:261: recipe for target 'hotspot-server-static-libs' failed >> [2024-01-29T12:25:02,568Z] make[2]: *** [hotspot-server-static-libs] Error 2 >> [2024-01-29T12:25:02,568Z] make[2]: *** Waiting for unfinished jobs....
>> [2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found >> [2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found >> [2024-01-29T12:25:03,412Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj' failed >> [2024-01-29T12:25:03,412Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj] Error 1 >> ``` >> >> This change adds the necessary includes. >> >> Still compiling the change.... >> >> Thomas > > Shouldn't this have used strtok_s? > https://github.com/openjdk/jdk/blob/fe0eec7e20bc4c39d6c2b58d81ffd5c0ef1fdeda/src/hotspot/share/runtime/os.hpp#L1014 @TheShermanTanker maybe so, have not looked into that. I only moved code in [JDK-8317572](https://bugs.openjdk.org/browse/JDK-8317572), which led to include issues here. @TheShermanTanker does the code you quote not exactly make sure that if we write `strtok_r` on windows, that we actually use `strtok_s`, and so everything is ok? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17613#issuecomment-1914773536 PR Comment: https://git.openjdk.org/jdk/pull/17613#issuecomment-1914778290 From coleenp at openjdk.org Mon Jan 29 14:26:38 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 29 Jan 2024 14:26:38 GMT Subject: RFR: 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files [v5] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 13:47:10 GMT, Coleen Phillimore wrote: >> This mechanically replaces NULL with nullptr in hpp/cpp native files in test native code. This didn't attempt to change NULL in comments to say null because nullptr is generally the right thing for the comment to say. It does attempt to change NULL to "null" rather than "nullptr" in strings. 
Any changes for "nullptr" to "null" in comments can be changed in a future RFE in a smaller patch. I didn't see any when it was scrolling by to make my script more complicated. >> >> Ran tier1-4 testing. > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - Fix some casts unnecessary with nullptr > - Fix copyrights Thanks Kevin, Kim and David for wading through this. If there are other changes we can address them separately preserving your eyeballs. My copyright script was broken so I fixed it. I'll wait for GHA to make sure I didn't break anything before integrating. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17593#issuecomment-1914798074 From aboldtch at openjdk.org Mon Jan 29 14:37:51 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 29 Jan 2024 14:37:51 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v14] In-Reply-To: References: <-ARL6dSZSoFCjrdSJpaqyAQcuoq0Rc9vKbUYkE-XeD4=.0fcd9ea4-0d4b-4b8f-8cae-20d41a8a3867@github.com> Message-ID: On Fri, 26 Jan 2024 21:01:54 GMT, Daniel D. Daugherty wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Update variable names in ad files >> - Preload markWord unconditionally >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Add more expressive stub continuation names >> - Remove outdated anonymous owner fix in stub >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Remove C2HandleAnonOMOwnerStub definitions on x86. >> - Add MFENCE comment >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - ... 
and 11 more: https://git.openjdk.org/jdk/compare/3ef03486...4d37c4b7 > > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1061: > >> 1059: stub = new (Compile::current()->comp_arena()) C2FastUnlockLightweightStub(obj, mark, reg_rax, thread); >> 1060: Compile::current()->output()->add_stub(stub); >> 1061: } > > So what happens if `stub` doesn't get generated? When `in_scratch_emit_size` is true we are only emitting to calculate the size of the code. So only dummy labels are used for jumps and no stub is allocated nor registered. Later when the final emission occurs (`in_scratch_emit_size` is false) the stub is allocated and registered (so that it can be emitted later with the stubs) and the correct labels are used (and bound). Stubs are always generated for the final emission. As for the stub failing to get allocated, I believe that the arena allocator uses `AllocFailStrategy::EXIT_OOM`, unsure what happens when the compiler arena is exhausted. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1469681630 From tschatzl at openjdk.org Mon Jan 29 14:40:39 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 29 Jan 2024 14:40:39 GMT Subject: RFR: 8324771: Obsolete RAMFraction related flags [v2] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 12:42:46 GMT, Albert Mingkun Yang wrote: >> Simple obsoleting four related deprecated jvm flags. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > year Lgtm. One superfluous additional newline. src/hotspot/share/runtime/arguments.cpp line 497: > 495: // -------------- Deprecated Flags -------------- > 496: // --- Non-alias flags - sorted by obsolete_in then expired_in: > 497: Suggestion: Seems unnecessary to introduce a newline here. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17592#pullrequestreview-1848831102 PR Review Comment: https://git.openjdk.org/jdk/pull/17592#discussion_r1469686840 From aboldtch at openjdk.org Mon Jan 29 14:40:50 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 29 Jan 2024 14:40:50 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v14] In-Reply-To: References: <-ARL6dSZSoFCjrdSJpaqyAQcuoq0Rc9vKbUYkE-XeD4=.0fcd9ea4-0d4b-4b8f-8cae-20d41a8a3867@github.com> Message-ID: On Fri, 26 Jan 2024 21:23:40 GMT, Daniel D. Daugherty wrote: >> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Update variable names in ad files >> - Preload markWord unconditionally >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Add more expressive stub continuation names >> - Remove outdated anonymous owner fix in stub >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - Remove C2HandleAnonOMOwnerStub definitions on x86. >> - Add MFENCE comment >> - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 >> - ... and 11 more: https://git.openjdk.org/jdk/compare/c61122da...4d37c4b7 > > src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp line 142: > >> 140: #else >> 141: // This relies on the implementation of lightweight_unlock knowing that it >> 142: // will clobber its thread when using EAX. > > This use of `EAX` is confusing when earlier in this function `rax` is used. Hm, I thought RAX was confusing when used in x86_32 code. But can try to clarify. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1469687375 From aboldtch at openjdk.org Mon Jan 29 14:50:57 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 29 Jan 2024 14:50:57 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v15] In-Reply-To: References: Message-ID: > Implements the x86 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. > > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The x86 C2 port also has some extra oddities. > > The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. > > The fast path is written to avoid going through conditional branches. 
This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. > > The contended unlock was also moved to the code stub. Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: - Update MacroAssembler::lightweight_unlock comments. - Update comments and asserts ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16607/files - new: https://git.openjdk.org/jdk/pull/16607/files/4d37c4b7..d845e31d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=13-14 Stats: 14 lines in 5 files changed: 3 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/16607.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16607/head:pull/16607 PR: https://git.openjdk.org/jdk/pull/16607 From aboldtch at openjdk.org Mon Jan 29 14:50:57 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 29 Jan 2024 14:50:57 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v14] In-Reply-To: References: <-ARL6dSZSoFCjrdSJpaqyAQcuoq0Rc9vKbUYkE-XeD4=.0fcd9ea4-0d4b-4b8f-8cae-20d41a8a3867@github.com> Message-ID: On Mon, 29 Jan 2024 14:38:02 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp line 142: >> >>> 140: #else >>> 141: // This relies on the implementation of lightweight_unlock knowing that it >>> 142: // will clobber its thread when using EAX. >> >> This use of `EAX` is confusing when earlier in this function `rax` is used. > > Hm, I thought RAX was confusing when used in x86_32 code. But can try to clarify. Tried to reword and restructure the comments surrounding this a bit. 
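
Editorial sketch, not taken from the PR: the "check the lock stack first" step described in the summary above can be illustrated with a small model. `LockStackSketch`, its capacity of 8, and all method names are hypothetical and are not HotSpot's actual `LockStack` API; in particular, the rule that a recursive exit requires the top two entries to match is an assumption of this sketch. Only when these recursive checks fail would the real code go on to inspect the mark word.

```cpp
#include <cassert>

// Hypothetical, simplified model of a per-thread lock stack.
// It is NOT HotSpot's LockStack class; names and capacity are illustrative.
struct LockStackSketch {
  static constexpr int CAPACITY = 8;
  const void* _entries[CAPACITY];
  int _top = 0;

  bool is_full() const { return _top == CAPACITY; }

  // First (non-recursive) lock. In the real code this would follow a
  // successful mark word transition; here it only records the object.
  bool push(const void* obj) {
    if (is_full()) return false;
    _entries[_top++] = obj;
    return true;
  }

  // Recursive lock: succeeds only if obj is already on top of the stack.
  bool try_recursive_enter(const void* obj) {
    if (_top == 0 || is_full() || _entries[_top - 1] != obj) return false;
    _entries[_top++] = obj;
    return true;
  }

  // Recursive unlock: succeeds only if the top two entries are both obj,
  // so that obj is still held after the pop (assumption of this sketch).
  bool try_recursive_exit(const void* obj) {
    if (_top < 2 || _entries[_top - 1] != obj || _entries[_top - 2] != obj) {
      return false;
    }
    _top--;
    return true;
  }
};
```

A caller would try `try_recursive_enter` or `try_recursive_exit` first and fall back to the mark word path (or the runtime) only when they return false.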
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1469702789 From jwaters at openjdk.org Mon Jan 29 15:04:53 2024 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 29 Jan 2024 15:04:53 GMT Subject: Integrated: 8324840: windows-x64-slowdebug does not build anymore after JDK-8317572 In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 14:04:19 GMT, Julian Waters wrote: >> Hi all, >> >> after [JDK-8317572](https://bugs.openjdk.org/browse/JDK-8317572) windows-x64-slowdebug does not build any more with teh following error: >> >> ``` >> [2024-01-29T12:19:47,671Z] Creating support/native/java.desktop/libjsound/static/jsound.lib from 17 file(s) >> [2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found >> [2024-01-29T12:24:59,378Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found >> [2024-01-29T12:24:59,378Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj' failed >> [2024-01-29T12:24:59,378Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/static/logLevel.obj] Error 1 >> [2024-01-29T12:24:59,378Z] make[3]: *** Waiting for unfinished jobs.... >> [2024-01-29T12:25:02,568Z] make/Main.gmk:261: recipe for target 'hotspot-server-static-libs' failed >> [2024-01-29T12:25:02,568Z] make[2]: *** [hotspot-server-static-libs] Error 2 >> [2024-01-29T12:25:02,568Z] make[2]: *** Waiting for unfinished jobs.... 
>> [2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(65): error C3861: 'strtok_r': identifier not found >> [2024-01-29T12:25:03,412Z] ...\open\src\hotspot\share\utilities/stringUtils.hpp(73): error C3861: 'strtok_r': identifier not found >> [2024-01-29T12:25:03,412Z] lib/CompileJvm.gmk:152: recipe for target '.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj' failed >> [2024-01-29T12:25:03,412Z] make[3]: *** [.../windows-x64-slowdebug/hotspot/variant-server/libjvm/objs/logLevel.obj] Error 1 >> ``` >> >> This change adds the necessary includes. >> >> Still compiling the change.... >> >> Thomas > > Shouldn't this have used strtok_s? > https://github.com/openjdk/jdk/blob/fe0eec7e20bc4c39d6c2b58d81ffd5c0ef1fdeda/src/hotspot/share/runtime/os.hpp#L1014 > @TheShermanTanker does the code you quote not exactly make sure that if we write `strtok_r` on windows, that we actually use `strtok_s`, and so everything is ok? From what I can see all it does is alias strtok_r calls to strtok_s yes ------------- PR Comment: https://git.openjdk.org/jdk/pull/17613#issuecomment-1914874582 From jwaters at openjdk.org Mon Jan 29 15:09:44 2024 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 29 Jan 2024 15:09:44 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v3] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 14:02:50 GMT, Coleen Phillimore wrote: >> If this is sufficient, here's the change for NULL to nullptr, adjusting some obvious strings that had NULL in them maybe not all. >> Tested with gtest, > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - Removed some casts unneeded by nullptr change, and fixed one comment > - Removed some casts unneeded by nullptr change, and fixed one comment Marked as reviewed by jwaters (Committer).
------------- PR Review: https://git.openjdk.org/jdk/pull/17577#pullrequestreview-1848906061 From mbaesken at openjdk.org Mon Jan 29 15:16:48 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 29 Jan 2024 15:16:48 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v5] In-Reply-To: References: Message-ID: > Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. > > Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. > PhysicalMemory could be enhanced or a new event added. > > There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and > Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Linux: container support in os::total_swap_space ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17581/files - new: https://git.openjdk.org/jdk/pull/17581/files/929207eb..e3bcb12a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=03-04 Stats: 12 lines in 1 file changed: 6 ins; 2 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/17581.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17581/head:pull/17581 PR: https://git.openjdk.org/jdk/pull/17581 From ayang at openjdk.org Mon Jan 29 15:19:06 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 29 Jan 2024 15:19:06 GMT Subject: RFR: 8324771: Obsolete RAMFraction related flags [v3] In-Reply-To: References: Message-ID: <9OkBLuBpwhNQVY4Acs2mGBWg8gb3WRnuHW9jXJ0BeLY=.e148d2a6-f07d-4da5-b44a-b170b9bd833b@github.com> > Simple obsoleting four related deprecated jvm flags. 
Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: Update src/hotspot/share/runtime/arguments.cpp Co-authored-by: Thomas Schatzl <59967451+tschatzl at users.noreply.github.com> ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17592/files - new: https://git.openjdk.org/jdk/pull/17592/files/ee96cc7a..7d421d60 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17592&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17592&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17592.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17592/head:pull/17592 PR: https://git.openjdk.org/jdk/pull/17592 From sgehwolf at openjdk.org Mon Jan 29 15:23:45 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 29 Jan 2024 15:23:45 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v5] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 15:16:48 GMT, Matthias Baesken wrote: >> Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. >> >> Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. >> PhysicalMemory could be enhanced or a new event added. >> >> There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and >> Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Linux: container support in os::total_swap_space I think this should also go to `hotspot-jfr-dev` (using `/label add`). 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17581#issuecomment-1914918287 From sgehwolf at openjdk.org Mon Jan 29 15:37:44 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 29 Jan 2024 15:37:44 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v5] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 15:16:48 GMT, Matthias Baesken wrote: >> Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. >> >> Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. >> PhysicalMemory could be enhanced or a new event added. >> >> There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and >> Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Linux: container support in os::total_swap_space src/hotspot/os/linux/os_linux.cpp line 294: > 292: jlong os::total_swap_space() { > 293: if (OSContainer::is_containerized()) { > 294: return (jlong)(OSContainer::memory_and_swap_limit_in_bytes() - OSContainer::memory_limit_in_bytes()); Shouldn't we check if `OSContainer::memory_limit_in_bytes() > 0` here first? src/hotspot/os/linux/os_linux.cpp line 313: > 311: } > 312: return (jlong)(si.freeswap * si.mem_unit); > 313: } In a containerized environment with some memory limit this could potentially return a large value for `free_swap_space()`, and a small(er) value for `total_swap_space()`. i.e. `total_swap_space() < free_swap_space()`. Please return `-1` if the containerized value is not supported. 
Better yet, push the implementation to `Linux::free_swap_space()` and `Linux::total_swap_space()` which always returns host swap values and do the (container) wrappers here in `os::free_swap_space()` and `os::total_swap_space()`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17581#discussion_r1469776435 PR Review Comment: https://git.openjdk.org/jdk/pull/17581#discussion_r1469773728 From coleenp at openjdk.org Mon Jan 29 17:10:42 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 29 Jan 2024 17:10:42 GMT Subject: RFR: 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files [v5] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 13:47:10 GMT, Coleen Phillimore wrote: >> This mechanically replaces NULL with nullptr in hpp/cpp native files in test native code. This didn't attempt to change NULL in comments to say null because nullptr is generally the right thing for the comment to say. It does attempt to change NULL to "null" rather than "nullptr" in strings. Any changes for "nullptr" to "null" in comments can be changed in a future RFE in a smaller patch. I didn't see any when it was scrolling by to make my script more complicated. >> >> Ran tier1-4 testing. > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - Fix some casts unnecessary with nullptr > - Fix copyrights macos-aarch64 build failure in GHA appears unrelated, internal testing passed. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17593#issuecomment-1915186403 From coleenp at openjdk.org Mon Jan 29 17:10:43 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 29 Jan 2024 17:10:43 GMT Subject: Integrated: 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files In-Reply-To: References: Message-ID: On Fri, 26 Jan 2024 16:40:32 GMT, Coleen Phillimore wrote: > This mechanically replaces NULL with nullptr in hpp/cpp native files in test native code. This didn't attempt to change NULL in comments to say null because nullptr is generally the right thing for the comment to say. It does attempt to change NULL to "null" rather than "nullptr" in strings. Any changes for "nullptr" to "null" in comments can be changed in a future RFE in a smaller patch. I didn't see any when it was scrolling by to make my script more complicated. > > Ran tier1-4 testing. This pull request has now been integrated. Changeset: a6bdee48 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/a6bdee48f39993128d8095d40ab417f0102af0f4 Stats: 8218 lines in 750 files changed: 0 ins; 7 del; 8211 mod 8324681: Replace NULL with nullptr in HotSpot jtreg test native code files Reviewed-by: kevinw, kbarrett, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/17593 From coleenp at openjdk.org Mon Jan 29 17:15:51 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 29 Jan 2024 17:15:51 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v3] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 14:02:50 GMT, Coleen Phillimore wrote: >> If this is sufficient, here's the change for NULL to nullptr, adjusting some obvious strings that had NULL in them maybe not all. 
>> Tested with gtest, > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - Removed some casts unneeded by nullptr change, and fixed one comment > - Removed some casts unneeded by nullptr change, and fixed one comment Thanks also for reviewing, Julian. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17577#issuecomment-1915193394 From coleenp at openjdk.org Mon Jan 29 17:15:52 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 29 Jan 2024 17:15:52 GMT Subject: Integrated: 8324678: Replace NULL with nullptr in HotSpot gtests In-Reply-To: References: Message-ID: <4kCNphY2XNIVEV554EEM6Jm8DcrnPf5Nxw2xpgLFmJ4=.26357974-8f4e-4667-942c-daea3db044e9@github.com> On Thu, 25 Jan 2024 21:35:29 GMT, Coleen Phillimore wrote: > If this is sufficient, here's the change for NULL to nullptr, adjusting some obvious strings that had NULL in them maybe not all. > Tested with gtest, This pull request has now been integrated. Changeset: c1281e6b Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/c1281e6b45ed167df69d29a6039d81854c145ae6 Stats: 580 lines in 74 files changed: 0 ins; 0 del; 580 mod 8324678: Replace NULL with nullptr in HotSpot gtests Reviewed-by: kbarrett, dholmes, jwaters ------------- PR: https://git.openjdk.org/jdk/pull/17577 From coleenp at openjdk.org Mon Jan 29 17:20:46 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 29 Jan 2024 17:20:46 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 16:25:08 GMT, Fredrik Bredberg wrote: > The Java options OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none". 
> > However, some CPUs change the default SpinPause instruction to something better if the user hasn't used the OnSpinWaitInst option. For instance if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs. > > This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb". > > Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64. It seems to me that we should just change the default to 'yield' and let other platforms determine the best tuning, since this might be the cause of aarch64 regression in JDK-8324221. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1915206782 From aph at openjdk.org Mon Jan 29 17:27:44 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 29 Jan 2024 17:27:44 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 17:18:14 GMT, Coleen Phillimore wrote: > 8324221. Arm tell us that 'yield' is basically implemented as a nop. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1915218158 From kvn at openjdk.org Mon Jan 29 17:58:41 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 29 Jan 2024 17:58:41 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 04:25:36 GMT, David Holmes wrote: > > The unlocking was done in incorrect order starting from outermost monitor which causes this assert when we unlock following nested monitor (the same object) - it sees that it was already unlocked. 
> > Just to clarify my understanding here, when we iterate the list of monitors, the `BasicObjectLock` is different depending on whether the object is initially locked or recursively locked - is that the case? Correct. Recursive lock has different `BasicObjectLock`: Recursive: frame #0: 0x0000000104241e6c libjvm.dylib`BasicObjectLock::obj(this=0x000060000218ded0) const at basicLock.hpp:72:64 Initial: frame #0: 0x0000000104241e6c libjvm.dylib`BasicObjectLock::obj(this=0x000060000218deb0) const at basicLock.hpp:72:64 ------------- PR Comment: https://git.openjdk.org/jdk/pull/17600#issuecomment-1915269219 From lmesnik at openjdk.org Mon Jan 29 18:39:03 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 29 Jan 2024 18:39:03 GMT Subject: RFR: 8324861: Exceptions::wrap_dynamic_exception() don't have ResourceMark Message-ID: The issue is reproduced with make -- run-test JTREG_VERBOSE=all JTREG_RETAIN=all TEST=runtime/ConstantPool/TestMethodHandleConstant.java TEST_VM_OPTS="-Xlog:all=trace:file=vm.%p.log" verified that it doesn't crash anymore. Also, run tier1 for sanity testing. 
------------- Commit messages: - 8324861 Changes: https://git.openjdk.org/jdk/pull/17620/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17620&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324861 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17620.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17620/head:pull/17620 PR: https://git.openjdk.org/jdk/pull/17620 From shade at openjdk.org Mon Jan 29 20:27:47 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 29 Jan 2024 20:27:47 GMT Subject: RFR: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates [v4] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 09:42:52 GMT, Aleksey Shipilev wrote: >> We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. >> >> There are a few interesting conversions along the way: >> 1. `intptr_t` -> `uint32_t` (this method) >> 2. `uint32_t` -> `int32_t` (argument conversion for `movl`) >> 3. `int32_t` -> `uint32_t` (in `emit_int32`) >> >> I believe these are safe after `is_uimm32` check, but please check (sic) me on this. >> >> Note that x86_64 matcher already does similar thing for immediates: >> >> >> // Long Immediate 32-bit unsigned >> operand immUL32() >> %{ >> predicate(n->get_long() == (unsigned int) (n->get_long())); >> match(ConL); >> ... >> %} >> >> instruct loadConUL32(rRegL dst, immUL32 src) >> %{ >> ... 
>> format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} >> ins_encode %{ >> __ movl($dst$$Register, $src$$constant); >> %} >> %} >> >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` >> >> Code sizes for `Hello World`, `-Xcomp`: >> >> >> # Before >> tier1 nmethod code size : 426208 bytes >> tier2 nmethod code size : 462880 bytes >> tier3 nmethod code size : 889992 bytes >> tier4 nmethod code size : 1244448 bytes >> >> # After >> tier1 nmethod code size : 425768 bytes (-0.1%) >> tier2 nmethod code size : 462400 bytes (-0.1%) >> tier3 nmethod code size : 882072 bytes (-0.8%) >> tier4 nmethod code size : 1236448 bytes (-0.6%) > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge branch 'master' into JDK-8323503-x86-movptr-unsigned > - Revert "Just do checked_cast" > > This reverts commit 3f94218b46b6b0492ffcc24404b7bb5546b3318a. > - Just do checked_cast > - Fix All right; I also run a few stressy things here, and I think we are fine. Let's see if it breaks anything else. 
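
For illustration only (not code from the PR): a minimal sketch of the three conversions listed in the description above, under the assumption that the immediate has already passed an unsigned 32-bit range check. The names `fits_uimm32`, `round_trip_imm32` and `zero_extend` are hypothetical and are not HotSpot functions; `int64_t` stands in for `intptr_t` on a 64-bit target.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the range check: true when the value
// can be represented in 32 unsigned bits.
static bool fits_uimm32(int64_t value) {
  return value >= 0 && value <= static_cast<int64_t>(UINT32_MAX);
}

// Models the conversion chain from the description:
//   1. 64-bit pointer-sized value -> uint32_t
//   2. uint32_t -> int32_t  (argument type of a movl-style emitter)
//   3. int32_t  -> uint32_t (raw 32 bits written to the code buffer)
// The caller is expected to have checked fits_uimm32(value) first.
static uint32_t round_trip_imm32(int64_t value) {
  uint32_t as_unsigned = static_cast<uint32_t>(value);       // step 1: lossless after the check
  int32_t as_signed;
  std::memcpy(&as_signed, &as_unsigned, sizeof(as_signed));  // step 2: same bits, signed view
  return static_cast<uint32_t>(as_signed);                   // step 3: modular, restores the bits
}

// Models the x86-64 behaviour the change relies on: writing a 32-bit
// register clears the upper 32 bits of the full 64-bit register.
static uint64_t zero_extend(uint32_t imm32) {
  return static_cast<uint64_t>(imm32);
}
```

With these definitions, a value such as 0x80000000 (fits in 32 unsigned bits, not in 32 signed bits) survives the chain unchanged once zero-extended, which is the property the review asks to be double-checked.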
------------- PR Comment: https://git.openjdk.org/jdk/pull/17343#issuecomment-1915501387 From shade at openjdk.org Mon Jan 29 20:27:47 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 29 Jan 2024 20:27:47 GMT Subject: Integrated: 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates In-Reply-To: References: Message-ID: <-ZMdxsMXAcURSy4GpclOHcg3D1FIg0Eg-HzXqd0wBsw=.97a7fc98-8084-4972-81f6-fe2d4ed0b445@github.com> On Wed, 10 Jan 2024 11:05:03 GMT, Aleksey Shipilev wrote: > We noticed in [JDK-8323497](https://bugs.openjdk.org/browse/JDK-8323497) that `movptr` optimization done in [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) is not covering the case of immediates that fit in 32-bit unsigned, but do not fit in 32-bit signed. In that case, we can maybe do `mov r32, imm32` and rely on x86 zero-extending 32->64 bit for us. Since `movl` encoding is smaller than sign-extending `movq`, we also save more code on most paths that [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) improved. > > There are a few interesting conversions along the way: > 1. `intptr_t` -> `uint32_t` (this method) > 2. `uint32_t` -> `int32_t` (argument conversion for `movl`) > 3. `int32_t` -> `uint32_t` (in `emit_int32`) > > I believe these are safe after `is_uimm32` check, but please check (sic) me on this. > > Note that x86_64 matcher already does similar thing for immediates: > > > // Long Immediate 32-bit unsigned > operand immUL32() > %{ > predicate(n->get_long() == (unsigned int) (n->get_long())); > match(ConL); > ... > %} > > instruct loadConUL32(rRegL dst, immUL32 src) > %{ > ... 
> format %{ "movl $dst, $src\t# long (unsigned 32-bit)" %} > ins_encode %{ > __ movl($dst$$Register, $src$$constant); > %} > %} > > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier{1,2,3,4}` > > Code sizes for `Hello World`, `-Xcomp`: > > > # Before > tier1 nmethod code size : 426208 bytes > tier2 nmethod code size : 462880 bytes > tier3 nmethod code size : 889992 bytes > tier4 nmethod code size : 1244448 bytes > > # After > tier1 nmethod code size : 425768 bytes (-0.1%) > tier2 nmethod code size : 462400 bytes (-0.1%) > tier3 nmethod code size : 882072 bytes (-0.8%) > tier4 nmethod code size : 1236448 bytes (-0.6%) This pull request has now been integrated. Changeset: e999dfcb Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/e999dfcb405962bc4d77b9740d36193f1ebe4a2c Stats: 4 lines in 2 files changed: 3 ins; 0 del; 1 mod 8323503: x86: Shorter movptr(reg, imm) for 32-bit unsigned immediates Reviewed-by: stuefe, kvn, eastigeevich ------------- PR: https://git.openjdk.org/jdk/pull/17343 From duke at openjdk.org Mon Jan 29 21:36:42 2024 From: duke at openjdk.org (duke) Date: Mon, 29 Jan 2024 21:36:42 GMT Subject: Withdrawn: 8319709: Make GrowableArrayCHeap copyable In-Reply-To: <2SEJ0Rh7DNmKgcylAW7_DFxas2Bs3YzTnUSe39OIVsI=.03298520-694f-4ba7-bdce-d1e67eb3872e@github.com> References: <2SEJ0Rh7DNmKgcylAW7_DFxas2Bs3YzTnUSe39OIVsI=.03298520-694f-4ba7-bdce-d1e67eb3872e@github.com> Message-ID: On Wed, 8 Nov 2023 13:25:00 GMT, Johan Sjölen wrote: > Hi, > > Please consider this code change which makes `GrowableArrayCHeap` copyable. The resulting copy does not share its data array with the original. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/jdk/pull/16559 From kvn at openjdk.org Mon Jan 29 22:42:52 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 29 Jan 2024 22:42:52 GMT Subject: RFR: 8324865: windows-x64-slowdebug still does not build after JDK-8324840 Message-ID: [~jwaters] commented in the PR [#17613](https://github.com/openjdk/jdk/pull/17613) that on windows we have strtok_s instead of strtok_r. We may need to #include "runtime/os.hpp" which defined strtok_r for windows instead of . Including os.hpp will increase compilation time for all files which inclide "stringUtils.hpp". I suggest to move code from .hpp to .cpp file. It is used only by c2 and inlining of this code is not too important. Tested tier1 and tier2 builds with precompiled headers off. ------------- Commit messages: - 8324865: windows-x64-slowdebug still does not build after JDK-8324840 Changes: https://git.openjdk.org/jdk/pull/17622/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17622&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324865 Stats: 27 lines in 2 files changed: 14 ins; 11 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17622.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17622/head:pull/17622 PR: https://git.openjdk.org/jdk/pull/17622 From dholmes at openjdk.org Mon Jan 29 22:51:41 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 29 Jan 2024 22:51:41 GMT Subject: RFR: 8324865: windows-x64-slowdebug still does not build after JDK-8324840 In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 22:38:27 GMT, Vladimir Kozlov wrote: > [~jwaters] commented in the PR [#17613](https://github.com/openjdk/jdk/pull/17613) that on windows we have strtok_s instead of strtok_r. We may need to #include "runtime/os.hpp" which defined strtok_r for windows instead of . > > Including os.hpp will increase compilation time for all files which inclide "stringUtils.hpp". > I suggest to move code from .hpp to .cpp file. 
It is used only by c2 and inlining of this code is not too important. > > Tested tier1 and tier2 builds with precompiled headers off. I think the "right" fix here would be to define `os::strtok_r` which calls `strtok_s` on Windows and `strtok_r` elsewhere. The quickest/simplest fix would be to just add this to `stringUtils.hpp`: #ifdef _WINDOWS // strtok_s is the Windows thread-safe equivalent of POSIX strtok_r # define strtok_r strtok_s #endif ------------- PR Comment: https://git.openjdk.org/jdk/pull/17622#issuecomment-1915710214 From kvn at openjdk.org Mon Jan 29 22:58:42 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 29 Jan 2024 22:58:42 GMT Subject: RFR: 8324865: windows-x64-slowdebug still does not build after JDK-8324840 In-Reply-To: References: Message-ID: <1xUdnTdNFVm1sn7Q8WWPbsZ5yMClH5IlEADFpRZuJO8=.0e4d82b0-0c32-47c8-8257-b3dfa8a6b0a9@github.com> On Mon, 29 Jan 2024 22:48:39 GMT, David Holmes wrote: >> [~jwaters] commented in the PR [#17613](https://github.com/openjdk/jdk/pull/17613) that on windows we have strtok_s instead of strtok_r. We may need to #include "runtime/os.hpp" which defined strtok_r for windows instead of . >> >> Including os.hpp will increase compilation time for all files which inclide "stringUtils.hpp". >> I suggest to move code from .hpp to .cpp file. It is used only by c2 and inlining of this code is not too important. >> >> Tested tier1 and tier2 builds with precompiled headers off. > > I think the "right" fix here would be to define `os::strtok_r` which calls `strtok_s` on Windows and `strtok_r` elsewhere. > > The quickest/simplest fix would be to just add this to `stringUtils.hpp`: > > #ifdef _WINDOWS > // strtok_s is the Windows thread-safe equivalent of POSIX strtok_r > # define strtok_r strtok_s > #endif Thank you @dholmes-ora. Yes, simplest solution is copy definition from os.hpp. 
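
For illustration only (not code from the PR): a minimal sketch of how the alias suggested above can be used, assuming a hypothetical helper `split_on_commas` that is not part of HotSpot. `_WINDOWS` is the macro quoted in the suggestion; a standalone MSVC build outside the JDK build system may need to test `_WIN32` instead. On other platforms the macro is not defined and POSIX `strtok_r` is called directly.

```cpp
#include <cstring>
#include <string>
#include <vector>

#ifdef _WINDOWS
// strtok_s is the Windows thread-safe equivalent of POSIX strtok_r
#  define strtok_r strtok_s
#endif

// Hypothetical helper: splits a writable, NUL-terminated buffer on commas.
// The buffer is modified in place, as with any strtok-style tokenizer.
static std::vector<std::string> split_on_commas(char* buffer) {
  std::vector<std::string> tokens;
  char* save_ptr = nullptr;  // tokenizer state lives with the caller, not in a hidden static
  for (char* tok = strtok_r(buffer, ",", &save_ptr);
       tok != nullptr;
       tok = strtok_r(nullptr, ",", &save_ptr)) {
    tokens.emplace_back(tok);
  }
  return tokens;
}
```

Because the state pointer is passed explicitly, two threads (or two nested loops) can tokenize different buffers at the same time, which is the reason for preferring `strtok_r`/`strtok_s` over plain `strtok`. Note that consecutive delimiters are skipped, so empty fields are not reported.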
------------- PR Comment: https://git.openjdk.org/jdk/pull/17622#issuecomment-1915719210 From kvn at openjdk.org Mon Jan 29 23:52:54 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 29 Jan 2024 23:52:54 GMT Subject: RFR: 8324865: windows-x64-slowdebug still does not build after JDK-8324840 [v2] In-Reply-To: References: Message-ID: > [~jwaters] commented in the PR [#17613](https://github.com/openjdk/jdk/pull/17613) that on windows we have strtok_s instead of strtok_r. We may need to #include "runtime/os.hpp" which defined strtok_r for windows instead of . > > Including os.hpp will increase compilation time for all files which inclide "stringUtils.hpp". > I suggest to move code from .hpp to .cpp file. It is used only by c2 and inlining of this code is not too important. > > Tested tier1 and tier2 builds with precompiled headers off. Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Update fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17622/files - new: https://git.openjdk.org/jdk/pull/17622/files/e3b346ff..e22c1f0a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17622&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17622&range=00-01 Stats: 30 lines in 2 files changed: 14 ins; 14 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17622.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17622/head:pull/17622 PR: https://git.openjdk.org/jdk/pull/17622 From kvn at openjdk.org Mon Jan 29 23:52:54 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 29 Jan 2024 23:52:54 GMT Subject: RFR: 8324865: windows-x64-slowdebug still does not build after JDK-8324840 In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 22:48:39 GMT, David Holmes wrote: >> [~jwaters] commented in the PR [#17613](https://github.com/openjdk/jdk/pull/17613) that on windows we have strtok_s instead of strtok_r. 
We may need to #include "runtime/os.hpp" which defined strtok_r for windows instead of . >> >> Including os.hpp will increase compilation time for all files which inclide "stringUtils.hpp". >> I suggest to move code from .hpp to .cpp file. It is used only by c2 and inlining of this code is not too important. >> >> Tested tier1 and tier2 builds with precompiled headers off. > > I think the "right" fix here would be to define `os::strtok_r` which calls `strtok_s` on Windows and `strtok_r` elsewhere. > > The quickest/simplest fix would be to just add this to `stringUtils.hpp`: > > #ifdef _WINDOWS > // strtok_s is the Windows thread-safe equivalent of POSIX strtok_r > # define strtok_r strtok_s > #endif @dholmes-ora, I used your suggestion and tested it (passed). ------------- PR Comment: https://git.openjdk.org/jdk/pull/17622#issuecomment-1915777540 From dholmes at openjdk.org Tue Jan 30 00:37:25 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 30 Jan 2024 00:37:25 GMT Subject: RFR: 8324865: windows-x64-slowdebug still does not build after JDK-8324840 [v2] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 23:52:54 GMT, Vladimir Kozlov wrote: >> [~jwaters] commented in the PR [#17613](https://github.com/openjdk/jdk/pull/17613) that on windows we have strtok_s instead of strtok_r. We may need to #include "runtime/os.hpp" which defined strtok_r for windows instead of . >> >> Including os.hpp will increase compilation time for all files which inclide "stringUtils.hpp". >> I suggest to move code from .hpp to .cpp file. It is used only by c2 and inlining of this code is not too important. >> >> Tested tier1 and tier2 builds with precompiled headers off. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Update fix One nit else good. 
Thanks src/hotspot/share/utilities/stringUtils.hpp line 31: > 29: > 30: #ifdef _WINDOWS > 31: // strtok_s is the Windows thread-safe equivalent of POSIX strtok_r Nit: indent needs fixing ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17622#pullrequestreview-1850025506 PR Review Comment: https://git.openjdk.org/jdk/pull/17622#discussion_r1470419183 From dcubed at openjdk.org Tue Jan 30 00:58:31 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 30 Jan 2024 00:58:31 GMT Subject: RFR: 8324865: windows-x64-slowdebug still does not build after JDK-8324840 [v2] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 23:52:54 GMT, Vladimir Kozlov wrote: >> [~jwaters] commented in the PR [#17613](https://github.com/openjdk/jdk/pull/17613) that on windows we have strtok_s instead of strtok_r. We may need to #include "runtime/os.hpp" which defined strtok_r for windows instead of . >> >> Including os.hpp will increase compilation time for all files which inclide "stringUtils.hpp". >> I suggest to move code from .hpp to .cpp file. It is used only by c2 and inlining of this code is not too important. >> >> Tested tier1 and tier2 builds with precompiled headers off. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Update fix Thumbs up. Since this is a build fix, please integrate before 24 hours have passed. ------------- Marked as reviewed by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17622#pullrequestreview-1850042442 From dcubed at openjdk.org Tue Jan 30 00:58:32 2024 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Tue, 30 Jan 2024 00:58:32 GMT Subject: RFR: 8324865: windows-x64-slowdebug still does not build after JDK-8324840 [v2] In-Reply-To: References: Message-ID: On Tue, 30 Jan 2024 00:34:04 GMT, David Holmes wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Update fix > > src/hotspot/share/utilities/stringUtils.hpp line 31: > >> 29: >> 30: #ifdef _WINDOWS >> 31: // strtok_s is the Windows thread-safe equivalent of POSIX strtok_r > > Nit: indent needs fixing I'm not seeing the indent problem... Maybe my eyes are tired... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17622#discussion_r1470437366 From kvn at openjdk.org Tue Jan 30 01:11:42 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 30 Jan 2024 01:11:42 GMT Subject: RFR: 8324865: windows-x64-slowdebug still does not build after JDK-8324840 [v2] In-Reply-To: References: Message-ID: <4MrTUHEH-DI26zqp-_dD3_JJcgo8OeKdMUAg1uvRFZM=.162a7546-7873-4f4c-9294-6fd1ea825d4a@github.com> On Tue, 30 Jan 2024 00:34:04 GMT, David Holmes wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Update fix > > src/hotspot/share/utilities/stringUtils.hpp line 31: > >> 29: >> 30: #ifdef _WINDOWS >> 31: // strtok_s is the Windows thread-safe equivalent of POSIX strtok_r > > Nit: indent needs fixing @dholmes-ora Do you mean to remove 2 spaces before `//`? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17622#discussion_r1470444888 From kvn at openjdk.org Tue Jan 30 01:11:42 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 30 Jan 2024 01:11:42 GMT Subject: RFR: 8324865: windows-x64-slowdebug still does not build after JDK-8324840 [v2] In-Reply-To: <4MrTUHEH-DI26zqp-_dD3_JJcgo8OeKdMUAg1uvRFZM=.162a7546-7873-4f4c-9294-6fd1ea825d4a@github.com> References: <4MrTUHEH-DI26zqp-_dD3_JJcgo8OeKdMUAg1uvRFZM=.162a7546-7873-4f4c-9294-6fd1ea825d4a@github.com> Message-ID: On Tue, 30 Jan 2024 01:03:37 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/utilities/stringUtils.hpp line 31: >> >>> 29: >>> 30: #ifdef _WINDOWS >>> 31: // strtok_s is the Windows thread-safe equivalent of POSIX strtok_r >> >> Nit: indent needs fixing > > @dholmes-ora Do you mean to remove 2 spaces before `//`? This code is copy-pasted from os.hpp. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17622#discussion_r1470446421 From kvn at openjdk.org Tue Jan 30 01:11:42 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 30 Jan 2024 01:11:42 GMT Subject: RFR: 8324865: windows-x64-slowdebug still does not build after JDK-8324840 [v2] In-Reply-To: References: <4MrTUHEH-DI26zqp-_dD3_JJcgo8OeKdMUAg1uvRFZM=.162a7546-7873-4f4c-9294-6fd1ea825d4a@github.com> Message-ID: On Tue, 30 Jan 2024 01:05:36 GMT, Vladimir Kozlov wrote: >> @dholmes-ora Do you mean to remove 2 spaces before `//`? > > This code is copy-pasted from os.hpp. I am pushing it as it is since I got 2 reviews. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17622#discussion_r1470447464 From kvn at openjdk.org Tue Jan 30 01:11:43 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 30 Jan 2024 01:11:43 GMT Subject: Integrated: 8324865: windows-x64-slowdebug still does not build after JDK-8324840 In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 22:38:27 GMT, Vladimir Kozlov wrote: > [~jwaters] commented in the PR [#17613](https://github.com/openjdk/jdk/pull/17613) that on windows we have strtok_s instead of strtok_r. We may need to #include "runtime/os.hpp" which defined strtok_r for windows instead of . > > Including os.hpp will increase compilation time for all files which inclide "stringUtils.hpp". > I suggest to move code from .hpp to .cpp file. It is used only by c2 and inlining of this code is not too important. > > Tested tier1 and tier2 builds with precompiled headers off. This pull request has now been integrated. Changeset: b6d364ad Author: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/b6d364ad88ca0e554a47ef7daba03bb07fd95b01 Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod 8324865: windows-x64-slowdebug still does not build after JDK-8324840 Reviewed-by: dholmes, dcubed ------------- PR: https://git.openjdk.org/jdk/pull/17622 From dlong at openjdk.org Tue Jan 30 02:11:40 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 30 Jan 2024 02:11:40 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: On Fri, 26 Jan 2024 21:34:44 GMT, Kevin Walls wrote: >> JavaThread's _monitor_chunks member is temporary storage used by deoptimization. >> When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. 
>> >> There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > ThreadsListHandle required for Handshake My understanding is that monitor chunks are temporary native heap storage for BasicObjectLock records that are being moved from compiled frames to interpreter frames. So the answer is, they will be found in the new interpreter frames that deoptimization pushes, assuming the monitors are not inflated in the process. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1915933365 From jiangli at openjdk.org Tue Jan 30 04:23:41 2024 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 30 Jan 2024 04:23:41 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 09:42:20 GMT, Andrew Haley wrote: > > Maybe we could live with symbol redefinition using #define (conditionally for static linking in OpenJDK, as Coleen suggested earlier) for now, until the tooling can support symbol localizing better. Then localizing symbols using tools like `objcopy` can be the longer term and cleaner solution, instead of using namespace. What's your thoughts on that? > > I suppose so, but why? > > Why should any of this have to work on old systems? If their binutils is broken, static linking of openjdk won't work there. We ran into issues with older gcc on linux-aarch for partial linking, but the problem may not be older gcc only(?). 
At the current stage, limiting static/hermetic Java runtime support to only the platforms that support partial linking and `objcopy` seems to be overly restrictive (it does simplify the requirements significantly however :-)): - The duplicate symbol problems are mostly found in JDK natives and have been resolved already. We've found very few symbol issues with hotspot code so far. As there are portable alternative solutions that can resolve the symbol issues in hotspot, choosing a less portable solution seems not too attractive currently. - As we haven't found many duplicate symbol issues with hotspot code, resolving them case by case may still be a good choice. We don't have to tie into any permanent solution during the early stage. - Based on what we learned from the static/hermetic Java prototyping and investigations, majority of the work is non-os and non-cpu specific. If we can carefully handle the platform specific part with portable solution(s), we can support static/hermetic Java for different supported platforms as a more general solution. Those are my reasonings. :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1916047210 From dholmes at openjdk.org Tue Jan 30 04:36:41 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 30 Jan 2024 04:36:41 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: <7p5HrBndOFNCb9jcKdZa9kCkzhPuQVXm-TsCRTRmBmM=.e250e2cd-0fa6-47d3-a3ec-4bd92792e16c@github.com> On Fri, 26 Jan 2024 21:34:44 GMT, Kevin Walls wrote: >> JavaThread's _monitor_chunks member is temporary storage used by deoptimization. >> When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. 
>> >> There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > ThreadsListHandle required for Handshake Okay so anything looking at monitor_chunks is looking at a moving target. They have no idea what stage of moving from compiled to interpreted frames has been reached. So examining monitor_chunks just seems inherently unsafe and totally misguided. On the other hand if you want to know about all monitors then you need to know whether this deopt is in progress or not, and prevent it from starting or wait for it to finish. But I also don't see how we examine monitors that are still in compiled frames? `is_lock_owned` does not consider them. ??? This seems completely broken. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1916058542 From eosterlund at openjdk.org Tue Jan 30 06:37:31 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 30 Jan 2024 06:37:31 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 17:24:50 GMT, Andrew Haley wrote: > > 8324221. > > > > Arm tell us that 'yield' is basically implemented as a nop. It is not doing much in their current designs, indeed. That's their (questionable?) implementation choice. New AmpereOne chips do however implement yield. Since the ISA interface gives us a yield instruction dedicated for this and at least one new chip implements it, it makes sense to me that it is the *default* instruction, rather than none, going forward. 
Then we can continue to recognize chips that didn't implement yield and try to figure out how to deal with that awkwardness separately, and hope that vendors (indeed including ARM), start implementing the ISA, instead of having us doubling down on horrible hacks that try to quack like a yield instruction. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1916168950 From dholmes at openjdk.org Tue Jan 30 07:15:42 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 30 Jan 2024 07:15:42 GMT Subject: RFR: 8324678: Replace NULL with nullptr in HotSpot gtests [v2] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 13:51:50 GMT, Coleen Phillimore wrote: >> ie, we should file a separate RFE for comments. > > And in this comment nullptr makes perfect sense. Yeah I wasn't commenting on the use of nullptr in the comment :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17577#discussion_r1470686744 From dholmes at openjdk.org Tue Jan 30 07:35:40 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 30 Jan 2024 07:35:40 GMT Subject: RFR: 8324861: Exceptions::wrap_dynamic_exception() doesn't have ResourceMark In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 18:34:02 GMT, Leonid Mesnik wrote: > The issue is reproduced with > make -- run-test JTREG_VERBOSE=all JTREG_RETAIN=all TEST=runtime/ConstantPool/TestMethodHandleConstant.java TEST_VM_OPTS="-Xlog:all=trace:file=vm.%p.log" > > verified that it doesn't crash anymore. Also, run tier1 for sanity testing. Okay. Thanks Though I think the real bug is in `klass.cpp`: // Caller needs ResourceMark void Klass::oop_print_on(oop obj, outputStream* st) { I think the RM should be there, not in the caller. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17620#pullrequestreview-1850401159 From eosterlund at openjdk.org Tue Jan 30 08:35:59 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 30 Jan 2024 08:35:59 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v5] In-Reply-To: References: Message-ID: <-oTV1FsotG3TRFl5YU7wxWgWBktlB4OtFM8CubqaHRg=.22788849-8057-4100-8ecd-85563c7e9090@github.com> > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests. 
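As a reading aid for the scheme described above (install all per-call-site data once at resolution time, and afterwards only ever change the call destination), here is a minimal, self-contained sketch. It is not the HotSpot implementation: `CallSiteData`, `CallSite`, the integer "klass id" and the three target functions are hypothetical stand-ins chosen for illustration, and a plain `std::atomic` store stands in for whatever code patching the real VM performs.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a call target; not a HotSpot type.
using Entry = int (*)(int);

// Plays the role the mail gives to CompiledICData: everything any shape of
// call might need, written once at resolution time and never changed.
struct CallSiteData {
  const int speculated_klass_id;
  explicit CallSiteData(int k) : speculated_klass_id(k) {}
};

// A call site whose only mutable part is the destination.
struct CallSite {
  const CallSiteData* const data;   // fixed for the lifetime of the site
  std::atomic<Entry> destination;   // the single word that state changes touch

  CallSite(const CallSiteData* d, Entry e) : data(d), destination(e) {}

  // A state transition is one atomic store; the data part is not involved,
  // so no two-field update has to be made atomic.
  void set_destination(Entry e) {
    destination.store(e, std::memory_order_release);
  }

  int call(int arg) const {
    return destination.load(std::memory_order_acquire)(arg);
  }
};

// Hypothetical targets standing in for "resolve", "monomorphic" and
// "megamorphic" entry points.
inline int resolve_stub(int) { return -1; }
inline int monomorphic_target(int x) { return x + 1; }
inline int megamorphic_target(int x) { return x * 2; }
```

Because the data pointer never changes, a reader that observes any destination also observes valid data, which is the property that made a separate stub for atomic two-part updates unnecessary in the description above.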
Erik Österlund has updated the pull request incrementally with two additional commits since the last revision: - Add comment - Deal with short far branches on AArch64 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17495/files - new: https://git.openjdk.org/jdk/pull/17495/files/54877772..42a21982 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17495&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17495&range=03-04 Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17495.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17495/head:pull/17495 PR: https://git.openjdk.org/jdk/pull/17495 From eosterlund at openjdk.org Tue Jan 30 08:35:59 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 30 Jan 2024 08:35:59 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v4] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 13:48:00 GMT, Doug Simon wrote: >> Erik Österlund has updated the pull request incrementally with eight additional commits since the last revision: >> >> - Batch allocate and free CompiledICData >> - JVMCI support >> - Cleanup from FYang >> - Axel suggestions >> - Suggestion from Axel >> - Use relevant global register aliases for clarity >> - Rename register aliases: holder -> data >> - Platform Comment Cleanup > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1004: > >> 1002: >> 1003: int MacroAssembler::ic_check_size() { >> 1004: return NativeInstruction::instruction_size * 7; > > This can be 5 or 7 depending on `MacroAssembler::far_jump` right? If it's 5, then who inserts the extra alignment at the end of the IC check? Good point. I sort of assumed we will always get a far jump, but maybe I'm wrong. Either way, I added a check if the target is far or not, so we can select 5 or 7 accordingly.
> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1140: > >> 1138: } >> 1139: >> 1140: void MacroAssembler::align(int modulus, int target) { > > It would be nice to document what this extra `align` function does. Good idea. I wrote a comment. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1470768117 PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1470766460 From avoitylov at openjdk.org Tue Jan 30 08:38:42 2024 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Tue, 30 Jan 2024 08:38:42 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v3] In-Reply-To: <0qUYYua-ni7uWjsMme9Rw4jSI026yAHjOOeFhPFoPZs=.b90139a4-12e9-4118-92e9-c3a982b72b34@github.com> References: <4Bqh63jS6WGdDtL3wqDZBBJkvH0TiY5vgd5mI_CQrIU=.21db910c-b979-4f9b-8749-fc99653cc670@github.com> <8QPWOBabh_N8EE6mL15EmWC91lc_gqLS3uszsJ4gW4Y=.6fe4fed8-18cc-41ae-bd5c-6a81534d7a4f@github.com> <0qUYYua-ni7uWjsMme9Rw4jSI026yAHjOOeFhPFoPZs=.b90139a4-12e9-4118-92e9-c3a982b72b34@github.com> Message-ID: <1uCU_RdSpe-2gA3HuwLDuYgDIW84Sx_AKb3WRP2P6Ag=.9ad1b164-1cc9-437a-88e3-41af02c3d050@github.com> On Mon, 29 Jan 2024 13:22:43 GMT, Erik ?sterlund wrote: >> Just did an initial read through of the PR. Just added some cleanup suggestion. Also noticed something I though looked wrong in the ARM32 port. >> >> I also went through and tried to find the handful of places in the codebase where the term `ICHolder` (or its derivatives) were still used. Put them in a separate branch to not clutter this PR. Would be nice to take this all the way and not have stale comments or naming lurking about. (Also nuked the `DECC` copy-paste-typo) >> Comment cleanups: >> f1bb02ea472eb314c93d80b830c59bd03e280116 >> >> All platforms use `data` as a register alias for the `CompileICData*` register in the `ic_check`. But c2i and itable stubs still use `holder`. Maybe go all the way here? 
>> 5422ed32def491bd1e145959b7f3c49c88cfc50e >> >> Also for PPC and s390 I think the code is easier to understand if the global inline cache register aliases these platforms have are used. But maybe that is just me. >> 39c0a7ede5187cba52d6fcf48c0852213c48c899 >> >> As for the implementation I could not see anything wrong (except the ARM32 port). But I'll leave it people with more expertise in this area. > > Thanks for the reviews! I applied the cleanups from @xmas92 and @RealFYang. I added the JVMCI hook for @dougxc and started bulk allocating and freeing CompiledICData to deal with the situation reported by @tschatzl. I haven't touched the ARM32 code though - waiting for @voitylov there. @fisk Sorry for the delay. I prepared a patch that addresses the review comments for ARM32. [jdk_ic_review_comments.patch](https://github.com/openjdk/jdk/files/14094946/jdk_ic_review_comments.patch) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1916323277 From avoitylov at openjdk.org Tue Jan 30 08:49:50 2024 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Tue, 30 Jan 2024 08:49:50 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v3] In-Reply-To: <8QPWOBabh_N8EE6mL15EmWC91lc_gqLS3uszsJ4gW4Y=.6fe4fed8-18cc-41ae-bd5c-6a81534d7a4f@github.com> References: <4Bqh63jS6WGdDtL3wqDZBBJkvH0TiY5vgd5mI_CQrIU=.21db910c-b979-4f9b-8749-fc99653cc670@github.com> <8QPWOBabh_N8EE6mL15EmWC91lc_gqLS3uszsJ4gW4Y=.6fe4fed8-18cc-41ae-bd5c-6a81534d7a4f@github.com> Message-ID: On Fri, 26 Jan 2024 15:12:17 GMT, Axel Boldt-Christmas wrote: >> Erik ?sterlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Whitespace fix >> >> Co-authored-by: Thomas Schatzl <59967451+tschatzl at users.noreply.github.com> > > src/hotspot/cpu/arm/compiledIC_arm.cpp line 107: > >> 105: address stub = find_stub(); >> 106: guarantee(stub != nullptr, "stub not found"); >> 107: > > The other platforms removed the trace logging 
here. If the ARM porters still want this in at least update to log the correct class name. `s/CompiledDirectStaticCall/CompiledDirectCall/` fixed by removing the trace logging. > src/hotspot/cpu/arm/sharedRuntime_arm.cpp line 631: > >> 629: >> 630: __ ic_check(1 /* end_alignment */); >> 631: __ ldr(Rmethod, Address(receiver_klass, CompiledICData::speculated_method_offset())); > > Maybe I am missing something here but this looks very wrong. The speculated `Klass*` gets loaded into `R4` (which `receiver_klass` alias) in `ic_check` this load would result in loading a `InstanceKlass*` c++ vtable pointer. > `Ricklass` (`R8` alias) contains the `CompiledICData*` . > > I would think the correct diff would be > > - const Register receiver_klass = R4; > - > - __ load_klass(receiver_klass, receiver); > - __ ldr(holder_klass, Address(Ricklass, CompiledICHolder::holder_klass_offset())); > - __ ldr(Rmethod, Address(Ricklass, CompiledICHolder::holder_metadata_offset())); > - __ cmp(receiver_klass, holder_klass); > + __ ic_check(1 /* end_alignment */); > + __ ldr(Rmethod, Address(Ricklass, CompiledICData::speculated_method_offset())); > > > The fact that you say ARM32 tests are passing makes me doubt my understanding of the inline cache. hotspot jtreg tests pass in both variants, but you finding is correct. Updated the code with your suggestion. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1470784084 PR Review Comment: https://git.openjdk.org/jdk/pull/17495#discussion_r1470785164 From aph at openjdk.org Tue Jan 30 09:01:42 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 30 Jan 2024 09:01:42 GMT Subject: RFR: 8322535: Change default AArch64 SpinPause instruction In-Reply-To: References: Message-ID: On Mon, 15 Jan 2024 16:25:08 GMT, Fredrik Bredberg wrote: > The Java options OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". 
Today the default value for OnSpinWaitInst is unfortunately "none". > > However some CPUs change the default SpinPause instruction to something better if the user hasn't used the OnSpinWaitInst option. For instance if you run a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs it also seems like "isb" is the best yielding instruction on those CPUs. > > This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb". > > Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64. > > Arm tell us that 'yield' is basically implemented as a nop. > > It is not doing much in their current designs, indeed. That's their (questionable?) implementation choice. New AmpereOne chips do however implement yield. > > Since the ISA interface gives us a yield instruction dedicated for this and at least one new chip implements it, it makes sense to me that it is the _default_ instruction, rather than none, going forward. > > Then we can continue to recognize chips that didn't implement yield and try to figure out how to deal with that awkwardness separately, and hope that vendors (indeed including ARM), start implementing the ISA, instead of having us doubling down on horrible hacks that try to quack like a yield instruction. This does have the appeal of sweet reasonableness, I agree. However, the vast majority of non-Apple parts are Arm's designs.
------------- PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1916360990 From eosterlund at openjdk.org Tue Jan 30 09:08:01 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 30 Jan 2024 09:08:01 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v6] In-Reply-To: References: Message-ID: > ICStubs solve an atomicity problem when setting both the destination and data of an inline cache. Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. > > The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. > > With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. > > I have tested the changes from tier1-7, and run through full aurora performance tests. 
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: ARM32 fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17495/files - new: https://git.openjdk.org/jdk/pull/17495/files/42a21982..6dd64b50 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17495&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17495&range=04-05 Stats: 10 lines in 2 files changed: 0 ins; 9 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17495.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17495/head:pull/17495 PR: https://git.openjdk.org/jdk/pull/17495 From eosterlund at openjdk.org Tue Jan 30 09:08:02 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 30 Jan 2024 09:08:02 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v3] In-Reply-To: <0qUYYua-ni7uWjsMme9Rw4jSI026yAHjOOeFhPFoPZs=.b90139a4-12e9-4118-92e9-c3a982b72b34@github.com> References: <4Bqh63jS6WGdDtL3wqDZBBJkvH0TiY5vgd5mI_CQrIU=.21db910c-b979-4f9b-8749-fc99653cc670@github.com> <8QPWOBabh_N8EE6mL15EmWC91lc_gqLS3uszsJ4gW4Y=.6fe4fed8-18cc-41ae-bd5c-6a81534d7a4f@github.com> <0qUYYua-ni7uWjsMme9Rw4jSI026yAHjOOeFhPFoPZs=.b90139a4-12e9-4118-92e9-c3a982b72b34@github.com> Message-ID: <9CY4JR4BUFhPtE3_6jFfGBMh5BiHpQJ7vd4PqBBBkx4=.02b1b317-952a-405c-a8a4-26aeeefba4b5@github.com> On Mon, 29 Jan 2024 13:22:43 GMT, Erik Österlund wrote: >> Just did an initial read through of the PR. Just added some cleanup suggestions. Also noticed something I thought looked wrong in the ARM32 port. >> >> I also went through and tried to find the handful of places in the codebase where the term `ICHolder` (or its derivatives) were still used. Put them in a separate branch to not clutter this PR. Would be nice to take this all the way and not have stale comments or naming lurking about.
(Also nuked the `DECC` copy-paste-typo) >> Comment cleanups: >> f1bb02ea472eb314c93d80b830c59bd03e280116 >> >> All platforms use `data` as a register alias for the `CompileICData*` register in the `ic_check`. But c2i and itable stubs still use `holder`. Maybe go all the way here? >> 5422ed32def491bd1e145959b7f3c49c88cfc50e >> >> Also for PPC and s390 I think the code is easier to understand if the global inline cache register aliases these platforms have are used. But maybe that is just me. >> 39c0a7ede5187cba52d6fcf48c0852213c48c899 >> >> As for the implementation I could not see anything wrong (except the ARM32 port). But I'll leave it people with more expertise in this area. > > Thanks for the reviews! I applied the cleanups from @xmas92 and @RealFYang. I added the JVMCI hook for @dougxc and started bulk allocating and freeing CompiledICData to deal with the situation reported by @tschatzl. I haven't touched the ARM32 code though - waiting for @voitylov there. > @fisk Sorry for the delay. I prepared a patch that addresses the review comments for ARM32. > > [jdk_ic_review_comments.patch](https://github.com/openjdk/jdk/files/14094946/jdk_ic_review_comments.patch) Thanks for the fix. I uploaded it to the PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1916371714 From aboldtch at openjdk.org Tue Jan 30 10:08:52 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 30 Jan 2024 10:08:52 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread Message-ID: The `ObjectSynchronizer` has always assumed that the `current` parameters are both the current thread as well as the thread that is doing the locking. The only time that we are entering on behalf of another thread is when doing re-locking in deoptimization. This has worked because the deoptee thread is suspended. However ResourceMarks have been using the wrong thread when logging is enabled. 
This change to `ObjectSynchronizer` instruments the relevant methods with both a `JavaThread* locking_thread` and a `[Java]Thread* current`, so that the correct thread can be used for ResourceMarks. Having the `inflate` care about a `locking_thread` is a little unpleasant in my opinion. But it is required for LM_LIGHTWEIGHT. Would probably be cleaner if the inflate for LM_LIGHTWEIGHT were its own thing, as it does not share the whole INFLATING protocol. But seems like a future RFE to refactor this code. Can reproduce a crash by modifying `test/jdk/com/sun/jdi/EATests.java` and using `-XX:DiagnoseSyncOnValueBasedClasses=2` with LM_LEGACY or running `test/jdk/com/sun/jdi/EATests.java` with LM_LIGHTWEIGHT/LM_MONITOR and `-Xlog:monitorinflation=trace`. Could extend this test to capture this regression in the future (or create a new test based on the same infrastructure). Will give it an attempt, so we have a regression test for this. But these tests get rather involved as they require a lot of JVMTI setup. ------------- Commit messages: - 8324881: ObjectSynchronizer::inflate(Thread* current...)
is invoked for non-current thread Changes: https://git.openjdk.org/jdk/pull/17626/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17626&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324881 Stats: 65 lines in 3 files changed: 39 ins; 1 del; 25 mod Patch: https://git.openjdk.org/jdk/pull/17626.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17626/head:pull/17626 PR: https://git.openjdk.org/jdk/pull/17626 From aph at openjdk.org Tue Jan 30 10:30:24 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 30 Jan 2024 10:30:24 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Tue, 30 Jan 2024 04:20:21 GMT, Jiangli Zhou wrote: > > > Maybe we could live with symbol redefinition using #define (conditionally for static linking in OpenJDK, as Coleen suggested earlier) for now, until the tooling can support symbol localizing better. Then localizing symbols using tools like `objcopy` can be the longer term and cleaner solution, instead of using namespace. What's your thoughts on that? > > > > > > I suppose so, but why? > > Why should any of this have to work on old systems? If their binutils is broken, static linking of openjdk won't work there. > > We ran into issues with older gcc on linux-aarch for partial linking, but the problem may not be older gcc only(?). At the current stage, limiting static/hermetic Java runtime support to only the platforms that support partial linking and `objcopy` seems to be overly restrictive (it does simplify the requirements significantly however :-)): > > The duplicate symbol problems are mostly found in JDK natives and have been resolved already. We've found very few symbol issues with hotspot code so far. As there are portable alternative solutions that can resolve the symbol issues in hotspot, choosing a less portable solution seems not too attractive currently. I believe this to be a mistake. 
HotSpot, by design, exports only the symbols intended for use by other components. Many of the symbol names are highly generic, and will conflict with application code. Sure, you have enough to be able to do some prototyping, but for real-world deployment you must be able to control symbol exports. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1916535268 From eosterlund at openjdk.org Tue Jan 30 10:53:51 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 30 Jan 2024 10:53:51 GMT Subject: RFR: 8324933: ConcurrentHashTable::statistics_calculate synchronization is expensive Message-ID: In the ConcurrentHashTable::statistics_calculate function, we enter and exit a ScopedCS with the global counter for every single bucket. This has shown up as pretty expensive on some machines. We should make the synchronization a bit less intense here. This patch adds simple batching so we synchronize once per 128 buckets instead of once per bucket. ------------- Commit messages: - 8324933: ConcurrentHashTable::statistics_calculate synchronization is expensive Changes: https://git.openjdk.org/jdk/pull/17629/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17629&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324933 Stats: 28 lines in 1 file changed: 16 ins; 1 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/17629.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17629/head:pull/17629 PR: https://git.openjdk.org/jdk/pull/17629 From aboldtch at openjdk.org Tue Jan 30 11:23:45 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 30 Jan 2024 11:23:45 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...)
is invoked for non-current thread [v2] In-Reply-To: References: Message-ID: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> > The `ObjectSynchronizer` has always assumed that the `current` parameters are both the current thread as well as the thread that is doing the locking. The only time that we are entering on behalf of another thread is when doing re-locking in deoptimization. This has worked because the deoptee thread is suspended. However ResourceMarks have been using the wrong thread when logging is enabled. > > This change `ObjectSynchronizer` instruments the relevant methods with both a `JavaThread* locking_thread` as well as `[Java]Thread* current` to be able to use the correct thread for ResourceMarks. > > Having the `inflate` care about a `locking_thread` is a little unpleasant in my opinion. But it is required for LM_LIGHTWEIGHT. > Would probably be cleaner if the inflate for LM_LIGHTWEIGHT was it's own thing, as it does not share the whole INFLATING protocol. But seems like a future RFE to refactor this code. > > Can reproduce a crash by modifying `test/jdk/com/sun/jdi/EATests.java` and using `-XX:DiagnoseSyncOnValueBasedClasses=2` with LM_LEGACY or running `test/jdk/com/sun/jdi/EATests.java` with LM_LIGHTWEIGHT/LM_MONITOR and `-Xlog:monitorinflation=trace`. > > Could extend this test to capture this regression in the future (or creating a new test based on the same infrastructure). Will give it an attempt, so we have a regression test for this. But these tests get rather involved as the require a lot of jvmti setup. 
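The distinction the description above draws, between the thread that executes the code and the thread on whose behalf a lock is taken, can be illustrated with a small self-contained sketch. Everything in it is a hypothetical stand-in and not HotSpot code: `Thread`, `tls_current`, `ScratchScope` (a rough analogue of something that must be tied to the executing thread, as a ResourceMark is) and `enter_for` are names invented for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal thread descriptor; not the HotSpot Thread class.
struct Thread {
  std::string name;
};

// Hypothetical record of which thread is actually executing right now.
thread_local Thread* tls_current = nullptr;

// Analogue of a per-thread scratch scope: it is only valid when it is
// created for the thread that is really running the code.
struct ScratchScope {
  explicit ScratchScope(Thread* t) { assert(t == tls_current); }
};

struct Monitor {
  Thread* owner = nullptr;
};

// Takes both threads explicitly, so neither role has to be guessed from
// the other: scratch state follows `current`, ownership follows
// `locking_thread`.
inline void enter_for(Monitor& m, Thread* locking_thread, Thread* current) {
  ScratchScope scope(current);   // tied to the executing thread
  m.owner = locking_thread;      // lock is taken on behalf of this thread
}
```

With a single `current` parameter that is silently also treated as the locking thread, the scope in this sketch would be created for the wrong thread whenever the two differ, which is the kind of mismatch the mail reports for the re-locking done during deoptimization.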
Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: Add regression test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17626/files - new: https://git.openjdk.org/jdk/pull/17626/files/8ace8410..a32104ea Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17626&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17626&range=00-01 Stats: 65 lines in 1 file changed: 65 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17626.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17626/head:pull/17626 PR: https://git.openjdk.org/jdk/pull/17626 From stuefe at openjdk.org Tue Jan 30 11:51:49 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Jan 2024 11:51:49 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible Message-ID: On x64, we always use the long form of mov immediate to load the klass base into a register. If the klass base fits into 32 bits, we could use the short form and save four instruction bytes. Before: mov uses 10 instruction bytes: 35 ;; decode_klass_not_null 36 0x00007f8b089e51c4: movabs $0x82000000,%r11 37 0x00007f8b089e51ce: add %r11,%r10 Now: mov uses 6 instruction bytes: 35 ;; decode_klass_not_null 36 0x00007fbe609e51c4: mov $0x82000000,%r11d 37 0x00007fbe609e51ca: add %r11,%r10 Note that this also works with CDS enabled and gives us the motivation to allocate low-address ranges unconditionally, even if the zero-based encoding is not possible. 
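The condition "fits into 32 bits" in the description above can be made concrete with a short sketch. It relies on one architectural fact: on x86-64, a write to a 32-bit register zero-extends into the upper half of the 64-bit register, so the short move is usable when the value fits in an unsigned 32-bit immediate. The helper names below are hypothetical, and the byte counts are simply the ones shown in the listings above for `%r11`; this is not the code from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if `v` can be materialized by a 32-bit register
// move, which zero-extends into the full 64-bit register.
constexpr bool fits_uimm32(uint64_t v) {
  return v <= UINT32_MAX;
}

// Hypothetical helper: true if `v` is representable as a sign-extended
// 32-bit immediate. Note that this is a different range than the one above.
constexpr bool fits_simm32(int64_t v) {
  return v >= INT32_MIN && v <= INT32_MAX;
}

// Instruction length in bytes for loading `v` into %r11, using the two
// encodings shown in the mail: 6 bytes for the 32-bit form, 10 bytes for
// the full 64-bit immediate form.
constexpr int mov_imm_length_r11(uint64_t v) {
  return fits_uimm32(v) ? 6 : 10;
}
```

For the base address in the listing, `0x82000000`, the unsigned check succeeds while the signed check fails, so the two checks are not interchangeable for addresses between 2 GB and 4 GB.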
---------- Tests: tier1 (GHA), tier 2 on x64 linux ------------- Commit messages: - remove obsolete comment - use-32bit-immediate-moves-on-x64-for-klass-encoding-base Changes: https://git.openjdk.org/jdk/pull/17340/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17340&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323497 Stats: 18 lines in 3 files changed: 9 ins; 1 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/17340.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17340/head:pull/17340 PR: https://git.openjdk.org/jdk/pull/17340 From stuefe at openjdk.org Tue Jan 30 11:51:49 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Jan 2024 11:51:49 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 09:09:50 GMT, Thomas Stuefe wrote: > On x64, we always use the long form of mov immediate to load the klass base into a register. If the klass base fits into 32 bits, we could use the short form and save four instruction bytes. > > Before: mov uses 10 instruction bytes: > > > 35 ;; decode_klass_not_null > 36 0x00007f8b089e51c4: movabs $0x82000000,%r11 > 37 0x00007f8b089e51ce: add %r11,%r10 > > > Now: mov uses 6 instruction bytes: > > > 35 ;; decode_klass_not_null > 36 0x00007fbe609e51c4: mov $0x82000000,%r11d > 37 0x00007fbe609e51ca: add %r11,%r10 > > > Note that this also works with CDS enabled and gives us the motivation to allocate low-address ranges unconditionally, even if the zero-based encoding is not possible. 
> > ---------- > > Tests: tier1 (GHA), tier 2 on x64 linux Still waiting for https://git.openjdk.org/jdk/pull/17343 ------------- PR Comment: https://git.openjdk.org/jdk/pull/17340#issuecomment-1914202746 From shade at openjdk.org Tue Jan 30 11:51:50 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 09:09:50 GMT, Thomas Stuefe wrote: > On x64, we always use the long form of mov immediate to load the klass base into a register. If the klass base fits into 32 bits, we could use the short form and save four instruction bytes. > > Before: mov uses 10 instruction bytes: > > > 35 ;; decode_klass_not_null > 36 0x00007f8b089e51c4: movabs $0x82000000,%r11 > 37 0x00007f8b089e51ce: add %r11,%r10 > > > Now: mov uses 6 instruction bytes: > > > 35 ;; decode_klass_not_null > 36 0x00007fbe609e51c4: mov $0x82000000,%r11d > 37 0x00007fbe609e51ca: add %r11,%r10 > > > Note that this also works with CDS enabled and gives us the motivation to allocate low-address ranges unconditionally, even if the zero-based encoding is not possible. > > ---------- > > Tests: tier1 (GHA), tier 2 on x64 linux src/hotspot/cpu/x86/assembler_x86.cpp line 13369: > 13367: #ifdef _LP64 > 13368: void Assembler::mov32_or_64(Register dst, int64_t imm) { > 13369: if ((uint64_t)imm < nth_bit(32)) { Drive-by comments: a) macro-assembler stuff like this should be in macroAssembler; b) there is `is_simm32(imm)` for checks like these; c) I did [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) recently, maybe you could just use that? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447114489 From shade at openjdk.org Tue Jan 30 11:51:50 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 09:35:13 GMT, Aleksey Shipilev wrote: >> On x64, we always use the long form of mov immediate to load the klass base into a register. If the klass base fits into 32 bits, we could use the short form and save four instruction bytes. >> >> Before: mov uses 10 instruction bytes: >> >> >> 35 ;; decode_klass_not_null >> 36 0x00007f8b089e51c4: movabs $0x82000000,%r11 >> 37 0x00007f8b089e51ce: add %r11,%r10 >> >> >> Now: mov uses 6 instruction bytes: >> >> >> 35 ;; decode_klass_not_null >> 36 0x00007fbe609e51c4: mov $0x82000000,%r11d >> 37 0x00007fbe609e51ca: add %r11,%r10 >> >> >> Note that this also works with CDS enabled and gives us the motivation to allocate low-address ranges unconditionally, even if the zero-based encoding is not possible. >> >> ---------- >> >> Tests: tier1 (GHA), tier 2 on x64 linux > > src/hotspot/cpu/x86/assembler_x86.cpp line 13369: > >> 13367: #ifdef _LP64 >> 13368: void Assembler::mov32_or_64(Register dst, int64_t imm) { >> 13369: if ((uint64_t)imm < nth_bit(32)) { > > Drive-by comments: > a) macro-assembler stuff like this should be in macroAssembler; > b) there is `is_simm32(imm)` for checks like these; > c) I did [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) recently, maybe you could just use that? 
[deleted] ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447120642 From stuefe at openjdk.org Tue Jan 30 11:51:50 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 09:40:26 GMT, Aleksey Shipilev wrote: >> src/hotspot/cpu/x86/assembler_x86.cpp line 13369: >> >>> 13367: #ifdef _LP64 >>> 13368: void Assembler::mov32_or_64(Register dst, int64_t imm) { >>> 13369: if ((uint64_t)imm < nth_bit(32)) { >> >> Drive-by comments: >> a) macro-assembler stuff like this should be in macroAssembler; >> b) there is `is_simm32(imm)` for checks like these; >> c) I did [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) recently, maybe you could just use that? > > [deleted] God you are fast. movptr looks suitable. Why did you name it "ptr" if its not for pointers? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447127069 From shade at openjdk.org Tue Jan 30 11:51:50 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 09:45:44 GMT, Thomas Stuefe wrote: >> [deleted] > > God you are fast. > > movptr looks suitable. Why did you name it "ptr" if its not for pointers? I haven't changed the name of the method. `movptr` means "move something of the width of the machine pointer", like everywhere else in assembler code. That fits your case, I think? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447131691 From stuefe at openjdk.org Tue Jan 30 11:51:50 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: Message-ID: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> On Wed, 10 Jan 2024 09:49:25 GMT, Aleksey Shipilev wrote: >> God you are fast. >> >> movptr looks suitable. Why did you name it "ptr" if its not for pointers? > > I haven't changed the name of the method. `movptr` means "move something of the width of the machine pointer", like everywhere else in assembler code. That fits your case, I think? Hmm, movptr only works its magic if the input is smaller than signed int max (2gb), and it needs one byte more than my variant. base < 2g: 35 ;; decode_klass_not_null 36 0x00007efdcc9e51c4: mov $0x27000000,%r11 37 0x00007efdcc9e51cb: add %r11,%r10 base > 2g: 35 ;; decode_klass_not_null 36 0x00007ff7a89e51c4: movabs $0x82000000,%r11 37 0x00007ff7a89e51ce: add %r11,%r10 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447140322 From qamai at openjdk.org Tue Jan 30 11:51:50 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> References: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> Message-ID: <7uVyobQMUlm5ER3En2_-VbgKdbACFSLXnFYdm4weXpc=.b0e7afcd-bb3e-412b-ad85-7136ffa5695d@github.com> On Wed, 10 Jan 2024 09:56:07 GMT, Thomas Stuefe wrote: >> I haven't changed the name of the method. 
`movptr` means "move something of the width of the machine pointer", like everywhere else in assembler code. That fits your case, I think? > > Hmm, movptr only works its magic if the input is smaller than signed int max (2gb), and it needs one byte more than my variant. > > base < 2g: > > > 35 ;; decode_klass_not_null > 36 0x00007efdcc9e51c4: mov $0x27000000,%r11 > 37 0x00007efdcc9e51cb: add %r11,%r10 > > > base > 2g: > > > 35 ;; decode_klass_not_null > 36 0x00007ff7a89e51c4: movabs $0x82000000,%r11 > 37 0x00007ff7a89e51ce: add %r11,%r10 FYI the logic for immediate matching is: if (is_uimm32(imm)) { movl(dst, imm); } else if (is_simm32(imm)) { movq(dst, imm); } else { mov64(dst, imm); } The reason is that `movl` is smaller than `movq` in code size. Maybe we can change `movptr` to this. I hope that I do not miss anything here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447161579 From shade at openjdk.org Tue Jan 30 11:51:50 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: <7uVyobQMUlm5ER3En2_-VbgKdbACFSLXnFYdm4weXpc=.b0e7afcd-bb3e-412b-ad85-7136ffa5695d@github.com> References: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> <7uVyobQMUlm5ER3En2_-VbgKdbACFSLXnFYdm4weXpc=.b0e7afcd-bb3e-412b-ad85-7136ffa5695d@github.com> Message-ID: On Wed, 10 Jan 2024 10:14:03 GMT, Quan Anh Mai wrote: >> Hmm, movptr only works its magic if the input is smaller than signed int max (2gb), and it needs one byte more than my variant. 
>> >> base < 2g: >> >> >> 35 ;; decode_klass_not_null >> 36 0x00007efdcc9e51c4: mov $0x27000000,%r11 >> 37 0x00007efdcc9e51cb: add %r11,%r10 >> >> >> base > 2g: >> >> >> 35 ;; decode_klass_not_null >> 36 0x00007ff7a89e51c4: movabs $0x82000000,%r11 >> 37 0x00007ff7a89e51ce: add %r11,%r10 > > FYI the logic for immediate matching is: > > if (is_uimm32(imm)) { > movl(dst, imm); > } else if (is_simm32(imm)) { > movq(dst, imm); > } else { > mov64(dst, imm); > } > > The reason is that `movl` is smaller than `movq` in code size. > > Maybe we can change `movptr` to this. I hope that I do not miss anything here. Yeah, current `movptr` is sign-extending 32->64, which is not extra-efficient for 32-bit unsigned imms with highest bit set. I think we can indeed check for `is_uimm32` in `movptr` to cover that case too. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447171865 From shade at openjdk.org Tue Jan 30 11:51:50 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> <7uVyobQMUlm5ER3En2_-VbgKdbACFSLXnFYdm4weXpc=.b0e7afcd-bb3e-412b-ad85-7136ffa5695d@github.com> Message-ID: On Wed, 10 Jan 2024 10:23:02 GMT, Aleksey Shipilev wrote: >> FYI the logic for immediate matching is: >> >> if (is_uimm32(imm)) { >> movl(dst, imm); >> } else if (is_simm32(imm)) { >> movq(dst, imm); >> } else { >> mov64(dst, imm); >> } >> >> The reason is that `movl` is smaller than `movq` in code size. >> >> Maybe we can change `movptr` to this. I hope that I do not miss anything here. > > Yeah, current `movptr` is sign-extending 32->64, which is not extra-efficient for 32-bit unsigned imms with highest bit set. I think we can indeed check for `is_uimm32` in `movptr` to cover that case too. 
This seems to work:

    diff --git a/src/hotspot/cpu/x86/macroAssembler_x86.cpp b/src/hotspot/cpu/x86/macroAssembler_x86.cpp
    index 88296656485..ba4b089c7aa 100644
    --- a/src/hotspot/cpu/x86/macroAssembler_x86.cpp
    +++ b/src/hotspot/cpu/x86/macroAssembler_x86.cpp
    @@ -2568,7 +2568,9 @@ void MacroAssembler::movptr(Register dst, Address src) {
     // src should NEVER be a real pointer. Use AddressLiteral for true pointers
     void MacroAssembler::movptr(Register dst, intptr_t src) {
     #ifdef _LP64
    -  if (is_simm32(src)) {
    +  if (is_uimm32(src)) {
    +    movl(dst, checked_cast(src));
    +  } else if (is_simm32(src)) {
         movq(dst, checked_cast(src));
       } else {
         mov64(dst, src);
    diff --git a/src/hotspot/share/asm/assembler.hpp b/src/hotspot/share/asm/assembler.hpp
    index 7b7dbd4ede7..a533b963844 100644
    --- a/src/hotspot/share/asm/assembler.hpp
    +++ b/src/hotspot/share/asm/assembler.hpp
    @@ -359,6 +359,7 @@ class AbstractAssembler : public ResourceObj {
       }
     
       static bool is_uimm12(uint64_t x) { return is_uimm(x, 12); }
    +  static bool is_uimm32(uint64_t x) { return is_uimm(x, 32); }
     
       // Accessors
       CodeSection* code_section() const { return _code_section; }

Does that work with all narrow klass bases now, @tstuefe?

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447200469 From shade at openjdk.org Tue Jan 30 11:51:50 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> <7uVyobQMUlm5ER3En2_-VbgKdbACFSLXnFYdm4weXpc=.b0e7afcd-bb3e-412b-ad85-7136ffa5695d@github.com> Message-ID: On Wed, 10 Jan 2024 10:47:35 GMT, Aleksey Shipilev wrote: >> Yeah, current `movptr` is sign-extending 32->64, which is not extra-efficient for 32-bit unsigned imms with highest bit set. I think we can indeed check for `is_uimm32` in `movptr` to cover that case too.
> > This seems to work: > > > diff --git a/src/hotspot/cpu/x86/macroAssembler_x86.cpp b/src/hotspot/cpu/x86/macroAssembler_x86.cpp > index 88296656485..ba4b089c7aa 100644 > --- a/src/hotspot/cpu/x86/macroAssembler_x86.cpp > +++ b/src/hotspot/cpu/x86/macroAssembler_x86.cpp > @@ -2568,7 +2568,9 @@ void MacroAssembler::movptr(Register dst, Address src) { > // src should NEVER be a real pointer. Use AddressLiteral for true pointers > void MacroAssembler::movptr(Register dst, intptr_t src) { > #ifdef _LP64 > - if (is_simm32(src)) { > + if (is_uimm32(src)) { > + movl(dst, checked_cast(src)); > + } else if (is_simm32(src)) { > movq(dst, checked_cast(src)); > } else { > mov64(dst, src); > diff --git a/src/hotspot/share/asm/assembler.hpp b/src/hotspot/share/asm/assembler.hpp > index 7b7dbd4ede7..a533b963844 100644 > --- a/src/hotspot/share/asm/assembler.hpp > +++ b/src/hotspot/share/asm/assembler.hpp > @@ -359,6 +359,7 @@ class AbstractAssembler : public ResourceObj { > } > > static bool is_uimm12(uint64_t x) { return is_uimm(x, 12); } > + static bool is_uimm32(uint64_t x) { return is_uimm(x, 32); } > > // Accessors > CodeSection* code_section() const { return _code_section; } > > > Does that work with all narrow klass bases now, @tstuefe? That change seems to hold water on its own, saving quite a bit of code. Filed: [JDK-8323503](https://bugs.openjdk.org/browse/JDK-8323503). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447221010 From stuefe at openjdk.org Tue Jan 30 11:51:50 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> <7uVyobQMUlm5ER3En2_-VbgKdbACFSLXnFYdm4weXpc=.b0e7afcd-bb3e-412b-ad85-7136ffa5695d@github.com> Message-ID: <-8ulfVGVzYoXW18mnfuFxhZKWkW9YszZyFGuK8jHbmc=.5fca00d4-5d62-4d3b-ba04-f5fe6fbe0b17@github.com> On Wed, 10 Jan 2024 11:01:00 GMT, Aleksey Shipilev wrote: >> This seems to work: >> >> >> diff --git a/src/hotspot/cpu/x86/macroAssembler_x86.cpp b/src/hotspot/cpu/x86/macroAssembler_x86.cpp >> index 88296656485..ba4b089c7aa 100644 >> --- a/src/hotspot/cpu/x86/macroAssembler_x86.cpp >> +++ b/src/hotspot/cpu/x86/macroAssembler_x86.cpp >> @@ -2568,7 +2568,9 @@ void MacroAssembler::movptr(Register dst, Address src) { >> // src should NEVER be a real pointer. Use AddressLiteral for true pointers >> void MacroAssembler::movptr(Register dst, intptr_t src) { >> #ifdef _LP64 >> - if (is_simm32(src)) { >> + if (is_uimm32(src)) { >> + movl(dst, checked_cast(src)); >> + } else if (is_simm32(src)) { >> movq(dst, checked_cast(src)); >> } else { >> mov64(dst, src); >> diff --git a/src/hotspot/share/asm/assembler.hpp b/src/hotspot/share/asm/assembler.hpp >> index 7b7dbd4ede7..a533b963844 100644 >> --- a/src/hotspot/share/asm/assembler.hpp >> +++ b/src/hotspot/share/asm/assembler.hpp >> @@ -359,6 +359,7 @@ class AbstractAssembler : public ResourceObj { >> } >> >> static bool is_uimm12(uint64_t x) { return is_uimm(x, 12); } >> + static bool is_uimm32(uint64_t x) { return is_uimm(x, 32); } >> >> // Accessors >> CodeSection* code_section() const { return _code_section; } >> >> >> Does that work with all narrow klass bases now, @tstuefe? 
> > That change seems to hold water on its own, saving quite a bit of code. Filed: [JDK-8323503](https://bugs.openjdk.org/browse/JDK-8323503). @shipilev I like this. I'll wait for your patch to go in. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447358646 From shade at openjdk.org Tue Jan 30 11:51:50 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: <-8ulfVGVzYoXW18mnfuFxhZKWkW9YszZyFGuK8jHbmc=.5fca00d4-5d62-4d3b-ba04-f5fe6fbe0b17@github.com> References: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> <7uVyobQMUlm5ER3En2_-VbgKdbACFSLXnFYdm4weXpc=.b0e7afcd-bb3e-412b-ad85-7136ffa5695d@github.com> <-8ulfVGVzYoXW18mnfuFxhZKWkW9YszZyFG uK8jHbmc=.5fca00d4-5d62-4d3b-ba04-f5fe6fbe0b17@github.com> Message-ID: <03VyCix7YEqESyJV0KEmOD4acCr1ZBkvxwkBPaqBrCM=.d42eee52-160f-481e-9f1f-9a54de45d70e@github.com> On Wed, 10 Jan 2024 13:03:39 GMT, Thomas Stuefe wrote: >> That change seems to hold water on its own, saving quite a bit of code. Filed: [JDK-8323503](https://bugs.openjdk.org/browse/JDK-8323503). > > @shipilev I like this. I'll wait for your patch to go in. I think it would be better to move to `movptr` here and start testing it. I suspect there are a few places in Hotspot where we count the actual instruction _size_ for some stuff, and even the `mov64` -> `movptr` rewrite without `movl` optimization would highlight it. I agree the final version should be tested when JDK-8323503 is in. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447506558 From shade at openjdk.org Tue Jan 30 11:51:50 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: <03VyCix7YEqESyJV0KEmOD4acCr1ZBkvxwkBPaqBrCM=.d42eee52-160f-481e-9f1f-9a54de45d70e@github.com> References: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> <7uVyobQMUlm5ER3En2_-VbgKdbACFSLXnFYdm4weXpc=.b0e7afcd-bb3e-412b-ad85-7136ffa5695d@github.com> <-8ulfVGVzYoXW18mnfuFxhZKWkW9YszZyFG uK8jHbmc=.5fca00d4-5d62-4d3b-ba04-f5fe6fbe0b17@github.com> <03VyCix7YEqESyJV0KEmOD4acCr1ZBkvxwkBPaqBrCM=.d42eee52-160f-481e-9f1f-9a54de45d70e@github.com> Message-ID: On Wed, 10 Jan 2024 15:01:18 GMT, Aleksey Shipilev wrote: >> @shipilev I like this. I'll wait for your patch to go in. > > I think it would be better to move to `movptr` here and start testing it. I suspect there are a few places in Hotspot where we count the actual instruction _size_ for some stuff, and even the `mov64` -> `movptr` rewrite without `movl` optimization would highlight it. I agree the final version should be tested when JDK-8323503 is in. > FYI the logic for immediate matching is: @merykitty, could you point me where that logic is? Cannot find it in current Hotspot sources, and would like to reference it in my PR. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447773214 From qamai at openjdk.org Tue Jan 30 11:51:50 2024 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> <7uVyobQMUlm5ER3En2_-VbgKdbACFSLXnFYdm4weXpc=.b0e7afcd-bb3e-412b-ad85-7136ffa5695d@github.com> <-8ulfVGVzYoXW18mnfuFxhZKWkW9YszZyFG uK8jHbmc=.5fca00d4-5d62-4d3b-ba04-f5fe6fbe0b17@github.com> <03VyCix7YEqESyJV0KEmOD4acCr1ZBkvxwkBPaqBrCM=.d42eee52-160f-481e-9f1f-9a54de45d70e@github.com> Message-ID: On Wed, 10 Jan 2024 18:29:29 GMT, Aleksey Shipilev wrote: >> I think it would be better to move to `movptr` here and start testing it. I suspect there are a few places in Hotspot where we count the actual instruction _size_ for some stuff, and even the `mov64` -> `movptr` rewrite without `movl` optimization would highlight it. I agree the final version should be tested when JDK-8323503 is in. > >> FYI the logic for immediate matching is: > > @merykitty, could you point me where that logic is? Cannot find it in current Hotspot sources, and would like to reference it in my PR. @shipilev It is just the code representation of the matching, the corresponding nodes are `loadConUL32`, `loadConL32` and `loadConL` and the matcher uses node costs to sort out the priority. 
https://github.com/openjdk/jdk/blob/c1282b57f50002edd08c93aed784390cca83b9b8/src/hotspot/cpu/x86/x86_64.ad#L4807 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447785611 From shade at openjdk.org Tue Jan 30 11:51:50 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> <7uVyobQMUlm5ER3En2_-VbgKdbACFSLXnFYdm4weXpc=.b0e7afcd-bb3e-412b-ad85-7136ffa5695d@github.com> <-8ulfVGVzYoXW18mnfuFxhZKWkW9YszZyFG uK8jHbmc=.5fca00d4-5d62-4d3b-ba04-f5fe6fbe0b17@github.com> <03VyCix7YEqESyJV0KEmOD4acCr1ZBkvxwkBPaqBrCM=.d42eee52-160f-481e-9f1f-9a54de45d70e@github.com> Message-ID: On Wed, 10 Jan 2024 18:41:44 GMT, Quan Anh Mai wrote: >>> FYI the logic for immediate matching is: >> >> @merykitty, could you point me where that logic is? Cannot find it in current Hotspot sources, and would like to reference it in my PR. > > @shipilev It is just the code representation of the matching, the corresponding nodes are `loadConUL32`, `loadConL32` and `loadConL` and the matcher uses node costs to sort out the priority. > > https://github.com/openjdk/jdk/blob/c1282b57f50002edd08c93aed784390cca83b9b8/src/hotspot/cpu/x86/x86_64.ad#L4807 Oh, subtle. Thanks! 
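For readers following along, the three-way immediate match described in this thread can be sketched outside HotSpot roughly as below. This is only an illustration under stated assumptions, not the HotSpot or matcher code: `MovForm`, `pick_mov_form`, and the local `fits_uimm32`/`fits_simm32` helpers are hypothetical names that merely mirror the `is_uimm32`/`is_simm32` checks mentioned above.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the three encodings discussed in the thread.
enum class MovForm {
  Movl32,  // 32-bit move; the upper half of the 64-bit register is zeroed
  Movq32,  // 64-bit move with a sign-extended 32-bit immediate
  Mov64    // 64-bit move with a full 64-bit immediate (movabs)
};

// True if x is representable as an unsigned 32-bit value.
static bool fits_uimm32(int64_t x) {
  return x >= 0 && x <= INT64_C(0xFFFFFFFF);
}

// True if x is representable as a signed 32-bit value.
static bool fits_simm32(int64_t x) {
  return x >= INT32_MIN && x <= INT32_MAX;
}

// Checks the cheapest form first, in the same order as the snippet
// quoted in the thread: unsigned 32-bit, then signed 32-bit, then 64-bit.
static MovForm pick_mov_form(int64_t imm) {
  if (fits_uimm32(imm)) return MovForm::Movl32;
  if (fits_simm32(imm)) return MovForm::Movq32;
  return MovForm::Mov64;
}
```

The order matters: a small positive value satisfies both 32-bit checks, so testing the unsigned case first is what lets the shorter `movl` form win.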
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447788810 From stuefe at openjdk.org Tue Jan 30 11:51:50 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> <7uVyobQMUlm5ER3En2_-VbgKdbACFSLXnFYdm4weXpc=.b0e7afcd-bb3e-412b-ad85-7136ffa5695d@github.com> <-8ulfVGVzYoXW18mnfuFxhZKWkW9YszZyFG uK8jHbmc=.5fca00d4-5d62-4d3b-ba04-f5fe6fbe0b17@github.com> <03VyCix7YEqESyJV0KEmOD4acCr1ZBkvxwkBPaqBrCM=.d42eee52-160f-481e-9f1f-9a54de45d70e@github.com> Message-ID: On Thu, 11 Jan 2024 11:29:04 GMT, Thomas Stuefe wrote: >> Oh, subtle. Thanks! > > With the new movptr variant, things look good: > > < 2g > > > ;; decode_klass_not_null > 0x00007fcb608681c4: mov $0x18000000,%r11d > 0x00007fcb608681c4: 41 bb 00 00 00 18 > 0x00007fcb608681ca: add %r11,%r10 > 0x00007fcb608681ca: 4d 03 d3 > > > < 4g > > > ;; decode_klass_not_null > 0x00007f382c8681c4: mov $0x85000000,%r11d > 0x00007f382c8681c4: 41 bb 00 00 00 85 > 0x00007f382c8681ca: add %r11,%r10 > 0x00007f382c8681ca: 4d 03 d3 > > >> 4g > > > ;; decode_klass_not_null > 0x00007fd8908681c4: movabs $0x7fd7b6000000,%r11 > 0x00007fd8908681c4: 49 bb 00 00 00 b6 d7 7f 00 00 > 0x00007fd8908681ce: add %r11,%r10 > 0x00007fd8908681ce: 4d 03 d3 > > > I assume if the 32-bit mov variants were to use non-Rxx registers, the REX prefix could also be omitted? Giving us another byte back. > > I'll run tests with the new variant. BTW, I think it would be useful for PrintAssembly to also optionally print the raw bytes in addition to the decoded instruction. It certainly helps me understanding stuff. Do you agree? If yes, I'll prepare a patch. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1448708141 From stuefe at openjdk.org Tue Jan 30 11:51:50 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Jan 2024 11:51:50 GMT Subject: RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: <_FLlMLBU4zkxG6w5aXIfSijAtCQ_YvInW45FVdoLl1Q=.079e54d7-b0e2-4aa0-8d08-175e5db48ae5@github.com> <7uVyobQMUlm5ER3En2_-VbgKdbACFSLXnFYdm4weXpc=.b0e7afcd-bb3e-412b-ad85-7136ffa5695d@github.com> <-8ulfVGVzYoXW18mnfuFxhZKWkW9YszZyFGuK8jHbmc=.5fca00d4-5d62-4d3b-ba04-f5fe6fbe0b17@github.com> <03VyCix7YEqESyJV0KEmOD4acCr1ZBkvxwkBPaqBrCM=.d42eee52-160f-481e-9f1f-9a54de45d70e@github.com> Message-ID: On Wed, 10 Jan 2024 18:44:57 GMT, Aleksey Shipilev wrote: >> @shipilev It is just the code representation of the matching, the corresponding nodes are `loadConUL32`, `loadConL32` and `loadConL` and the matcher uses node costs to sort out the priority. >> >> https://github.com/openjdk/jdk/blob/c1282b57f50002edd08c93aed784390cca83b9b8/src/hotspot/cpu/x86/x86_64.ad#L4807 > > Oh, subtle. Thanks!

With the new movptr variant, things look good:

< 2g

    ;; decode_klass_not_null
    0x00007fcb608681c4: mov $0x18000000,%r11d
    0x00007fcb608681c4: 41 bb 00 00 00 18
    0x00007fcb608681ca: add %r11,%r10
    0x00007fcb608681ca: 4d 03 d3

< 4g

    ;; decode_klass_not_null
    0x00007f382c8681c4: mov $0x85000000,%r11d
    0x00007f382c8681c4: 41 bb 00 00 00 85
    0x00007f382c8681ca: add %r11,%r10
    0x00007f382c8681ca: 4d 03 d3

> 4g

    ;; decode_klass_not_null
    0x00007fd8908681c4: movabs $0x7fd7b6000000,%r11
    0x00007fd8908681c4: 49 bb 00 00 00 b6 d7 7f 00 00
    0x00007fd8908681ce: add %r11,%r10
    0x00007fd8908681ce: 4d 03 d3

I assume if the 32-bit mov variants were to use non-Rxx registers, the REX prefix could also be omitted? Giving us another byte back.

I'll run tests with the new variant.
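As a back-of-the-envelope check of the byte counts shown in this thread, the encoded length of a register-immediate move can be modelled as below. This is a rough sketch assuming the standard x86-64 encodings (`B8+rd` with a 32-bit or 64-bit immediate, and `REX.W C7 /0` with a sign-extended 32-bit immediate); `mov_imm_length` is a hypothetical helper and not a HotSpot function.

```cpp
#include <cstdint>

// Hypothetical helper: encoded length in bytes of "mov reg, imm" on x86-64.
// reg_num is the hardware register number (0-15); r8-r15 need a REX prefix
// just to name the register.
static int mov_imm_length(int reg_num, int64_t imm) {
  const bool needs_rex_b = reg_num >= 8;
  if (imm >= 0 && imm <= INT64_C(0xFFFFFFFF)) {
    // mov r32, imm32: optional REX.B + opcode (B8+rd) + 4 immediate bytes
    return (needs_rex_b ? 1 : 0) + 1 + 4;
  }
  if (imm >= INT32_MIN && imm <= INT32_MAX) {
    // mov r64, simm32: REX.W + opcode C7 + ModRM + 4 immediate bytes
    return 1 + 1 + 1 + 4;
  }
  // movabs r64, imm64: REX.W + opcode (B8+rd) + 8 immediate bytes
  return 1 + 1 + 8;
}
```

Under these assumptions, `%r11d` (register number 11) gives 6 bytes for the 32-bit form and 10 bytes for `movabs`, matching the listings above, while a legacy register such as `%eax` (register number 0) would need only 5 bytes for the 32-bit form.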
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1448706776 From iklam at openjdk.org Tue Jan 30 12:04:41 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 30 Jan 2024 12:04:41 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Tue, 30 Jan 2024 10:27:08 GMT, Andrew Haley wrote: > > > > Maybe we could live with symbol redefinition using #define (conditionally for static linking in OpenJDK, as Coleen suggested earlier) for now, until the tooling can support symbol localizing better. Then localizing symbols using tools like `objcopy` can be the longer term and cleaner solution, instead of using namespace. What's your thoughts on that? > > > > > > > > > I suppose so, but why? > > > Why should any of this have to work on old systems? If their binutils is broken, static linking of openjdk won't work there. > > > > > > We ran into issues with older gcc on linux-aarch for partial linking, but the problem may not be older gcc only(?). At the current stage, limiting static/hermetic Java runtime support to only the platforms that support partial linking and `objcopy` seems to be overly restrictive (it does simplify the requirements significantly however :-)): > > The duplicate symbol problems are mostly found in JDK natives and have been resolved already. We've found very few symbol issues with hotspot code so far. As there are portable alternative solutions that can resolve the symbol issues in hotspot, choosing a less portable solution seems not too attractive currently. > > I believe this to be a mistake. HotSpot, by design, exports only the symbols intended for use by other components. Many of the symbol names are highly generic, and will conflict with application code. > > Sure, you have enough to be able to do some prototyping, but for real-world deployment you must be able to control symbol exports. I agree with Andrew. 
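To make the `#define` approach concrete, here is a minimal standalone sketch of what a `-DThread=HSThread` style rename does. It assumes nothing about the real HotSpot build: the macro is written inline instead of being passed on the command line, and both classes are hypothetical placeholders.

```cpp
#include <string>

// Simulates compiling HotSpot-like code with -DThread=HSThread.
#define Thread HSThread

// Because of the macro, this actually declares a class named HSThread.
class Thread {
 public:
  static std::string name() { return "hotspot thread"; }
};

#undef Thread

// Application-like code can now declare its own Thread without a clash.
class Thread {
 public:
  static std::string name() { return "application thread"; }
};
```

The rename is purely textual, so every other use of the identifier `Thread` in the same translation unit is rewritten as well, and each further conflicting name (`Symbol`, and so on) needs its own macro. That is the whack-a-mole concern raised above.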
We don't want the perfect to be the enemy of the good. The only "perfect" solution is putting the HotSpot code in a namespace. This is going to be a huge undertaking. I don't think we have enough interest in the OpenJDK community to make such a change now. I think partial linking with objcopy is a clean solution that's good enough for the actual use cases. If someone wants to use `#define`, they can just make a local branch and add a few `#define` lines in their globalDefinitions.hpp. I suspect the configure script also allows adding C compiler options like `-DThread=HSThread`. `#define` is going to be a whack-a-mole hack. Google may need to isolate the `Thread` symbol, but other people may need to isolate things like `Symbol`, etc. It's not a good idea to add arbitrary `#define` in the HotSpot source code just because someone doesn't like it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1916692309 From aph at openjdk.org Tue Jan 30 12:15:41 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 30 Jan 2024 12:15:41 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: <2YEod9zlzMWNdSu52nPlE8lvTcCwuxilXMluDpHbf7Y=.79994b5e-36e1-44ef-84f3-c995a135096b@github.com> On Tue, 30 Jan 2024 12:01:56 GMT, Ioi Lam wrote: > The only "perfect" solution is putting the HotSpot code in a namespace. This is going to be a huge undertaking. I don't think we have enough interest in the OpenJDK community to make such a change now. I don't think that putting all of the HotSpot code in a namespace. At least, I hope not: it'll mess up debugging so much that it'll be intolerable, IMO, and there will be other side effects. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1916711165 From iklam at openjdk.org Tue Jan 30 12:50:41 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 30 Jan 2024 12:50:41 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: <2YEod9zlzMWNdSu52nPlE8lvTcCwuxilXMluDpHbf7Y=.79994b5e-36e1-44ef-84f3-c995a135096b@github.com> References: <2YEod9zlzMWNdSu52nPlE8lvTcCwuxilXMluDpHbf7Y=.79994b5e-36e1-44ef-84f3-c995a135096b@github.com> Message-ID: On Tue, 30 Jan 2024 12:13:00 GMT, Andrew Haley wrote: > > The only "perfect" solution is putting the HotSpot code in a namespace. This is going to be a huge undertaking. I don't think we have enough interest in the OpenJDK community to make such a change now. > > I don't think that putting all of the HotSpot code in a namespace. At least, I hope not: it'll mess up debugging so much that it'll be intolerable, IMO, and there will be other side effects. I forgot to qualify "perfect" only in the sense of isolating the HotSpot symbols. It's obviously not perfect at all in other aspects. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1916772980 From rrich at openjdk.org Tue Jan 30 13:37:22 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Tue, 30 Jan 2024 13:37:22 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread [v2] In-Reply-To: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> References: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> Message-ID: On Tue, 30 Jan 2024 11:23:45 GMT, Axel Boldt-Christmas wrote: >> The `ObjectSynchronizer` has always assumed that the `current` parameters are both the current thread as well as the thread that is doing the locking. 
The only time that we are entering on behalf of another thread is when doing re-locking in deoptimization. This has worked because the deoptee thread is suspended. However ResourceMarks have been using the wrong thread when logging is enabled. >> >> This change `ObjectSynchronizer` instruments the relevant methods with both a `JavaThread* locking_thread` as well as `[Java]Thread* current` to be able to use the correct thread for ResourceMarks. >> >> Having the `inflate` care about a `locking_thread` is a little unpleasant in my opinion. But it is required for LM_LIGHTWEIGHT. >> Would probably be cleaner if the inflate for LM_LIGHTWEIGHT was it's own thing, as it does not share the whole INFLATING protocol. But seems like a future RFE to refactor this code. >> >> Can reproduce a crash by modifying `test/jdk/com/sun/jdi/EATests.java` and using `-XX:DiagnoseSyncOnValueBasedClasses=2` with LM_LEGACY or running `test/jdk/com/sun/jdi/EATests.java` with LM_LIGHTWEIGHT/LM_MONITOR and `-Xlog:monitorinflation=trace`. >> >> Could extend this test to capture this regression in the future (or creating a new test based on the same infrastructure). Will give it an attempt, so we have a regression test for this. But these tests get rather involved as the require a lot of jvmti setup. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Add regression test Thanks for fixing this. There are more occurrences of 'current' at synchronizer.cpp:1433 and synchronizer.cpp:1495 that you should change. Besides that, the changes look good to me. I'll put the change through our CI. Results will arrive tomorrow. Thanks, Richard.
------------- PR Review: https://git.openjdk.org/jdk/pull/17626#pullrequestreview-1851289975 From mbaesken at openjdk.org Tue Jan 30 14:30:32 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 30 Jan 2024 14:30:32 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v5] In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 15:32:47 GMT, Severin Gehwolf wrote: > In a containerized environment with some memory limit this could potentially return a large value for `free_swap_space()`, and a small(er) value for `total_swap_space()`, i.e. `total_swap_space() < free_swap_space()`. Yes, this is what we saw in our tests last night (after my change). > Please return `-1` if the containerized value is not supported. Okay, I can do this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17581#discussion_r1471322500 From pchilanomate at openjdk.org Tue Jan 30 16:16:50 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 30 Jan 2024 16:16:50 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: On Tue, 30 Jan 2024 02:08:54 GMT, Dean Long wrote: >> Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: >> >> ThreadsListHandle required for Handshake > > My understanding is that monitor chunks are temporary native heap storage for BasicObjectLock records that are being moved from compiled frames to interpreter frames. So the answer is, they will be found in the new interpreter frames that deoptimization pushes, assuming the monitors are not inflated in the process. @dean-long will we actually encounter these native addresses when looking at a monitor owner?
Because the value of `adr` that we pass to `JavaThread::is_lock_owned()` is either the address that we read from the markword for the stack-locked case, or the value of the _owner field for the inflated monitor case. If I look at `Deoptimization::relock_objects()`, the BasicLock* that we use to relock the monitor should be a valid stack address. Then when we "move" the monitors from the stack to the monitor chunk in [1], `BasicLock::move_to()` will inflate the lock, but I don't see that we are using the native destination address. Am I missing something here? In other words, do we even need to traverse these monitor chunks? [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/vframeArray.cpp#L90 ------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1917301675 From coleenp at openjdk.org Tue Jan 30 16:17:04 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 30 Jan 2024 16:17:04 GMT Subject: RFR: 8324861: Exceptions::wrap_dynamic_exception() doesn't have ResourceMark In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 18:34:02 GMT, Leonid Mesnik wrote: > The issue is reproduced with > make -- run-test JTREG_VERBOSE=all JTREG_RETAIN=all TEST=runtime/ConstantPool/TestMethodHandleConstant.java TEST_VM_OPTS="-Xlog:all=trace:file=vm.%p.log" > > verified that it doesn't crash anymore. Also, run tier1 for sanity testing. This looks good. I thought we decided that all of the print_on(outputStream*) functions should not have a ResourceMark so that their callers can resource allocate a stream to print to. Maybe that changed with stringStream and logging (?) ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17620#pullrequestreview-1851455231 From coleenp at openjdk.org Tue Jan 30 16:17:47 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 30 Jan 2024 16:17:47 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...)
is invoked for non-current thread [v2] In-Reply-To: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> References: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> Message-ID: On Tue, 30 Jan 2024 11:23:45 GMT, Axel Boldt-Christmas wrote: >> The `ObjectSynchronizer` has always assumed that the `current` parameters are both the current thread as well as the thread that is doing the locking. The only time that we are entering on behalf of another thread is when doing re-locking in deoptimization. This has worked because the deoptee thread is suspended. However, ResourceMarks have been using the wrong thread when logging is enabled. >> >> This change instruments the relevant `ObjectSynchronizer` methods with both a `JavaThread* locking_thread` as well as `[Java]Thread* current` to be able to use the correct thread for ResourceMarks. >> >> Having the `inflate` care about a `locking_thread` is a little unpleasant in my opinion. But it is required for LM_LIGHTWEIGHT. >> Would probably be cleaner if the inflate for LM_LIGHTWEIGHT was its own thing, as it does not share the whole INFLATING protocol. But seems like a future RFE to refactor this code. >> >> Can reproduce a crash by modifying `test/jdk/com/sun/jdi/EATests.java` and using `-XX:DiagnoseSyncOnValueBasedClasses=2` with LM_LEGACY or running `test/jdk/com/sun/jdi/EATests.java` with LM_LIGHTWEIGHT/LM_MONITOR and `-Xlog:monitorinflation=trace`. >> >> Could extend this test to capture this regression in the future (or create a new test based on the same infrastructure). Will give it an attempt, so we have a regression test for this. But these tests get rather involved as they require a lot of JVMTI setup.
> > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Add regression test src/hotspot/share/runtime/synchronizer.cpp line 1323: > 1321: locking_thread = JavaThread::cast(current); > 1322: } > 1323: return inflate(locking_thread, current, object, cause); This looks strange passing locking_thread as nullptr. Why not unconditionally make it current? How can it ever be null? edit: I see, it's guarded by is_lock_owned(). And you want "locking_thread" to be a JavaThread* not Thread* (another source of confusion). This still looks odd. Maybe locking_thread should be: locking_thread = current->is_Java_thread() ? JavaThread::cast(current) : nullptr; Then the LM_LEGACY path makes sense also? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1471433890 From mbaesken at openjdk.org Tue Jan 30 16:17:49 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 30 Jan 2024 16:17:49 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v6] In-Reply-To: References: Message-ID: > Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. > > Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. > PhysicalMemory could be enhanced or a new event added. > > There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and > Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: some Linux container related adjustments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17581/files - new: https://git.openjdk.org/jdk/pull/17581/files/e3bcb12a..ca81d40a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=04-05 Stats: 28 lines in 2 files changed: 15 ins; 9 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/17581.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17581/head:pull/17581 PR: https://git.openjdk.org/jdk/pull/17581 From coleenp at openjdk.org Tue Jan 30 16:17:47 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 30 Jan 2024 16:17:47 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread [v2] In-Reply-To: References: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> Message-ID: On Tue, 30 Jan 2024 15:23:27 GMT, Coleen Phillimore wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: >> >> Add regression test > > src/hotspot/share/runtime/synchronizer.cpp line 1323: > >> 1321: locking_thread = JavaThread::cast(current); >> 1322: } >> 1323: return inflate(locking_thread, current, object, cause); > > This looks strange passing locking_thread as nullptr. Why not unconditionally make it current? How can it ever be null? > > edit: I see, it's guarded by is_lock_owned(). And you want "locking_thread" to be a JavaThread* not Thread* (another source of confusion). This still looks odd. Maybe locking_thread should be: > > locking_thread = current->is_Java_thread() ? JavaThread::cast(current) : nullptr; > > Then the LM_LEGACY path makes sense also? hm, LEGACY doesn't use locking_thread. 
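The ternary suggested in the review comment above can be tried in isolation with the minimal sketch below. `Thread`, `JavaThread` and `locking_thread_for` are simplified stand-ins assumed purely for illustration; they are not the HotSpot definitions, and the real `inflate` signature is whatever the PR settles on.

```cpp
#include <cassert>

// Hypothetical stand-ins for HotSpot's Thread and JavaThread. The real
// classes are far larger; only the type test used by the suggestion is kept.
struct Thread {
  virtual ~Thread() = default;
  virtual bool is_Java_thread() const { return false; }
};

struct JavaThread : Thread {
  bool is_Java_thread() const override { return true; }

  // Checked downcast, mirroring the intent of JavaThread::cast().
  static JavaThread* cast(Thread* t) {
    assert(t != nullptr && t->is_Java_thread());
    return static_cast<JavaThread*>(t);
  }
};

// The expression proposed in the review: a Java thread is its own locking
// thread, any other kind of thread yields nullptr.
// (locking_thread_for is a name invented for this sketch.)
inline JavaThread* locking_thread_for(Thread* current) {
  return current->is_Java_thread() ? JavaThread::cast(current) : nullptr;
}
```

With this shape a caller passes `locking_thread_for(current)` and never has to special-case non-Java threads itself; whether that reads better than the guarded assignment in the patch is the judgment call under discussion.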
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1471454856 From coleenp at openjdk.org Tue Jan 30 16:17:47 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 30 Jan 2024 16:17:47 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread [v2] In-Reply-To: References: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> Message-ID: On Tue, 30 Jan 2024 15:33:50 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/synchronizer.cpp line 1323: >> >>> 1321: locking_thread = JavaThread::cast(current); >>> 1322: } >>> 1323: return inflate(locking_thread, current, object, cause); >> >> This looks strange passing locking_thread as nullptr. Why not unconditionally make it current? How can it ever be null? >> >> edit: I see, it's guarded by is_lock_owned(). And you want "locking_thread" to be a JavaThread* not Thread* (another source of confusion). This still looks odd. Maybe locking_thread should be: >> >> locking_thread = current->is_Java_thread() ? JavaThread::cast(current) : nullptr; >> >> Then the LM_LEGACY path makes sense also? > > hm, LEGACY doesn't use locking_thread. Maybe this would make more sense to me as Thread* current staying the first parameter, and the second parameter is fast_locking_thread which can be nullptr. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1471456883 From epeter at openjdk.org Tue Jan 30 17:15:53 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 30 Jan 2024 17:15:53 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant In-Reply-To: References: Message-ID: On Fri, 26 Jan 2024 22:09:27 GMT, Vladimir Kozlov wrote: > When we fail re-allocate scalarized object during deoptimization we unlock all monitors in affected frames and throw OOM exception [JDK-6898462](https://bugs.openjdk.org/browse/JDK-6898462). 
> The unlocking was done in incorrect order starting from outermost monitor which cause this assert when we unlock following nested monitor (the same object) - it sees that it was already unlocked. > > The fix is to start unlocking from most nested/inner monitor. > I also noticed that we have incorrect order of frames for re-locking during deoptimization. We should start from outermost frame. Inside frame re-locking order is correct - from outermost monitor. > > Added regression test with deep nested locks. > Ran tier1-5, Xcomp, stress testing. src/hotspot/share/runtime/deoptimization.cpp line 395: > 393: #endif // !PRODUCT > 394: // Start locking from outermost/oldest frame > 395: for (int i = (chunk->length() - 1); i >= 0 ; i--) { Suggestion: for (int i = (chunk->length() - 1); i >= 0; i--) { ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17600#discussion_r1471641008 From kvn at openjdk.org Tue Jan 30 18:09:28 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 30 Jan 2024 18:09:28 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant [v2] In-Reply-To: References: Message-ID: On Tue, 30 Jan 2024 17:13:08 GMT, Emanuel Peter wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix spacing > > src/hotspot/share/runtime/deoptimization.cpp line 395: > >> 393: #endif // !PRODUCT >> 394: // Start locking from outermost/oldest frame >> 395: for (int i = (chunk->length() - 1); i >= 0 ; i--) { > > Suggestion: > > for (int i = (chunk->length() - 1); i >= 0; i--) { Done. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17600#discussion_r1471714989 From kvn at openjdk.org Tue Jan 30 18:09:27 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 30 Jan 2024 18:09:27 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant [v2] In-Reply-To: References: Message-ID: > When we fail to re-allocate a scalarized object during deoptimization, we unlock all monitors in the affected frames and throw an OOM exception [JDK-6898462](https://bugs.openjdk.org/browse/JDK-6898462). > The unlocking was done in the wrong order, starting from the outermost monitor, which causes this assert when we unlock the following nested monitor (the same object) - it sees that it was already unlocked. > > The fix is to start unlocking from the most nested/inner monitor. > I also noticed that we have the wrong order of frames for re-locking during deoptimization. We should start from the outermost frame. Inside a frame the re-locking order is correct - from the outermost monitor. > > Added a regression test with deeply nested locks. > Ran tier1-5, Xcomp, stress testing.
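Why the unlock order matters for nested locks on the same object can be shown with a small toy model. Everything below (`LockRecord`, `ToyMonitor`, `lock_nested`, `unlock_all`) is a hypothetical simplification assumed for illustration, not HotSpot code: the outermost record owns the lock, and nested records are only valid while it is still held.

```cpp
#include <cassert>
#include <vector>

// Toy model (an assumption for illustration): one record per nested
// lock of the same object. Only the outermost record is non-recursive.
struct LockRecord {
  bool recursive;
};

struct ToyMonitor {
  bool held = false;
};

// Builds `depth` nested records on one monitor, outermost first.
inline std::vector<LockRecord> lock_nested(ToyMonitor& m, int depth) {
  std::vector<LockRecord> records;
  for (int i = 0; i < depth; i++) {
    records.push_back(LockRecord{m.held});  // first entry is non-recursive
    m.held = true;
  }
  return records;
}

// Returns false when a recursive record is released after the lock itself
// is already gone, which is the situation the invariant check guards against.
inline bool unlock_one(ToyMonitor& m, const LockRecord& r) {
  if (r.recursive) {
    return m.held;
  }
  m.held = false;
  return true;
}

// Releases every record, either innermost-first or outermost-first, and
// reports whether the invariant held for all of them.
inline bool unlock_all(ToyMonitor& m, const std::vector<LockRecord>& records,
                       bool innermost_first) {
  bool ok = true;
  const int n = static_cast<int>(records.size());
  for (int k = 0; k < n; k++) {
    const int i = innermost_first ? (n - 1 - k) : k;
    ok = unlock_one(m, records[i]) && ok;
  }
  return ok;
}
```

In this model, releasing three nested records innermost-first succeeds, while outermost-first trips the check on the second record, which matches the failure mode the description attributes to the old ordering.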
Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Fix spacing ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17600/files - new: https://git.openjdk.org/jdk/pull/17600/files/3ede1f35..9f501294 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17600&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17600&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17600.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17600/head:pull/17600 PR: https://git.openjdk.org/jdk/pull/17600 From egahlin at openjdk.org Tue Jan 30 19:18:33 2024 From: egahlin at openjdk.org (Erik Gahlin) Date: Tue, 30 Jan 2024 19:18:33 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v6] In-Reply-To: References: Message-ID: On Tue, 30 Jan 2024 16:17:49 GMT, Matthias Baesken wrote: >> Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. >> >> Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. >> PhysicalMemory could be enhanced or a new event added. >> >> There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and >> Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > some Linux container related adjustments JFR part looks good. Could use try-with-resources in the test, i.e try (Recording r = ...). Have not reviewed OS related code. ------------- Marked as reviewed by egahlin (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17581#pullrequestreview-1852152646 From jiangli at openjdk.org Tue Jan 30 19:40:48 2024 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 30 Jan 2024 19:40:48 GMT Subject: RFR: 8311846: Resolve duplicate 'Thread' related symbols with JDK static linking In-Reply-To: References: Message-ID: On Wed, 17 Jan 2024 00:14:58 GMT, Jiangli Zhou wrote: > Please review this PR with a simple solution for resolving the duplicate `Thread` symbol issue. In the https://github.com/openjdk/jdk/pull/14808 comments, there was an alternative suggestion to redefine the symbol at build time, such as using `-DThread=HotSpotThread`. That would not address issues when symbols are referenced as string literals. https://github.com/openjdk/jdk/pull/14808 also discussed using a namespace for hotspot code, which can have multiple benefits/motivations. We could explore using a namespace further once there is more consensus on that approach. > > Contributed by Chuck Rasbold and @jianglizhou. We (@AlanBateman, @cushon, @magicus, @jerboaa, @pron, @jianglizhou) discussed this topic via zoom as part of our regular static/hermetic Java discussions. The outcome favors using partial-linking/objcopy to localize symbols for hotspot. Here is a summary: - A general solution is preferred over resolving symbol issues case by case. - We can address this for unix-like platforms with tooling that supports partial-linking/objcopy for now. @magicus will provide additional information on supported gcc versions and considerations for Windows support. - There is also a preference to localize symbols automatically without editing the symbol list manually. In our prototype for handling freetype symbols (as mentioned in https://github.com/openjdk/jdk/pull/14808#issuecomment-1631611220), @cjmoon1 looked into using `nm` to generate the symbol list and feed that into `objcopy`. That might be doable for hotspot symbols.
------------- PR Comment: https://git.openjdk.org/jdk/pull/17456#issuecomment-1917753387 From epeter at openjdk.org Tue Jan 30 20:24:25 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 30 Jan 2024 20:24:25 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant [v2] In-Reply-To: References: Message-ID: <1Js199KTgm1ZgVpbdIkolI3MGxnNCwSGGqlc509MuX4=.170c5ddc-15e4-4346-b638-4e895d5a036f@github.com> On Tue, 30 Jan 2024 18:09:27 GMT, Vladimir Kozlov wrote: >> When we fail re-allocate scalarized object during deoptimization we unlock all monitors in affected frames and throw OOM exception [JDK-6898462](https://bugs.openjdk.org/browse/JDK-6898462). >> The unlocking was done in incorrect order starting from outermost monitor which cause this assert when we unlock following nested monitor (the same object) - it sees that it was already unlocked. >> >> The fix is to start unlocking from most nested/inner monitor. >> I also noticed that we have incorrect order of frames for re-locking during deoptimization. We should start from outermost frame. Inside frame re-locking order is correct - from outermost monitor. >> >> Added regression test with deep nested locks. >> Ran tier1-5, Xcomp, stress testing. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Fix spacing Looks reasonable. test/hotspot/jtreg/compiler/escapeAnalysis/TestNestedRelockAtDeopt.java line 30: > 28: * @requires vm.compMode != "Xint" > 29: * @run main/othervm -XX:-TieredCompilation -XX:-BackgroundCompilation -Xmx128M > 30: * -XX:CompileCommand=exclude,TestNestedRelockAtDeopt::main TestNestedRelockAtDeopt Might it make sense to also have a run without flags, so that outside flags have more of an effect, and can trigger other shapes? ------------- Marked as reviewed by epeter (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17600#pullrequestreview-1852259363 PR Review Comment: https://git.openjdk.org/jdk/pull/17600#discussion_r1471887329 From dlong at openjdk.org Tue Jan 30 20:38:32 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 30 Jan 2024 20:38:32 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant [v2] In-Reply-To: References: Message-ID: <2BHoPW_UCbqXbUvmCC0ccBqfcNpGtzVi7v1hi--qRH8=.e58c9a9f-6c4f-496a-b3b4-c2b65856e684@github.com> On Tue, 30 Jan 2024 18:09:27 GMT, Vladimir Kozlov wrote: >> When we fail re-allocate scalarized object during deoptimization we unlock all monitors in affected frames and throw OOM exception [JDK-6898462](https://bugs.openjdk.org/browse/JDK-6898462). >> The unlocking was done in incorrect order starting from outermost monitor which cause this assert when we unlock following nested monitor (the same object) - it sees that it was already unlocked. >> >> The fix is to start unlocking from most nested/inner monitor. >> I also noticed that we have incorrect order of frames for re-locking during deoptimization. We should start from outermost frame. Inside frame re-locking order is correct - from outermost monitor. >> >> Added regression test with deep nested locks. >> Ran tier1-5, Xcomp, stress testing. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Fix spacing Marked as reviewed by dlong (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/17600#pullrequestreview-1852305382 From kvn at openjdk.org Tue Jan 30 21:22:32 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 30 Jan 2024 21:22:32 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant [v2] In-Reply-To: <1Js199KTgm1ZgVpbdIkolI3MGxnNCwSGGqlc509MuX4=.170c5ddc-15e4-4346-b638-4e895d5a036f@github.com> References: <1Js199KTgm1ZgVpbdIkolI3MGxnNCwSGGqlc509MuX4=.170c5ddc-15e4-4346-b638-4e895d5a036f@github.com> Message-ID: <0uNvtYQdaLoJMcGoC3IwpoSQnZjKT4Inm7DqmHTYNyk=.332ab6a5-298c-47ad-8085-49bcf38bb61b@github.com> On Tue, 30 Jan 2024 20:10:18 GMT, Emanuel Peter wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix spacing > > test/hotspot/jtreg/compiler/escapeAnalysis/TestNestedRelockAtDeopt.java line 30: > >> 28: * @requires vm.compMode != "Xint" >> 29: * @run main/othervm -XX:-TieredCompilation -XX:-BackgroundCompilation -Xmx128M >> 30: * -XX:CompileCommand=exclude,TestNestedRelockAtDeopt::main TestNestedRelockAtDeopt > > Might it make sense to also have a run without flags, so that outside flags have more of an effect, and can trigger other shapes? Not with these changes. I want to keep this test to only verify that the issue is fixed. I will run additional testing with default flags but for this push I want to keep the tests as it is. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17600#discussion_r1471976601 From dlong at openjdk.org Tue Jan 30 22:10:40 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 30 Jan 2024 22:10:40 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: <7p5HrBndOFNCb9jcKdZa9kCkzhPuQVXm-TsCRTRmBmM=.e250e2cd-0fa6-47d3-a3ec-4bd92792e16c@github.com> References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> <7p5HrBndOFNCb9jcKdZa9kCkzhPuQVXm-TsCRTRmBmM=.e250e2cd-0fa6-47d3-a3ec-4bd92792e16c@github.com> Message-ID: On Tue, 30 Jan 2024 04:34:14 GMT, David Holmes wrote: >> Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: >> >> ThreadsListHandle required for Handshake > > Okay so anything looking at monitor_chunks is looking at a moving target. They have no idea what stage of moving from compiled to interpreted frames has been reached. So examining monitor_chunks just seems inherently unsafe and totally misguided. On the other hand if you want to know about all monitors then you need to know whether this deopt is in progress or not, and prevent it from starting or wait for it to finish. > > But I also don't see how we examine monitors that are still in compiled frames? `is_lock_owned` does not consider them. > > ??? This seems completely broken. @dholmes-ora said: > But I also don't see how we examine monitors that are still in compiled frames? `is_lock_owned` does not consider them. JavaThread::is_lock_owned() calls Thread::is_lock_owned() first to check if the lock record is on the native stack. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1917977156 From dlong at openjdk.org Tue Jan 30 22:27:16 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 30 Jan 2024 22:27:16 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> <7p5HrBndOFNCb9jcKdZa9kCkzhPuQVXm-TsCRTRmBmM=.e250e2cd-0fa6-47d3-a3ec-4bd92792e16c@github.com> Message-ID: On Tue, 30 Jan 2024 22:07:15 GMT, Dean Long wrote: >> Okay so anything looking at monitor_chunks is looking at a moving target. They have no idea what stage of moving from compiled to interpreted frames has been reached. So examining monitor_chunks just seems inherently unsafe and totally misguided. On the other hand if you want to know about all monitors then you need to know whether this deopt is in progress or not, and prevent it from starting or wait for it to finish. >> >> But I also don't see how we examine monitors that are still in compiled frames? `is_lock_owned` does not consider them. >> >> ??? This seems completely broken. > > @dholmes-ora said: > >> But I also don't see how we examine monitors that are still in compiled frames? `is_lock_owned` does not consider them. > > JavaThread::is_lock_owned() calls Thread::is_lock_owned() first to check if the lock record is on the native stack. > @dean-long will be actually encounter this native addresses when looking at a monitor owner? Because the value of `adr` that we pass to `JavaThread::is_lock_owned()` is either the address that we read from the markword for the stack-locked case, or the value of the _owner field for the inflated monitor case. If I see `Deoptimization::relock_objects()`, the BasicLock* that we use to relock the monitor should be a valid stack address. 
Then when we "move" the monitors from the stack to the monitor chunk in [1], `BasicLock::move_to()` will inflate the lock but I don't see we are using the native destination address. Am I missing something here? In other words, do we even need to traverse this monitor chunks? BasicLock::move_to() doesn't always inflate the lock. First the lock record gets moved from the compiled frame to monitor chunk in `fill_in`, then from the monitor chunk to the interpreter frame in `unpack_on_stack`. Assuming the monitor is not inflated during the move, and lacking additional synchronization, the code calling JavaThread::is_lock_owned() could have read the mark word while the monitor record was moved to the monitor chunks, right? But it's still racy. If the BasicLock gets moved to the interpreter after `JavaThread::is_lock_owned()` has already called `Thread::is_lock_owned()`, then it will return the wrong answer. JavaThread::is_lock_owned is not prepared to deal with the lock moving. That's why I said we might need a better solution. 
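The interleaving described above can be replayed deterministically with a toy model. `RecordLocation`, `ToyThread` and the templated `is_lock_owned` below are hypothetical names assumed for this sketch; they only mimic the two-step shape (stack check first, monitor chunks second) and say nothing about whether the real code can observe such a move, which is the question the thread is debating.

```cpp
#include <cassert>

// Where a lock record lives while deoptimization moves it (toy model).
enum class RecordLocation { CompiledFrame, MonitorChunk, InterpreterFrame };

struct ToyThread {
  RecordLocation record = RecordLocation::CompiledFrame;

  // Step 1 of the check: is the record on the thread's stack?
  bool owned_on_stack() const {
    return record == RecordLocation::CompiledFrame ||
           record == RecordLocation::InterpreterFrame;
  }

  // Step 2 of the check: is the record in a monitor chunk?
  bool owned_in_chunks() const {
    return record == RecordLocation::MonitorChunk;
  }
};

// Two-step ownership check. `between` runs after step 1 and before step 2
// to simulate the owning thread moving the record concurrently.
template <typename Between>
bool is_lock_owned(ToyThread& t, Between between) {
  if (t.owned_on_stack()) {
    return true;
  }
  between(t);
  return t.owned_in_chunks();
}
```

If the record sits in a monitor chunk and nothing moves, the check answers true. If the hook moves the record to the interpreter frame between the two steps, the check answers false even though the thread owned the lock the whole time, which is the wrong answer the message warns about.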
------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1918002042 From kvn at openjdk.org Tue Jan 30 22:57:13 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 30 Jan 2024 22:57:13 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant [v2] In-Reply-To: <0uNvtYQdaLoJMcGoC3IwpoSQnZjKT4Inm7DqmHTYNyk=.332ab6a5-298c-47ad-8085-49bcf38bb61b@github.com> References: <1Js199KTgm1ZgVpbdIkolI3MGxnNCwSGGqlc509MuX4=.170c5ddc-15e4-4346-b638-4e895d5a036f@github.com> <0uNvtYQdaLoJMcGoC3IwpoSQnZjKT4Inm7DqmHTYNyk=.332ab6a5-298c-47ad-8085-49bcf38bb61b@github.com> Message-ID: On Tue, 30 Jan 2024 21:20:13 GMT, Vladimir Kozlov wrote: >> test/hotspot/jtreg/compiler/escapeAnalysis/TestNestedRelockAtDeopt.java line 30: >> >>> 28: * @requires vm.compMode != "Xint" >>> 29: * @run main/othervm -XX:-TieredCompilation -XX:-BackgroundCompilation -Xmx128M >>> 30: * -XX:CompileCommand=exclude,TestNestedRelockAtDeopt::main TestNestedRelockAtDeopt >> >> Might it make sense to also have a run without flags, so that outside flags have more of an effect, and can trigger other shapes? > > Not with these changes. I want to keep this test to only verify that the issue is fixed. > I will run additional testing with default flags but for this push I want to keep the test as it is. Testing with default flags caught another issue. We compile the `main()` method as OSR, and when we deoptimize due to OOM in `test2` the `catch` is not executed and the test exits with: Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main" I would like to file a separate bug for it and push the current changes without the test modification.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17600#discussion_r1472074591 From pchilanomate at openjdk.org Tue Jan 30 23:33:01 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 30 Jan 2024 23:33:01 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> <7p5HrBndOFNCb9jcKdZa9kCkzhPuQVXm-TsCRTRmBmM=.e250e2cd-0fa6-47d3-a3ec-4bd92792e16c@github.com> Message-ID: <0_kBWgR-HSz2EpQjvXucVsPjXKu5Nee4t-CoxVFAqcI=.40762f68-3b11-4164-8fe8-8f6caa36c238@github.com> On Tue, 30 Jan 2024 22:22:35 GMT, Dean Long wrote: > > @dean-long will be actually encounter this native addresses when looking at a monitor owner? Because the value of `adr` that we pass to `JavaThread::is_lock_owned()` is either the address that we read from the markword for the stack-locked case, or the value of the _owner field for the inflated monitor case. If I see `Deoptimization::relock_objects()`, the BasicLock* that we use to relock the monitor should be a valid stack address. Then when we "move" the monitors from the stack to the monitor chunk in [1], `BasicLock::move_to()` will inflate the lock but I don't see we are using the native destination address. Am I missing something here? In other words, do we even need to traverse this monitor chunks? > > BasicLock::move_to() doesn't always inflate the lock. > > First the lock record gets moved from the compiled frame to monitor chunk in `fill_in`, then from the monitor chunk to the interpreter frame in `unpack_on_stack`. Assuming the monitor is not inflated during the move, and lacking additional synchronization, the code calling JavaThread::is_lock_owned() could have read the mark word while the monitor record was moved to the monitor chunks, right? > > But it's still racy. 
If the BasicLock gets moved to the interpreter after `JavaThread::is_lock_owned()` has already called `Thread::is_lock_owned()`, then it will return the wrong answer. JavaThread::is_lock_owned is not prepared to deal with the lock moving. That's why I said we might need a better solution. > Right, but are these moves from compiled frame->monitor chunk->interpreter frame actually visible to someone looking for the monitor's owner? `BasicLock::move_to()` doesn't change the markword of an object to point to a different BasicLock*. If anything it will inflate (no-op for the recursive case), but a monitor changing from stack-locked to inflated is something expected and should not be an issue. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1918088112 From lmesnik at openjdk.org Tue Jan 30 23:59:04 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 30 Jan 2024 23:59:04 GMT Subject: RFR: 8324861: Exceptions::wrap_dynamic_exception() doesn't have ResourceMark In-Reply-To: References: Message-ID: <8s1wzdFD2FzG5nDWYdBIf-Qhq9eoAu7YyhvNLlQRvgw=.26a31271-0049-4dc5-9078-d4699ddb9659@github.com> On Mon, 29 Jan 2024 18:34:02 GMT, Leonid Mesnik wrote: > The issue is reproduced with > make -- run-test JTREG_VERBOSE=all JTREG_RETAIN=all TEST=runtime/ConstantPool/TestMethodHandleConstant.java TEST_VM_OPTS="-Xlog:all=trace:file=vm.%p.log" > > verified that it doesn't crash anymore. Also, run tier1 for sanity testing. @dholmes-ora, @coleenp Thank you for the review. As far as I know, currently the caller should create the ResourceMark when printing. (Probably to combine them for different logging functions?)
------------- PR Comment: https://git.openjdk.org/jdk/pull/17620#issuecomment-1918110213 From lmesnik at openjdk.org Tue Jan 30 23:59:04 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 30 Jan 2024 23:59:04 GMT Subject: Integrated: 8324861: Exceptions::wrap_dynamic_exception() doesn't have ResourceMark In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 18:34:02 GMT, Leonid Mesnik wrote: > The issue is reproduced with > make -- run-test JTREG_VERBOSE=all JTREG_RETAIN=all TEST=runtime/ConstantPool/TestMethodHandleConstant.java TEST_VM_OPTS="-Xlog:all=trace:file=vm.%p.log" > > verified that it doesn't crash anymore. Also, run tier1 for sanity testing. This pull request has now been integrated. Changeset: 7d1a4880 Author: Leonid Mesnik URL: https://git.openjdk.org/jdk/commit/7d1a48807a482cd19156298ce21d9492f0d912da Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod 8324861: Exceptions::wrap_dynamic_exception() doesn't have ResourceMark Reviewed-by: dholmes, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/17620 From kvn at openjdk.org Wed Jan 31 00:05:01 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 31 Jan 2024 00:05:01 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant [v2] In-Reply-To: References: <1Js199KTgm1ZgVpbdIkolI3MGxnNCwSGGqlc509MuX4=.170c5ddc-15e4-4346-b638-4e895d5a036f@github.com> <0uNvtYQdaLoJMcGoC3IwpoSQnZjKT4Inm7DqmHTYNyk=.332ab6a5-298c-47ad-8085-49bcf38bb61b@github.com> Message-ID: On Tue, 30 Jan 2024 22:48:50 GMT, Vladimir Kozlov wrote: >> Not with these changes. I want to keep this test to only verify that the issue is fixed. >> I will run additional testing with default flags but for this push I want to keep the tests as it is. > > Testing with default flags caught an other issue. 
We compile `main()` method as OSR and when we deoptimize due to OOM in `test2` the `catch` is not executed and test exit with: > > Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main" > > > I would like to file separate bug for it and push current changes without the test modification. Filed [JDK-8325003](https://bugs.openjdk.org/browse/JDK-8325003) Running testing again without test modification. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17600#discussion_r1472137946 From dlong at openjdk.org Wed Jan 31 00:26:13 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 31 Jan 2024 00:26:13 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: <0_kBWgR-HSz2EpQjvXucVsPjXKu5Nee4t-CoxVFAqcI=.40762f68-3b11-4164-8fe8-8f6caa36c238@github.com> References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> <7p5HrBndOFNCb9jcKdZa9kCkzhPuQVXm-TsCRTRmBmM=.e250e2cd-0fa6-47d3-a3ec-4bd92792e16c@github.com> <0_kBWgR-HSz2EpQjvXucVsPjXKu5Nee4t-CoxVFAqcI=.40762f68-3b11-4164-8fe8-8f6caa36c238@github.com> Message-ID: On Tue, 30 Jan 2024 23:30:28 GMT, Patricio Chilano Mateo wrote: > BasicLock::move_to() doesn't change the markword Good point, I missed that. So where does the markword get updated if the object was stack-locked? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1918140106 From dholmes at openjdk.org Wed Jan 31 01:40:02 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 31 Jan 2024 01:40:02 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) 
is invoked for non-current thread [v2] In-Reply-To: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> References: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> Message-ID: On Tue, 30 Jan 2024 11:23:45 GMT, Axel Boldt-Christmas wrote: >> The `ObjectSynchronizer` has always assumed that the `current` parameters are both the current thread as well as the thread that is doing the locking. The only time that we are entering on behalf of another thread is when doing re-locking in deoptimization. This has worked because the deoptee thread is suspended. However ResourceMarks have been using the wrong thread when logging is enabled. >> >> This change `ObjectSynchronizer` instruments the relevant methods with both a `JavaThread* locking_thread` as well as `[Java]Thread* current` to be able to use the correct thread for ResourceMarks. >> >> Having the `inflate` care about a `locking_thread` is a little unpleasant in my opinion. But it is required for LM_LIGHTWEIGHT. >> Would probably be cleaner if the inflate for LM_LIGHTWEIGHT was it's own thing, as it does not share the whole INFLATING protocol. But seems like a future RFE to refactor this code. >> >> Can reproduce a crash by modifying `test/jdk/com/sun/jdi/EATests.java` and using `-XX:DiagnoseSyncOnValueBasedClasses=2` with LM_LEGACY or running `test/jdk/com/sun/jdi/EATests.java` with LM_LIGHTWEIGHT/LM_MONITOR and `-Xlog:monitorinflation=trace`. >> >> Could extend this test to capture this regression in the future (or creating a new test based on the same infrastructure). Will give it an attempt, so we have a regression test for this. But these tests get rather involved as the require a lot of jvmti setup. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Add regression test Many thanks for picking this up @xmas92 . 
I'm very frustrated that this code got broken the way it has been. Given there are only two options here: actual current thread or else a JVMTI suspended thread we are processing on-behalf of, I'm very tempted to introduce a new API for the latter case to use e.g.: `inflate_for` and `ObjectMonitor::set_owner_to`. When dealing with the suspended thread only very limited situations are actually possible so we don't need all the possible cases to be checked in `inflate` or `enter` - it can just be asserted what state the object/monitor must be in. We can then revert inflate/enter to always and only, act on the current thread. I think the resulting code would be much simpler to understand overall. src/hotspot/share/runtime/synchronizer.cpp line 521: > 519: } > 520: > 521: void ObjectSynchronizer::enter(Handle obj, BasicLock* lock, JavaThread* locking_thread, JavaThread* current) { Do we actually need to pass in the current thread? What is it used for - ResourceMarks? src/hotspot/share/runtime/synchronizer.cpp line 1312: > 1310: // Can be called from non JavaThreads (e.g., VMThread) for FastHashCode > 1311: // calculations as part of JVM/TI tagging. > 1312: static bool is_lock_owned(JavaThread* locking_thread, oop obj) { The parameter does not need renaming here, we are asking if some thread is the owner, it is not trying to lock anything. Also you've invalidated the comment by making this take a JavaThread instead of Thread. ------------- Changes requested by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17626#pullrequestreview-1852729256 PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1472196825 PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1472193929 From dholmes at openjdk.org Wed Jan 31 01:40:02 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 31 Jan 2024 01:40:02 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) 
is invoked for non-current thread [v2] In-Reply-To: References: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> Message-ID: On Tue, 30 Jan 2024 15:34:54 GMT, Coleen Phillimore wrote: >> hm, LEGACY doesn't use locking_thread. > > Maybe this would make more sense to me as Thread* current staying the first parameter, and the second parameter is fast_locking_thread which can be nullptr. I agree this doesn't quite makes sense to me. `locking_thread` is really `presumed_owner_thread` - we aren't trying to lock here, we are inflating an already locked object, where the passed in thread is the owner. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1472198565 From amitkumar at openjdk.org Wed Jan 31 03:00:00 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 31 Jan 2024 03:00:00 GMT Subject: RFR: 8315762: Update subtype check profile collection on s390x following 8308869 [v2] In-Reply-To: References: <67cqHtFszZQRHZGhqR7LThot4PciYc1tznqzMTV348s=.1c6acb58-170d-44e8-a9ab-171be82f160a@github.com> Message-ID: On Fri, 26 Jan 2024 16:42:26 GMT, Lutz Schmidt wrote: > Please ensure proper testing, as Martin requested. Running tests with -XX:TierStopAtLevel=3 (effectively turning off C2) might be a good idea. I performed the test again and saw few failures with & without my patch. When I checked the log, they were either failing due to 1. timeout 2. result: Failed. Execution failed: `main' threw exception: java.lang.VirtualMachineError: Out of space in CodeCache for method handle intrinsic Should all test pass even with `-XX:TieredStopAtLevel=3`, Because these were failing in master branch as well ? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17461#issuecomment-1918292149 From pchilanomate at openjdk.org Wed Jan 31 04:02:01 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 31 Jan 2024 04:02:01 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> <7p5HrBndOFNCb9jcKdZa9kCkzhPuQVXm-TsCRTRmBmM=.e250e2cd-0fa6-47d3-a3ec-4bd92792e16c@github.com> <0_kBWgR-HSz2EpQjvXucVsPjXKu5Nee4t-CoxVFAqcI=.40762f68-3b11-4164-8fe8-8f6caa36c238@github.com> Message-ID: <5TRxQYNay1X0sIr3H9Qhs7WXxojQYElGmhqVkruWYuI=.12eca720-abbd-4d7b-9754-0fe36f6cf46a@github.com> On Wed, 31 Jan 2024 00:23:38 GMT, Dean Long wrote: > > BasicLock::move_to() doesn't change the markword > > Good point, I missed that. So where does the markword get updated if the object was stack-locked? > You mean in this BasicLock::move_to() call? If it was stack-locked, inflation will change the markword to point to the created ObjectMonitor (https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/synchronizer.cpp#L1471). The _owner field of that inflated monitor will contain the BasicLock* that was previously stored in the markword. 
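The point made above, that inflating a stack-locked object redirects the mark word to the monitor while the monitor's owner field keeps the original BasicLock address, can be illustrated with a small model. This is a sketch under simplifying assumptions, not the HotSpot implementation: all `Toy*` types and `toy_*` functions are hypothetical, and the real mark word encodes its state in tag bits rather than in a separate enum.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical lock record living in a frame on the owning thread's stack.
struct ToyBasicLock {
  uintptr_t displaced_header = 0;
};

// Hypothetical inflated monitor. Right after inflation of a stack-locked
// object, owner holds the address of the lock record, not a thread.
struct ToyMonitor {
  const void* owner = nullptr;
};

enum class ToyLockState { Unlocked, StackLocked, Inflated };

// Hypothetical object header: an explicit state plus the pointer the
// mark word would carry in that state.
struct ToyObject {
  ToyLockState state = ToyLockState::Unlocked;
  ToyBasicLock* stack_lock = nullptr;  // valid when StackLocked
  ToyMonitor* monitor = nullptr;       // valid when Inflated
};

// Hypothetical thread with a stack modelled as a range of lock records.
struct ToyThread {
  const ToyBasicLock* stack_lo;
  const ToyBasicLock* stack_hi;

  bool is_on_stack(const void* adr) const {
    uintptr_t a = reinterpret_cast<uintptr_t>(adr);
    return a >= reinterpret_cast<uintptr_t>(stack_lo) &&
           a <  reinterpret_cast<uintptr_t>(stack_hi);
  }
};

void toy_stack_lock(ToyObject& obj, ToyBasicLock* lock) {
  assert(obj.state == ToyLockState::Unlocked);
  obj.state = ToyLockState::StackLocked;
  obj.stack_lock = lock;
}

// Inflation: the mark now refers to the monitor, and the monitor's owner
// field takes over the lock record address that the mark used to hold.
ToyMonitor* toy_inflate(ToyObject& obj, ToyMonitor& storage) {
  if (obj.state == ToyLockState::Inflated) {
    return obj.monitor;
  }
  if (obj.state == ToyLockState::StackLocked) {
    storage.owner = obj.stack_lock;
  }
  obj.stack_lock = nullptr;
  obj.monitor = &storage;
  obj.state = ToyLockState::Inflated;
  return obj.monitor;
}

// The address an ownership query would examine for this object.
const void* toy_owner_address(const ToyObject& obj) {
  switch (obj.state) {
    case ToyLockState::StackLocked: return obj.stack_lock;
    case ToyLockState::Inflated:    return obj.monitor->owner;
    default:                        return nullptr;
  }
}
```

In this model the address seen by an ownership query is the same stack address before and after inflation, which matches the argument that a transition from stack-locked to inflated is an expected, harmless change for someone looking up the owner.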
------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1918338152 From mdoerr at openjdk.org Wed Jan 31 05:57:02 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 31 Jan 2024 05:57:02 GMT Subject: RFR: 8315762: Update subtype check profile collection on s390x following 8308869 [v2] In-Reply-To: <67cqHtFszZQRHZGhqR7LThot4PciYc1tznqzMTV348s=.1c6acb58-170d-44e8-a9ab-171be82f160a@github.com> References: <67cqHtFszZQRHZGhqR7LThot4PciYc1tznqzMTV348s=.1c6acb58-170d-44e8-a9ab-171be82f160a@github.com> Message-ID: On Thu, 25 Jan 2024 15:20:59 GMT, Amit Kumar wrote:
>> s390x Implementation for https://github.com/openjdk/jdk/pull/14375
>>
>> Benchmark Result with patch:
>>
>> Benchmark                                         (typePollution)  (typePollutionNotInternalType)   Mode  Cnt     Score     Error   Units
>> RequireNonNullCheckcastScalability.isDuplicated1            false                           false  thrpt   20  1155.409 ±  43.844  ops/us
>> RequireNonNullCheckcastScalability.isDuplicated1            false                            true  thrpt   20   726.923 ±  54.536  ops/us
>> RequireNonNullCheckcastScalability.isDuplicated1             true                           false  thrpt   20   676.462 ±  23.503  ops/us
>> RequireNonNullCheckcastScalability.isDuplicated1             true                            true  thrpt   20   118.650 ±   2.653  ops/us
>>
>> Without Patch:
>>
>> Benchmark                                         (typePollution)  (typePollutionNotInternalType)   Mode  Cnt     Score     Error   Units
>> RequireNonNullCheckcastScalability.isDuplicated1            false                           false  thrpt   20  1101.248 ± 103.559  ops/us
>> RequireNonNullCheckcastScalability.isDuplicated1            false                            true  thrpt   20   109.690 ±   3.312  ops/us
>> RequireNonNullCheckcastScalability.isDuplicated1             true                           false  thrpt   20   110.790 ±   7.927  ops/us
>> RequireNonNullCheckcastScalability.isDuplicated1             true                            true  thrpt   20   112.244 ±   6.889  ops/us
>>
>> Testing : Fastdebug build + tier1 tests
>
> Amit Kumar has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains two additional commits since the last revision: > > - Merge branch 'master' of https://git.openjdk.org/jdk into subtype_v0 > - s390 Port Such kind of test errors are expected when using -XX:TieredStopAtLevel=3. So, if you have also run the jtreg test suite with normal settings, that should be ok. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17461#issuecomment-1918435552 From thartmann at openjdk.org Wed Jan 31 06:08:01 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 31 Jan 2024 06:08:01 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant [v2] In-Reply-To: References: Message-ID: <_y9COptLKe3VvT6QQp_OJLMhxvfiuFm4UGCFd7c93rA=.abc30ecd-2674-4f51-bd50-1128128c92ba@github.com> On Tue, 30 Jan 2024 18:09:27 GMT, Vladimir Kozlov wrote: >> When we fail re-allocate scalarized object during deoptimization we unlock all monitors in affected frames and throw OOM exception [JDK-6898462](https://bugs.openjdk.org/browse/JDK-6898462). >> The unlocking was done in incorrect order starting from outermost monitor which cause this assert when we unlock following nested monitor (the same object) - it sees that it was already unlocked. >> >> The fix is to start unlocking from most nested/inner monitor. >> I also noticed that we have incorrect order of frames for re-locking during deoptimization. We should start from outermost frame. Inside frame re-locking order is correct - from outermost monitor. >> >> Added regression test with deep nested locks. >> Ran tier1-5, Xcomp, stress testing. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Fix spacing Looks good to me too. ------------- Marked as reviewed by thartmann (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17600#pullrequestreview-1852949962 From dholmes at openjdk.org Wed Jan 31 06:13:05 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 31 Jan 2024 06:13:05 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> <7p5HrBndOFNCb9jcKdZa9kCkzhPuQVXm-TsCRTRmBmM=.e250e2cd-0fa6-47d3-a3ec-4bd92792e16c@github.com> Message-ID: On Tue, 30 Jan 2024 22:07:15 GMT, Dean Long wrote: > JavaThread::is_lock_owned() calls Thread::is_lock_owned() first to check if the lock record is on the native stack. @dean-long does "on the native stack" equate to being in a compiled vframe ?? I thought checking the native stack simply found if the BasicObjectLock addr was allocated on the thread's native stack. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1918450503 From thartmann at openjdk.org Wed Jan 31 06:28:02 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 31 Jan 2024 06:28:02 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant [v2] In-Reply-To: References: Message-ID: On Tue, 30 Jan 2024 18:09:27 GMT, Vladimir Kozlov wrote: >> When we fail re-allocate scalarized object during deoptimization we unlock all monitors in affected frames and throw OOM exception [JDK-6898462](https://bugs.openjdk.org/browse/JDK-6898462). >> The unlocking was done in incorrect order starting from outermost monitor which cause this assert when we unlock following nested monitor (the same object) - it sees that it was already unlocked. >> >> The fix is to start unlocking from most nested/inner monitor. >> I also noticed that we have incorrect order of frames for re-locking during deoptimization. We should start from outermost frame. Inside frame re-locking order is correct - from outermost monitor. >> >> Added regression test with deep nested locks. 
>> Ran tier1-5, Xcomp, stress testing. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Fix spacing test/hotspot/jtreg/compiler/escapeAnalysis/TestNestedRelockAtDeopt.java line 46: > 44: } > 45: } catch (OutOfMemoryError oom) { > 46: arr = null; // Free memory This isn't guaranteed to free any memory, right? Isn't there a high risk that we are hitting another OOME below at the `new ArrayList<>()`? Is that what [JDK-8325003](https://bugs.openjdk.org/browse/JDK-8325003) is about? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17600#discussion_r1472357585 From aboldtch at openjdk.org Wed Jan 31 06:58:02 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 31 Jan 2024 06:58:02 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread [v2] In-Reply-To: References: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> Message-ID: On Wed, 31 Jan 2024 01:26:54 GMT, David Holmes wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: >> >> Add regression test > > src/hotspot/share/runtime/synchronizer.cpp line 521: > >> 519: } >> 520: >> 521: void ObjectSynchronizer::enter(Handle obj, BasicLock* lock, JavaThread* locking_thread, JavaThread* current) { > > Do we actually need to pass in the current thread? What is it used for - ResourceMarks? It is definitely an alternative to simply use Thread::current. That is all it is used for throughout this code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1472378080 From aboldtch at openjdk.org Wed Jan 31 07:28:01 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 31 Jan 2024 07:28:01 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) 
is invoked for non-current thread [v2] In-Reply-To: References: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> Message-ID: On Wed, 31 Jan 2024 01:30:15 GMT, David Holmes wrote: >> Maybe this would make more sense to me as Thread* current staying the first parameter, and the second parameter is fast_locking_thread which can be nullptr. > > I agree this doesn't quite makes sense to me. `locking_thread` is really `presumed_owner_thread` - we aren't trying to lock here, we are inflating an already locked object, where the passed in thread is the owner. I very much agree. This stems from the same issue that cause me to type > `inflate` care about a `locking_thread` is a little unpleasant in my opinion Inflate in all locking modes has nothing to do with what is inflating. It is just that in LM_LIGHTWEIGHT it is also given the responsibility to fix the owner and the lock stack if the `current` thread is the owner. Because the idea is that modifying the lock stack and moving the owner field from anonymous is only done from the owning thread. This is true everywhere except for re-lock, which is the only place that must enter on behalf of another thread. Which means that it must also inflate, but it must set the correct owner, so it must inflate as if it is inflated from another thread. `presumed_owner_thread` is not really correct either. `Inflate` just means create an `ObjectMonitor` if it does not exist and return the `ObjectMonitor*` associated with this object. Regardless if it is locked or not. I like the idea of having `inflate_for`, I think the name can be clearer this is an API that is exclusively needed for LM_LIGHTWEIGHT. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1472401232 From sroy at openjdk.org Wed Jan 31 07:45:06 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Wed, 31 Jan 2024 07:45:06 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v11] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Mon, 29 Jan 2024 09:48:40 GMT, Joachim Kern wrote: >> Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: >> >> update comment > > src/hotspot/os/aix/os_aix.cpp line 1166: > >> 1164: Search order: >> 1165: libfilename-> load "libfilename.so" first,then load libfilename.a,on failure. >> 1166: In,OpenJ9,the libary with .so extension is loaded first and then .a extension,on failure. > > Hi Suchi, I'm puzzled. Your comment implies for me, that load library gets a 'base' filename without 'lib' prefix and without extension (e.g. 'name'). Then the j9 code creates the filename 'libname.so' first and on failure 'libname.a' second. What about given libname.so explicitly (e.g. libname.so)? Does j9 really use 'libname.a' as a failure fallback in this case? The load library gets the entire library name, after construction from dll_build_name. This is always a .so file name. When .so file name fails to load, we fallback to .a filename. Do i need to mention the filename as libfilename.so then ? 
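The search order described above (try the constructed `.so` name first, and only on failure retry with the `.a` extension) can be sketched as follows. This is an illustration, not the code from the PR: `toy_archive_fallback_name` and `toy_load_with_fallback` are hypothetical helpers, and the actual loading step is abstracted as a callback instead of a real `dlopen` call.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: derive the ".a" fallback name from a ".so" name.
// Returns an empty string when the input does not end in ".so".
std::string toy_archive_fallback_name(const std::string& so_name) {
  const std::string so_ext = ".so";
  if (so_name.size() <= so_ext.size() ||
      so_name.compare(so_name.size() - so_ext.size(), so_ext.size(), so_ext) != 0) {
    return std::string();
  }
  return so_name.substr(0, so_name.size() - so_ext.size()) + ".a";
}

// Hypothetical driver: try the ".so" name first, then the ".a" name.
// try_load stands in for the platform loader and returns true on success.
// Returns the name that was loaded, or an empty string if both attempts fail.
std::string toy_load_with_fallback(
    const std::string& so_name,
    const std::function<bool(const std::string&)>& try_load) {
  if (try_load(so_name)) {
    return so_name;
  }
  std::string alt = toy_archive_fallback_name(so_name);
  if (!alt.empty() && try_load(alt)) {
    return alt;
  }
  return std::string();
}
```

With a loader that only knows `libfoo.a`, a request for `libfoo.so` ends up loading `libfoo.a`; a name that does not end in `.so` gets no second attempt, which is one way to answer the question of what happens when the extension is given explicitly.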
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16604#discussion_r1472417159 From dlong at openjdk.org Wed Jan 31 08:03:02 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 31 Jan 2024 08:03:02 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: <5TRxQYNay1X0sIr3H9Qhs7WXxojQYElGmhqVkruWYuI=.12eca720-abbd-4d7b-9754-0fe36f6cf46a@github.com> References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> <7p5HrBndOFNCb9jcKdZa9kCkzhPuQVXm-TsCRTRmBmM=.e250e2cd-0fa6-47d3-a3ec-4bd92792e16c@github.com> <0_kBWgR-HSz2EpQjvXucVsPjXKu5Nee4t-CoxVFAqcI=.40762f68-3b11-4164-8fe8-8f6caa36c238@github.com> <5TRxQYNay1X0sIr3H9Qhs7WXxojQYElGmhqVkruWYuI=.12eca720-abbd-4d7b-9754-0fe36f6cf46a@github.com> Message-ID: On Wed, 31 Jan 2024 03:58:55 GMT, Patricio Chilano Mateo wrote: >>> BasicLock::move_to() doesn't change the markword >> >> Good point, I missed that. So where does the markword get updated if the object was stack-locked? > >> > BasicLock::move_to() doesn't change the markword >> >> Good point, I missed that. So where does the markword get updated if the object was stack-locked? >> > You mean in this BasicLock::move_to() call? If it was stack-locked, inflation will change the markword to point to the created ObjectMonitor (https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/synchronizer.cpp#L1471). The _owner field of that inflated monitor will contain the BasicLock* that was previously stored in the markword. @pchilano: OK so if deoptimization never sets the markword to point at the monitor chunks, then it seems pointless for JavaThread::is_lock_owned() to look at monitor chunks. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1918576806 From dlong at openjdk.org Wed Jan 31 08:07:07 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 31 Jan 2024 08:07:07 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> <7p5HrBndOFNCb9jcKdZa9kCkzhPuQVXm-TsCRTRmBmM=.e250e2cd-0fa6-47d3-a3ec-4bd92792e16c@github.com> Message-ID: On Wed, 31 Jan 2024 06:10:29 GMT, David Holmes wrote: > does "on the native stack" equate to being in a compiled vframe ?? I thought checking the native stack simply found if the BasicObjectLock addr was allocated on the thread's native stack. @dholmes-ora Yes, the lock record could be on the native stack in either an interpreter frame or a compiled frame, but I don't know if that's what you mean by being in a compiled vframe. If you mean what is returned by `compiledVFrame::monitors`, that returns all the monitors locked by the compiled frame, both inflated and fast/stack locked. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1918583092 From shade at openjdk.org Wed Jan 31 08:26:10 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 31 Jan 2024 08:26:10 GMT Subject: RFR: 8324833: Signed integer overflows in ABS Message-ID: See the details in the bug. I think current `ABS` implementation is beyond repair, and we should just switch to `uabs`. Additional testing: - [x] Linux x86_64 fastdebug, `all` with `-ftrapv` (now fully passes!) 
- [ ] Linux x86_64 fastdebug, `all` ------------- Commit messages: - Also handle the -stride_con with proper uabs - Another attempt at JVMCI fix - Work - Cleaner version - Merge branch 'master' into JDK-8324833-uabs - Rewrite JVMCI usage to avoid the messy overload - Update Changes: https://git.openjdk.org/jdk/pull/17617/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17617&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324833 Stats: 23 lines in 8 files changed: 2 ins; 5 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/17617.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17617/head:pull/17617 PR: https://git.openjdk.org/jdk/pull/17617 From rrich at openjdk.org Wed Jan 31 08:29:04 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 31 Jan 2024 08:29:04 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: <5TRxQYNay1X0sIr3H9Qhs7WXxojQYElGmhqVkruWYuI=.12eca720-abbd-4d7b-9754-0fe36f6cf46a@github.com> References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> <7p5HrBndOFNCb9jcKdZa9kCkzhPuQVXm-TsCRTRmBmM=.e250e2cd-0fa6-47d3-a3ec-4bd92792e16c@github.com> <0_kBWgR-HSz2EpQjvXucVsPjXKu5Nee4t-CoxVFAqcI=.40762f68-3b11-4164-8fe8-8f6caa36c238@github.com> <5TRxQYNay1X0sIr3H9Qhs7WXxojQYElGmhqVkruWYuI=.12eca720-abbd-4d7b-9754-0fe36f6cf46a@github.com> Message-ID: <8zUdCMcqEjN4faP_uNDYi4zwVIvUnOCXGzTsCfJr5ik=.3383c7aa-068e-4de6-9b39-5f51a193b5f3@github.com> On Wed, 31 Jan 2024 03:58:55 GMT, Patricio Chilano Mateo wrote: >>> BasicLock::move_to() doesn't change the markword >> >> Good point, I missed that. So where does the markword get updated if the object was stack-locked? > >> > BasicLock::move_to() doesn't change the markword >> >> Good point, I missed that. So where does the markword get updated if the object was stack-locked? >> > You mean in this BasicLock::move_to() call? 
If it was stack-locked, inflation will change the markword to point to the created ObjectMonitor (https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/synchronizer.cpp#L1471). The _owner field of that inflated monitor will contain the BasicLock* that was previously stored in the markword. > @pchilano: OK so if deoptimization never sets the markword to point at the monitor chunks, then it seems pointless for JavaThread::is_lock_owned() to look at monitor chunks. @dean-long I've been following the discussion. I think that it's the correct conclusion from @pchilano's observation. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-1918615317 From aboldtch at openjdk.org Wed Jan 31 09:04:09 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 31 Jan 2024 09:04:09 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread [v3] In-Reply-To: References: Message-ID: <-hvTIFa2tVtGb-aHxc_Yyt7vozgICxBRtP6WDN038CU=.66ba0001-5cbe-43b3-b42f-93bb7680498a@github.com> > The `ObjectSynchronizer` has always assumed that the `current` parameters are both the current thread as well as the thread that is doing the locking. The only time that we are entering on behalf of another thread is when doing re-locking in deoptimization. This has worked because the deoptee thread is suspended. However ResourceMarks have been using the wrong thread when logging is enabled. > > This change `ObjectSynchronizer` instruments the relevant methods with both a `JavaThread* locking_thread` as well as `[Java]Thread* current` to be able to use the correct thread for ResourceMarks. > > Having the `inflate` care about a `locking_thread` is a little unpleasant in my opinion. But it is required for LM_LIGHTWEIGHT. > Would probably be cleaner if the inflate for LM_LIGHTWEIGHT was it's own thing, as it does not share the whole INFLATING protocol. But seems like a future RFE to refactor this code. 
> > Can reproduce a crash by modifying `test/jdk/com/sun/jdi/EATests.java` and using `-XX:DiagnoseSyncOnValueBasedClasses=2` with LM_LEGACY or running `test/jdk/com/sun/jdi/EATests.java` with LM_LIGHTWEIGHT/LM_MONITOR and `-Xlog:monitorinflation=trace`. > > Could extend this test to capture this regression in the future (or creating a new test based on the same infrastructure). Will give it an attempt, so we have a regression test for this. But these tests get rather involved as the require a lot of jvmti setup. Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: More restrictive API ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17626/files - new: https://git.openjdk.org/jdk/pull/17626/files/a32104ea..2c266a5b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17626&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17626&range=01-02 Stats: 81 lines in 4 files changed: 13 ins; 26 del; 42 mod Patch: https://git.openjdk.org/jdk/pull/17626.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17626/head:pull/17626 PR: https://git.openjdk.org/jdk/pull/17626 From aboldtch at openjdk.org Wed Jan 31 09:24:01 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 31 Jan 2024 09:24:01 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread [v3] In-Reply-To: <-hvTIFa2tVtGb-aHxc_Yyt7vozgICxBRtP6WDN038CU=.66ba0001-5cbe-43b3-b42f-93bb7680498a@github.com> References: <-hvTIFa2tVtGb-aHxc_Yyt7vozgICxBRtP6WDN038CU=.66ba0001-5cbe-43b3-b42f-93bb7680498a@github.com> Message-ID: On Wed, 31 Jan 2024 09:04:09 GMT, Axel Boldt-Christmas wrote: >> The `ObjectSynchronizer` has always assumed that the `current` parameters are both the current thread as well as the thread that is doing the locking. The only time that we are entering on behalf of another thread is when doing re-locking in deoptimization. 
This has worked because the deoptee thread is suspended. However ResourceMarks have been using the wrong thread when logging is enabled. >> >> This change `ObjectSynchronizer` instruments the relevant methods with both a `JavaThread* locking_thread` as well as `[Java]Thread* current` to be able to use the correct thread for ResourceMarks. >> >> Having the `inflate` care about a `locking_thread` is a little unpleasant in my opinion. But it is required for LM_LIGHTWEIGHT. >> Would probably be cleaner if the inflate for LM_LIGHTWEIGHT was it's own thing, as it does not share the whole INFLATING protocol. But seems like a future RFE to refactor this code. >> >> Can reproduce a crash by modifying `test/jdk/com/sun/jdi/EATests.java` and using `-XX:DiagnoseSyncOnValueBasedClasses=2` with LM_LEGACY or running `test/jdk/com/sun/jdi/EATests.java` with LM_LIGHTWEIGHT/LM_MONITOR and `-Xlog:monitorinflation=trace`. >> >> Could extend this test to capture this regression in the future (or creating a new test based on the same infrastructure). Will give it an attempt, so we have a regression test for this. But these tests get rather involved as the require a lot of jvmti setup. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > More restrictive API

Created a more restrictive external API with no overloads.

API changes

```C++
// Now explicitly takes the locking_thread, does not conflate with current.
static void enter(Handle obj, BasicLock* lock, JavaThread* locking_thread);

// Unchanged behaviour
static ObjectMonitor* inflate(Thread* current, oop obj, const InflateCause cause);

// Used with LM_LIGHTWEIGHT to inflate a monitor with another thread's lock_stack
static ObjectMonitor* inflate_for(JavaThread* thread, oop obj, const InflateCause cause);

// Now explicitly takes the locking_thread, does not conflate with current.
static void handle_sync_on_value_based_class(Handle obj, JavaThread* locking_thread);

// Internal. Shared inflate implementation, LM_LIGHTWEIGHT will inflate with the specific lock_stack
// if provided.
static ObjectMonitor* inflate_impl(LockStack* lock_stack, oop obj, const InflateCause cause);
```

As for the internal API, using the LockStack* makes it clearer that this is an LM_LIGHTWEIGHT parameter.

There is still an API issue here, and an assert-ability issue w.r.t. `ObjectSynchronizer::enter` and `ObjectMonitor::enter`. When entering on behalf of another thread, the call to `ObjectMonitor::enter` must succeed without contention; that is, it is either locked by the other thread and appears recursive, or the object has not escaped and cannot have contention. A solution could be to have an `ObjectMonitor::enter_prologue` which returns false if there is contention, which can be caught and asserted, and to have `ObjectMonitor::enter` only accept `JavaThread::current()`. These are all specialised APIs which only exist for re-lock.

The current implementation is correct, but it is still not nice to have a `JavaThread* current` which does not `== JavaThread::current()`.

------------- PR Comment: https://git.openjdk.org/jdk/pull/17626#issuecomment-1918698516 From rrich at openjdk.org Wed Jan 31 09:24:02 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 31 Jan 2024 09:24:02 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread [v2] In-Reply-To: References: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> Message-ID: On Wed, 31 Jan 2024 01:37:42 GMT, David Holmes wrote: > I'm very frustrated that this code got broken the way it has been. @dholmes-ora I've introduced the code and I am very offended about the way you express your feelings here and elsewhere about it.
People reading this will get the impression that JDK-8227745 is just very inferior in its implementation, which it isn't. A lot of work went into it. I wish you had helped reviewing it better. It is my impression that you hardly ever take the time that is necessary to look at the code and think about it before commenting. I am truly offended. In fact there's nothing wrong with one thread inflating a lock on behalf of another thread. It's just the usage of ResourceMark that is wrong. And that is a pretty minor bug. Nothing to be very frustrated about really. Let's just fix it! I agree with having a specialized version to do the job if the acting thread is not the same as the lock owner. @xmas92's fix isn't that bad either modulo naming of variables but that again isn't anything to get frustrated about. Thanks, Richard. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17626#issuecomment-1918703033 From sgehwolf at openjdk.org Wed Jan 31 10:00:04 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Wed, 31 Jan 2024 10:00:04 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v6] In-Reply-To: References: Message-ID: On Tue, 30 Jan 2024 16:17:49 GMT, Matthias Baesken wrote: >> Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. >> >> Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. >> PhysicalMemory could be enhanced or a new event added. >> >> There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and >> Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > some Linux container related adjustments Seems OK. The testing could be improved.
Only test the non-container case in the existing test and test the container case (including `-1`) in the corresponding container test. test/jdk/jdk/jfr/event/os/TestSwapSpaceEvent.java line 38: > 36: * @requires vm.hasJFR > 37: * @library /test/lib > 38: * @run main/othervm jdk.jfr.event.os.TestSwapSpaceEvent I'd suggest using `-XX:-UseContainerSupport` on this test to exercise the non-container case on Linux. If you want to test the container case as well, add it to `test/hotspot/jtreg/containers/docker/TestJFREvents.java`. ------------- PR Review: https://git.openjdk.org/jdk/pull/17581#pullrequestreview-1853322888 PR Review Comment: https://git.openjdk.org/jdk/pull/17581#discussion_r1472575158 From tschatzl at openjdk.org Wed Jan 31 11:15:02 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 31 Jan 2024 11:15:02 GMT Subject: RFR: 8324933: ConcurrentHashTable::statistics_calculate synchronization is expensive In-Reply-To: References: Message-ID: On Tue, 30 Jan 2024 10:48:18 GMT, Erik Österlund wrote: > In the ConcurrentHashTable::statistics_calculate function, we enter and exit a ScopedCS with the global counter for every single bucket. This has showed up to be pretty intense on some machines. We should make the synchronization a bit less intense here. This patch adds simple batching so we synchronize once per 128 buckets instead of every single one. Changes requested by tschatzl (Reviewer). src/hotspot/share/utilities/concurrentHashTable.inline.hpp line 1238: > 1236: } else { > 1237: // Not last batch; walk over the current batch > 1238: batch_end = batch_start + batch_size; Something like (untested): Suggestion: for (size_t start_batch = 0; start_batch <= _table->_size; start_batch += batch_size) { size_t batch_end = MIN2(start_batch + batch_size, _table->_size); seems to be much, much easier to follow (and corresponds to "usual" code for such loops) than the suggested one.
For extra performance `_table->_size` could be hoisted as well, but the compiler may already do that anyway. ------------- PR Review: https://git.openjdk.org/jdk/pull/17629#pullrequestreview-1853472676 PR Review Comment: https://git.openjdk.org/jdk/pull/17629#discussion_r1472664852 From jkern at openjdk.org Wed Jan 31 11:25:07 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 31 Jan 2024 11:25:07 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v11] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Wed, 31 Jan 2024 07:42:49 GMT, Suchismith Roy wrote: >> src/hotspot/os/aix/os_aix.cpp line 1166: >> >>> 1164: Search order: >>> 1165: libfilename-> load "libfilename.so" first,then load libfilename.a,on failure. >>> 1166: In,OpenJ9,the libary with .so extension is loaded first and then .a extension,on failure. >> >> Hi Suchi, I'm puzzled. Your comment implies for me, that load library gets a 'base' filename without 'lib' prefix and without extension (e.g. 'name'). Then the j9 code creates the filename 'libname.so' first and on failure 'libname.a' second. What about given libname.so explicitly (e.g. libname.so)? Does j9 really use 'libname.a' as a failure fallback in this case? > > The load library gets the entire library name, after construction from dll_build_name. This is always a .so file name. When .so file name fails to load, we fallback to .a filename. > Do i need to mention the filename as libfilename.so then ? Yes, I think this would make it clear.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16604#discussion_r1472683336 From aph at openjdk.org Wed Jan 31 11:40:04 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 31 Jan 2024 11:40:04 GMT Subject: RFR: 8324833: Signed integer overflows in ABS In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 15:59:49 GMT, Aleksey Shipilev wrote: > See the details in the bug. I think current `ABS` implementation is beyond repair, and we should just switch to `uabs`. > > Additional testing: > - [x] Linux x86_64 fastdebug, `all` with `-ftrapv` (now fully passes!) > - [ ] Linux x86_64 fastdebug, `all` Looks good. All of these are improvements, I think. ------------- Marked as reviewed by aph (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17617#pullrequestreview-1853528192 From mbaesken at openjdk.org Wed Jan 31 11:48:05 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 31 Jan 2024 11:48:05 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v7] In-Reply-To: References: Message-ID: > Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. > > Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. > PhysicalMemory could be enhanced or a new event added. > > There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and > Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Adjust test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17581/files - new: https://git.openjdk.org/jdk/pull/17581/files/ca81d40a..94878ea0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17581&range=05-06 Stats: 11 lines in 1 file changed: 10 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17581.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17581/head:pull/17581 PR: https://git.openjdk.org/jdk/pull/17581 From mbaesken at openjdk.org Wed Jan 31 11:48:06 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 31 Jan 2024 11:48:06 GMT Subject: RFR: JDK-8324287: Record total and free swap space in JFR [v6] In-Reply-To: References: Message-ID: On Tue, 30 Jan 2024 16:17:49 GMT, Matthias Baesken wrote: >> Total and free swap space should be recorded in JFR, because it is important to know e.g. in case of memory shortages. >> >> Currently we only have a container related event (ContainerMemoryUsage) that provides some info but no general event. >> PhysicalMemory could be enhanced or a new event added. >> >> There is already some coding (see Java_com_sun_management_internal_OperatingSystemImpl_getTotalSwapSpaceSize0 and >> Java_com_sun_management_internal_OperatingSystemImpl_getFreeSwapSpaceSize0) for the swap space info retrieval. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > some Linux container related adjustments I adjusted TestSwapSpaceEvent.java; unfortunately the UseContainerSupport globals - flag is Linux-only, so I had to double the jtreg test head. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17581#issuecomment-1918943975 From ayang at openjdk.org Wed Jan 31 12:47:08 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 31 Jan 2024 12:47:08 GMT Subject: RFR: 8324771: Obsolete RAMFraction related flags [v3] In-Reply-To: <9OkBLuBpwhNQVY4Acs2mGBWg8gb3WRnuHW9jXJ0BeLY=.e148d2a6-f07d-4da5-b44a-b170b9bd833b@github.com> References: <9OkBLuBpwhNQVY4Acs2mGBWg8gb3WRnuHW9jXJ0BeLY=.e148d2a6-f07d-4da5-b44a-b170b9bd833b@github.com> Message-ID: On Mon, 29 Jan 2024 15:19:06 GMT, Albert Mingkun Yang wrote: >> Simple obsoleting four related deprecated jvm flags. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > Update src/hotspot/share/runtime/arguments.cpp > > Co-authored-by: Thomas Schatzl <59967451+tschatzl at users.noreply.github.com> Thanks for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17592#issuecomment-1919034240 From ayang at openjdk.org Wed Jan 31 12:47:09 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 31 Jan 2024 12:47:09 GMT Subject: Integrated: 8324771: Obsolete RAMFraction related flags In-Reply-To: References: Message-ID: On Fri, 26 Jan 2024 16:18:27 GMT, Albert Mingkun Yang wrote: > Simple obsoleting four related deprecated jvm flags. This pull request has now been integrated. 
Changeset: 725314fb Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/725314fb739e10aa54e224f46d3c71015cf9d158 Stats: 56 lines in 6 files changed: 4 ins; 46 del; 6 mod 8324771: Obsolete RAMFraction related flags Reviewed-by: dholmes, mbaesken, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/17592 From sroy at openjdk.org Wed Jan 31 13:10:25 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Wed, 31 Jan 2024 13:10:25 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v12] In-Reply-To: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: > J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. > After searching for ibm_16_am.so ,the jvm agent throws and error as dll_load fails.It fails to identify the shared library ibm_16_am.a shared archive file on AIX. > Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. 
Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: Clarify comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16604/files - new: https://git.openjdk.org/jdk/pull/16604/files/257f5def..713e514b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=10-11 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16604/head:pull/16604 PR: https://git.openjdk.org/jdk/pull/16604 From sroy at openjdk.org Wed Jan 31 13:17:21 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Wed, 31 Jan 2024 13:17:21 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v13] In-Reply-To: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: > J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. > After searching for ibm_16_am.so ,the jvm agent throws and error as dll_load fails.It fails to identify the shared library ibm_16_am.a shared archive file on AIX. > Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. 
Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: spelling ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16604/files - new: https://git.openjdk.org/jdk/pull/16604/files/713e514b..af761abb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=11-12 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16604/head:pull/16604 PR: https://git.openjdk.org/jdk/pull/16604 From mdoerr at openjdk.org Wed Jan 31 13:23:06 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 31 Jan 2024 13:23:06 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v13] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Wed, 31 Jan 2024 13:17:21 GMT, Suchismith Roy wrote: >> J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. >> After searching for ibm_16_am.so ,the jvm agent throws and error as dll_load fails.It fails to identify the shared library ibm_16_am.a shared archive file on AIX. >> Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > spelling src/hotspot/os/aix/os_aix.cpp line 1176: > 1174: strncpy(file_path,filename, buffer_length + 1); > 1175: char* const pointer_to_dot = strrchr(file_path, '.'); > 1176: assert(pointer_to_dot != nullptr, "Attempting to load a shared object without extension? %s", filename); This should not only be an assertion. I think the check could be used instead of the strcmp below. 
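[Editorial sketch, not part of the original message.] One possible reading of the review comment above is that a missing extension should be handled at runtime rather than only asserted in debug builds. The helper below is a minimal, hypothetical illustration of that idea; the name `so_to_archive_name` and the use of `std::string` are assumptions made for the example and are not taken from `os_aix.cpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not the actual os_aix.cpp code): derive the ".a" name
// to try after loading "<name>.so" has failed.
// Returns false and leaves 'out' untouched when the file name has no
// extension or the extension is not ".so", so the caller can report the
// original load error instead of relying on an assert.
static bool so_to_archive_name(const char* filename, std::string& out) {
  if (filename == nullptr) {
    return false;
  }
  const char* dot = std::strrchr(filename, '.');
  if (dot == nullptr || std::strcmp(dot, ".so") != 0) {
    return false;  // no extension, or not a ".so" name
  }
  // Copy everything before the last '.' and append the archive extension.
  out.assign(filename, static_cast<std::size_t>(dot - filename));
  out += ".a";
  return true;
}
```

With this shape the caller only attempts the second load when the helper returns true, which keeps the behaviour identical in debug and product builds.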
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16604#discussion_r1472813890 From rrich at openjdk.org Wed Jan 31 13:24:02 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 31 Jan 2024 13:24:02 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread [v3] In-Reply-To: <-hvTIFa2tVtGb-aHxc_Yyt7vozgICxBRtP6WDN038CU=.66ba0001-5cbe-43b3-b42f-93bb7680498a@github.com> References: <-hvTIFa2tVtGb-aHxc_Yyt7vozgICxBRtP6WDN038CU=.66ba0001-5cbe-43b3-b42f-93bb7680498a@github.com> Message-ID: <47J13fZERiiQQHsaZPy1PFuo_0fUiMYoILmWU0QVw2w=.f3358eae-4a97-4b1f-927f-2d9fe6308bf7@github.com> On Wed, 31 Jan 2024 09:04:09 GMT, Axel Boldt-Christmas wrote: >> The `ObjectSynchronizer` has always assumed that the `current` parameters are both the current thread as well as the thread that is doing the locking. The only time that we are entering on behalf of another thread is when doing re-locking in deoptimization. This has worked because the deoptee thread is suspended. However ResourceMarks have been using the wrong thread when logging is enabled. >> >> This change `ObjectSynchronizer` instruments the relevant methods with both a `JavaThread* locking_thread` as well as `[Java]Thread* current` to be able to use the correct thread for ResourceMarks. >> >> Having the `inflate` care about a `locking_thread` is a little unpleasant in my opinion. But it is required for LM_LIGHTWEIGHT. >> Would probably be cleaner if the inflate for LM_LIGHTWEIGHT was it's own thing, as it does not share the whole INFLATING protocol. But seems like a future RFE to refactor this code. >> >> Can reproduce a crash by modifying `test/jdk/com/sun/jdi/EATests.java` and using `-XX:DiagnoseSyncOnValueBasedClasses=2` with LM_LEGACY or running `test/jdk/com/sun/jdi/EATests.java` with LM_LIGHTWEIGHT/LM_MONITOR and `-Xlog:monitorinflation=trace`. 
>> >> Could extend this test to capture this regression in the future (or creating a new test based on the same infrastructure). Will give it an attempt, so we have a regression test for this. But these tests get rather involved as the require a lot of jvmti setup. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > More restrictive API `ObjectMonitor::enter(JavaThread* current)` is called near the end of `ObjectSynchronizer::enter()` like this: `monitor->enter(locking_thread)`. Here also the name `current` is not correct because `locking_thread` may be different from the current thread if nested locking was eliminated and the lock got inflated. I see 2 solutions currently: 1. Rename the parameter from `current` to `locking_thread` and assert before contention is handled that `locking_thread` is the current thread. You could introduce a local variable `JavaThread* current` to reduce the necessary changes. 2. Introduce a dedicated new method (`relock_for` or `enter_for`?) to fix up the state of the ObjectMonitor to be called from `Deoptimization::relock_objects`. It would assert that the ObjectMonitor isn't locked by another thread than `locking_thread`, set it as owner, and increment `_recursions` if necessary. At the moment I'd favor 1. With 2. one would likely also copy and adapt `ObjectSynchronizer::enter()`. Maybe that would be cleaner. Thanks again Richard. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17626#issuecomment-1919092708 From dnsimon at openjdk.org Wed Jan 31 14:16:07 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 31 Jan 2024 14:16:07 GMT Subject: RFR: 8322630: Remove ICStubs and related safepoints [v6] In-Reply-To: References: Message-ID: On Tue, 30 Jan 2024 09:08:01 GMT, Erik Österlund wrote: >> ICStubs solve an atomicity problem when setting both the destination and data of an inline cache.
Unfortunately, it also leads to occasional safepoint carpets when multiple threads need to ICRefill the stubs at the same time, and spurious GuaranteedSafepointInterval "Cleanup" safepoints every second. This patch changes inline caches to not change the data part at all during the nmethod life cycle, hence removing the need for ICStubs. >> >> The new scheme is less stateful. Instead of adding and removing callsite metadata back and forth when transitioning inline cache states, it installs all state any shape of call will ever need at resolution time in a struct that I call CompiledICData. This reduces inline cache state changes to simply changing the destination of the call, and it doesn't really matter what state transitions to what other state. >> >> With this patch, we get rid of ICStub and ICBuffer classes and the related ICRefill and almost all Cleanup safepoints in practice. It also makes the inline cache code much simpler. >> >> I have tested the changes from tier1-7, and run through full aurora performance tests. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > ARM32 fixes The PR to adapt Graal for this change: https://github.com/oracle/graal/pull/8287 ------------- PR Comment: https://git.openjdk.org/jdk/pull/17495#issuecomment-1919186059 From dcubed at openjdk.org Wed Jan 31 15:00:04 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 31 Jan 2024 15:00:04 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...)
is invoked for non-current thread [v3] In-Reply-To: <47J13fZERiiQQHsaZPy1PFuo_0fUiMYoILmWU0QVw2w=.f3358eae-4a97-4b1f-927f-2d9fe6308bf7@github.com> References: <-hvTIFa2tVtGb-aHxc_Yyt7vozgICxBRtP6WDN038CU=.66ba0001-5cbe-43b3-b42f-93bb7680498a@github.com> <47J13fZERiiQQHsaZPy1PFuo_0fUiMYoILmWU0QVw2w=.f3358eae-4a97-4b1f-927f-2d9fe6308bf7@github.com> Message-ID: On Wed, 31 Jan 2024 13:21:13 GMT, Richard Reingruber wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: >> >> More restrictive API > > `ObjectMonitor::enter(JavaThread* current)` is called near the end of `ObjectSynchronizer::enter()` like this: `monitor->enter(locking_thread)`. Here also the name `current` is not correct because `locking_thread` may be different from the current thread if nested locking was eliminated and the lock got inflated. > > I see 2 solutions currently: > > 1. Rename the parameter from `current` to `locking_thread` and assert before contention is handled that `locking_thread` is the current thread. You could introduce a local variable `JavaThread* current` to reduce the necessary changes. > 2. Introduce a dedicated new method (`relock_for` or `enter_for`?) to fixup the state of the ObjectMonitor to be called from `Deoptimization::relock_objects`. It would assert that the ObjectMonitor isn't locked by another thread than `locking_thread`, set it as owner, and increment `_recursions` if necessary. > > At the moment I'd favor 1. > With 2. one would like also copy and adapt `ObjectSynchronizer::enter()`. Maybe that would be cleaner. > > Thanks again Richard. @reinrich - > It is my impression that you hardly ever take the time that is necessary to look at the code and think about it before commenting. @dholmes-ora is one of the most careful and thorough reviewers of code on this project. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17626#issuecomment-1919273675 From sroy at openjdk.org Wed Jan 31 15:09:09 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Wed, 31 Jan 2024 15:09:09 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v13] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: <0VhSPW1LosTGWdDukEFCuBFPHfgij50f1xEcDUzERV0=.ca3ac1dc-42a4-498f-9da7-79ed678b85af@github.com> On Wed, 31 Jan 2024 13:20:52 GMT, Martin Doerr wrote: >> Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: >> >> spelling > > src/hotspot/os/aix/os_aix.cpp line 1176: > >> 1174: strncpy(file_path,filename, buffer_length + 1); >> 1175: char* const pointer_to_dot = strrchr(file_path, '.'); >> 1176: assert(pointer_to_dot != nullptr, "Attempting to load a shared object without extension? %s", filename); > > This should not only be an assertion. I think the check could be used instead of the strcmp below. I didn't follow that. You mean i need to keep a check if it is null and print it out ? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16604#discussion_r1472963709 From shade at openjdk.org Wed Jan 31 15:55:25 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 31 Jan 2024 15:55:25 GMT Subject: RFR: 8324833: Signed integer overflows in ABS [v2] In-Reply-To: References: Message-ID: > See the details in the bug. I think current `ABS` implementation is beyond repair, and we should just switch to `uabs`. > > Additional testing: > - [x] Linux x86_64 fastdebug, `all` with `-ftrapv` (now fully passes!) 
> - [x] Linux x86_64 fastdebug, `all` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Unnecessary comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17617/files - new: https://git.openjdk.org/jdk/pull/17617/files/074679a6..00d4d84b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17617&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17617&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17617.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17617/head:pull/17617 PR: https://git.openjdk.org/jdk/pull/17617 From coleenp at openjdk.org Wed Jan 31 16:30:01 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 31 Jan 2024 16:30:01 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread [v3] In-Reply-To: <-hvTIFa2tVtGb-aHxc_Yyt7vozgICxBRtP6WDN038CU=.66ba0001-5cbe-43b3-b42f-93bb7680498a@github.com> References: <-hvTIFa2tVtGb-aHxc_Yyt7vozgICxBRtP6WDN038CU=.66ba0001-5cbe-43b3-b42f-93bb7680498a@github.com> Message-ID: <0EZ5njYcv7NOGb-aw-r8EndVxHmCJh9zsnX7nTP63xU=.d458e1ea-999b-41f2-bdfa-aa85510b4dc1@github.com> On Wed, 31 Jan 2024 09:04:09 GMT, Axel Boldt-Christmas wrote: >> The `ObjectSynchronizer` has always assumed that the `current` parameters are both the current thread as well as the thread that is doing the locking. The only time that we are entering on behalf of another thread is when doing re-locking in deoptimization. This has worked because the deoptee thread is suspended. However ResourceMarks have been using the wrong thread when logging is enabled. >> >> This change `ObjectSynchronizer` instruments the relevant methods with both a `JavaThread* locking_thread` as well as `[Java]Thread* current` to be able to use the correct thread for ResourceMarks. 
>> >> Having the `inflate` care about a `locking_thread` is a little unpleasant in my opinion. But it is required for LM_LIGHTWEIGHT. >> Would probably be cleaner if the inflate for LM_LIGHTWEIGHT was it's own thing, as it does not share the whole INFLATING protocol. But seems like a future RFE to refactor this code. >> >> Can reproduce a crash by modifying `test/jdk/com/sun/jdi/EATests.java` and using `-XX:DiagnoseSyncOnValueBasedClasses=2` with LM_LEGACY or running `test/jdk/com/sun/jdi/EATests.java` with LM_LIGHTWEIGHT/LM_MONITOR and `-Xlog:monitorinflation=trace`. >> >> Could extend this test to capture this regression in the future (or creating a new test based on the same infrastructure). Will give it an attempt, so we have a regression test for this. But these tests get rather involved as the require a lot of jvmti setup. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > More restrictive API I have to say I much prefer the first commit, with the parameters Thread* current and JavaThread* locking_thread reversed. It seems like the most straightforward solution after re-reading the description of the problem. Having lock-stack as a parameter to the general inflate_impl() function gives a different nullptr to deal with. Might as well be locking_thread. ------------- Changes requested by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17626#pullrequestreview-1854253679 From coleenp at openjdk.org Wed Jan 31 16:34:13 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 31 Jan 2024 16:34:13 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) 
is invoked for non-current thread [v2] In-Reply-To: References: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> Message-ID: On Wed, 31 Jan 2024 01:21:11 GMT, David Holmes wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: >> >> Add regression test > > src/hotspot/share/runtime/synchronizer.cpp line 1312: > >> 1310: // Can be called from non JavaThreads (e.g., VMThread) for FastHashCode >> 1311: // calculations as part of JVM/TI tagging. >> 1312: static bool is_lock_owned(JavaThread* locking_thread, oop obj) { > > The parameter does not need renaming here, we are asking if some thread is the owner, it is not trying to lock anything. Also you've invalidated the comment by making this take a JavaThread instead of Thread. Yes, fix this comment. If the locking_thread is null, then the lock isn't owned by anybody. You can reword the comment to point out that the locking_thread will be null if it's called from non-JavaThreads (eg. VMThread) etc. because the comment is a useful reminder. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1473121351 From coleenp at openjdk.org Wed Jan 31 17:23:02 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 31 Jan 2024 17:23:02 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) 
is invoked for non-current thread [v2] In-Reply-To: References: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> Message-ID: <5BDbZpN-mgSa_Qlo-yYcmy7hwGny7NLR48X3-sLdmtg=.14bcd6c9-1483-4ad3-8fe8-73a0f470be3d@github.com> On Wed, 31 Jan 2024 06:55:30 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/share/runtime/synchronizer.cpp line 521: >> >>> 519: } >>> 520: >>> 521: void ObjectSynchronizer::enter(Handle obj, BasicLock* lock, JavaThread* locking_thread, JavaThread* current) { >> >> Do we actually need to pass in the current thread? What is it used for - ResourceMarks? > > It is definitely an alternative to simply use Thread::current. That is all it is used for throughout this code. Yes an alternative could be at the beginning of 'enter' do: Thread* current = Thread::current(); You have a comment below about locking_thread not necessarily being current. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1473181992 From coleenp at openjdk.org Wed Jan 31 17:23:04 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 31 Jan 2024 17:23:04 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread [v2] In-Reply-To: References: <_DZtn1YhgytowhfOkO-8sus8U759FOVXwWyPT6fMpLs=.c5c540c2-0633-4520-b27c-3fc1f85b927a@github.com> Message-ID: On Wed, 31 Jan 2024 16:30:58 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/synchronizer.cpp line 1312: >> >>> 1310: // Can be called from non JavaThreads (e.g., VMThread) for FastHashCode >>> 1311: // calculations as part of JVM/TI tagging. >>> 1312: static bool is_lock_owned(JavaThread* locking_thread, oop obj) { >> >> The parameter does not need renaming here, we are asking if some thread is the owner, it is not trying to lock anything. Also you've invalidated the comment by making this take a JavaThread instead of Thread. > > Yes, fix this comment. 
If the locking_thread is null, then the lock isn't owned by anybody. You can reword the comment to point out that the locking_thread will be null if it's called from non-JavaThreads (eg. VMThread) etc. because the comment is a useful reminder. Add: the parameter should be locking_thread because that's what you're asking about. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1473182987 From coleenp at openjdk.org Wed Jan 31 17:23:05 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 31 Jan 2024 17:23:05 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread [v3] In-Reply-To: <-hvTIFa2tVtGb-aHxc_Yyt7vozgICxBRtP6WDN038CU=.66ba0001-5cbe-43b3-b42f-93bb7680498a@github.com> References: <-hvTIFa2tVtGb-aHxc_Yyt7vozgICxBRtP6WDN038CU=.66ba0001-5cbe-43b3-b42f-93bb7680498a@github.com> Message-ID: On Wed, 31 Jan 2024 09:04:09 GMT, Axel Boldt-Christmas wrote: >> The `ObjectSynchronizer` has always assumed that the `current` parameters are both the current thread as well as the thread that is doing the locking. The only time that we are entering on behalf of another thread is when doing re-locking in deoptimization. This has worked because the deoptee thread is suspended. However ResourceMarks have been using the wrong thread when logging is enabled. >> >> This change instruments the relevant `ObjectSynchronizer` methods with both a `JavaThread* locking_thread` as well as `[Java]Thread* current` to be able to use the correct thread for ResourceMarks. >> >> Having the `inflate` care about a `locking_thread` is a little unpleasant in my opinion. But it is required for LM_LIGHTWEIGHT. >> Would probably be cleaner if the inflate for LM_LIGHTWEIGHT was its own thing, as it does not share the whole INFLATING protocol. But seems like a future RFE to refactor this code. 
>> >> Can reproduce a crash by modifying `test/jdk/com/sun/jdi/EATests.java` and using `-XX:DiagnoseSyncOnValueBasedClasses=2` with LM_LEGACY or running `test/jdk/com/sun/jdi/EATests.java` with LM_LIGHTWEIGHT/LM_MONITOR and `-Xlog:monitorinflation=trace`. >> >> Could extend this test to capture this regression in the future (or create a new test based on the same infrastructure). Will give it an attempt, so we have a regression test for this. But these tests get rather involved as they require a lot of jvmti setup. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > More restrictive API src/hotspot/share/runtime/synchronizer.cpp line 1335: > 1333: // used when deoptimizing and re-locking locks. See Deoptimization::relock_objects > 1334: assert(locking_thread == nullptr || locking_thread == current || > 1335: locking_thread->is_obj_deopt_suspend(), "must be"); Reversing the args here would help. You could also add that JVMTI gets hash codes in a safepoint that might also inflate the monitor, without any locking thread. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1473185471 From kvn at openjdk.org Wed Jan 31 18:38:01 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 31 Jan 2024 18:38:01 GMT Subject: RFR: 8324833: Signed integer overflows in ABS [v2] In-Reply-To: References: Message-ID: On Wed, 31 Jan 2024 15:55:25 GMT, Aleksey Shipilev wrote: >> See the details in the bug. I think current `ABS` implementation is beyond repair, and we should just switch to `uabs`. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug, `all` with `-ftrapv` (now fully passes!) >> - [x] Linux x86_64 fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Unnecessary comment Looks good. I submitted testing. 
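For readers unfamiliar with the overflow named in the subject line, the general problem can be sketched as follows. Negating the most negative signed value is undefined behaviour in C++ (and traps under `-ftrapv`), while the same negation in unsigned arithmetic is well defined. The helper name `unsigned_abs` is hypothetical and chosen for illustration; this is not HotSpot's `ABS` or `uabs` definition, only a sketch of the idea of computing the magnitude in an unsigned type.

```cpp
#include <cstdint>

// Hypothetical helper (an assumption for illustration, not HotSpot code):
// computes |n| in unsigned arithmetic. For n == INT32_MIN a signed "-n"
// would overflow, but "0u - u" simply wraps modulo 2^32 and yields
// 2147483648, which is representable as uint32_t.
inline uint32_t unsigned_abs(int32_t n) {
    uint32_t u = static_cast<uint32_t>(n);
    return (n < 0) ? (0u - u) : u;
}
```

The result type is unsigned on purpose: the magnitude of the minimum value does not fit in the signed type, so callers have to stay in unsigned arithmetic when they use it.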
------------- PR Review: https://git.openjdk.org/jdk/pull/17617#pullrequestreview-1854554005 From coleenp at openjdk.org Wed Jan 31 19:02:08 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 31 Jan 2024 19:02:08 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock Message-ID: This change uses a claim token to allocate multi dimensional arrays rather than holding MultiArray_lock around metaspace allocation. We can't hold a mutex around metaspace allocation because it can create an OOM object and it can also call into JVMTI for a resource exhausted event. Also, we were creating mirrors and more metadata arrays while holding this lock. See the bug for more details and other ideas considered and rejected. Tested with tier1-7. ------------- Commit messages: - Some more cleanups, and make token really recursive. - 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock Changes: https://git.openjdk.org/jdk/pull/17660/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17660&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308745 Stats: 156 lines in 10 files changed: 90 ins; 17 del; 49 mod Patch: https://git.openjdk.org/jdk/pull/17660.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17660/head:pull/17660 PR: https://git.openjdk.org/jdk/pull/17660 From kvn at openjdk.org Wed Jan 31 19:37:02 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 31 Jan 2024 19:37:02 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant [v2] In-Reply-To: References: Message-ID: On Wed, 31 Jan 2024 06:25:52 GMT, Tobias Hartmann wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix spacing > > test/hotspot/jtreg/compiler/escapeAnalysis/TestNestedRelockAtDeopt.java line 46: > >> 44: } >> 45: } catch (OutOfMemoryError oom) { >> 46: arr = null; // Free memory > > This isn't 
guaranteed to free any memory, right? Isn't there a high risk that we are hitting another OOME below at the `new ArrayList<>()`? Is that what [JDK-8325003](https://bugs.openjdk.org/browse/JDK-8325003) is about? The failure [JDK-8325003](https://bugs.openjdk.org/browse/JDK-8325003) happens during `newarray` inside `test1()` if it is inlined. If `test1()` is not inlined everything works. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17600#discussion_r1473370197 From kvn at openjdk.org Wed Jan 31 19:45:09 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 31 Jan 2024 19:45:09 GMT Subject: Integrated: 8324174: assert(m->is_entered(current)) failed: invariant In-Reply-To: References: Message-ID: On Fri, 26 Jan 2024 22:09:27 GMT, Vladimir Kozlov wrote: > When we fail re-allocate scalarized object during deoptimization we unlock all monitors in affected frames and throw OOM exception [JDK-6898462](https://bugs.openjdk.org/browse/JDK-6898462). > The unlocking was done in incorrect order starting from outermost monitor which cause this assert when we unlock following nested monitor (the same object) - it sees that it was already unlocked. > > The fix is to start unlocking from most nested/inner monitor. > I also noticed that we have incorrect order of frames for re-locking during deoptimization. We should start from outermost frame. Inside frame re-locking order is correct - from outermost monitor. > > Added regression test with deep nested locks. > Ran tier1-5, Xcomp, stress testing. This pull request has now been integrated. 
Changeset: 5b9b176c Author: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/5b9b176c6729aeff2a70d304a1ef57da3965fb53 Stats: 150 lines in 2 files changed: 148 ins; 0 del; 2 mod 8324174: assert(m->is_entered(current)) failed: invariant Reviewed-by: epeter, dlong, thartmann ------------- PR: https://git.openjdk.org/jdk/pull/17600 From kvn at openjdk.org Wed Jan 31 19:45:07 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 31 Jan 2024 19:45:07 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant [v2] In-Reply-To: References: Message-ID: On Tue, 30 Jan 2024 18:09:27 GMT, Vladimir Kozlov wrote: >> When we fail re-allocate scalarized object during deoptimization we unlock all monitors in affected frames and throw OOM exception [JDK-6898462](https://bugs.openjdk.org/browse/JDK-6898462). >> The unlocking was done in incorrect order starting from outermost monitor which cause this assert when we unlock following nested monitor (the same object) - it sees that it was already unlocked. >> >> The fix is to start unlocking from most nested/inner monitor. >> I also noticed that we have incorrect order of frames for re-locking during deoptimization. We should start from outermost frame. Inside frame re-locking order is correct - from outermost monitor. >> >> Added regression test with deep nested locks. >> Ran tier1-5, Xcomp, stress testing. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Fix spacing Second round of testing without modifying the test (no default flags run) passed. 
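The ordering issue described in the quoted summary can be illustrated with a deliberately simplified toy model. Everything here (`ToyLock`, `LockRecord`, `toy_enter`, `toy_exit`, `release_all`) is hypothetical and is not HotSpot's monitor implementation; the sketch only assumes that a nested acquisition of the same object is recorded as recursive and expects the lock to still be held when it is released.

```cpp
#include <vector>

// Toy model (an assumption, not HotSpot code): one ownership flag per object
// and one record per nested acquisition by the same thread.
struct ToyLock {
    bool owned = false;
};

struct LockRecord {
    ToyLock* lock;
    bool recursive;  // true if the lock was already held when this record was made
};

// Acquire the lock and remember whether this was a nested acquisition.
inline LockRecord toy_enter(ToyLock& lock) {
    LockRecord rec{&lock, lock.owned};
    lock.owned = true;
    return rec;
}

// Returns false if the record finds its lock already released, which stands
// in for the failed invariant in this toy model.
inline bool toy_exit(const LockRecord& rec) {
    if (!rec.lock->owned) return false;
    if (!rec.recursive) rec.lock->owned = false;  // only the outermost record frees it
    return true;
}

// records[0] is the outermost acquisition, records.back() the innermost.
inline bool release_all(const std::vector<LockRecord>& records, bool innermost_first) {
    bool ok = true;
    if (innermost_first) {
        for (auto it = records.rbegin(); it != records.rend(); ++it) {
            ok = toy_exit(*it) && ok;
        }
    } else {
        for (const LockRecord& rec : records) {
            ok = toy_exit(rec) && ok;
        }
    }
    return ok;
}
```

In this model, releasing outermost-first frees the lock before the nested records are processed, so they observe an already unlocked object; releasing innermost-first keeps every check satisfied.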
------------- PR Comment: https://git.openjdk.org/jdk/pull/17600#issuecomment-1919808196 From kvn at openjdk.org Wed Jan 31 19:45:08 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 31 Jan 2024 19:45:08 GMT Subject: RFR: 8324174: assert(m->is_entered(current)) failed: invariant [v2] In-Reply-To: References: Message-ID: On Wed, 31 Jan 2024 19:34:15 GMT, Vladimir Kozlov wrote: >> test/hotspot/jtreg/compiler/escapeAnalysis/TestNestedRelockAtDeopt.java line 46: >> >>> 44: } >>> 45: } catch (OutOfMemoryError oom) { >>> 46: arr = null; // Free memory >> >> This isn't guaranteed to free any memory, right? Isn't there a high risk that we are hitting another OOME below at the `new ArrayList<>()`? Is that what [JDK-8325003](https://bugs.openjdk.org/browse/JDK-8325003) is about? > > The failure [JDK-8325003](https://bugs.openjdk.org/browse/JDK-8325003) happens during `newarray` inside `test1()` if it is inlined. If `test1()` is not inlined everything works. The flag `-XX:CompileCommand=exclude,TestNestedRelockAtDeopt::main` prevents inlining and allows test to pass. So I want to push it as it is and work on [JDK-8325003](https://bugs.openjdk.org/browse/JDK-8325003) separately. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17600#discussion_r1473375394 From dchuyko at openjdk.org Wed Jan 31 21:12:19 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Wed, 31 Jan 2024 21:12:19 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v25] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. 
> > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long-running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it while taking the new directives stack into account. Re-compilation helps to prevent hot methods from executing in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. 
On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 43 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Deopt osr, cleanups - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - ... and 33 more: https://git.openjdk.org/jdk/compare/5b9b176c...ea0322cd ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=24 Stats: 381 lines in 15 files changed: 348 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From amenkov at openjdk.org Wed Jan 31 21:34:11 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 31 Jan 2024 21:34:11 GMT Subject: RFR: JDK-8318566: Heap walking functions should not use FilteredFieldStream Message-ID: FilteredFieldStream used by heap walking functions to iterate through klass/superclasses/interfaces fields are known to have poor performance (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). Heap walking API implementation is the last user of the klasses. 
The fix reworks iteration through klass/superclasses/interfaces fields and drops FilteredFieldStream-related code. Additionally removed/updated includes of reflectionUtils.hpp. Testing: - tier1..4; - test/hotspot/jtreg/vmTestbase/nsk/jvmti (contains tests for different heap walking functions); - new test from #17580 (now the test runs several times faster). ------------- Commit messages: - fix Changes: https://git.openjdk.org/jdk/pull/17661/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17661&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318566 Stats: 290 lines in 10 files changed: 41 ins; 230 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/17661.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17661/head:pull/17661 PR: https://git.openjdk.org/jdk/pull/17661 From never at openjdk.org Wed Jan 31 21:39:13 2024 From: never at openjdk.org (Tom Rodriguez) Date: Wed, 31 Jan 2024 21:39:13 GMT Subject: RFR: 8324983: race in CompileBroker::possibly_add_compiler_threads Message-ID: The number of active compiler threads is decremented before the compiler thread has actually activated so possibly_add_compiler_thread might start a new thread on the existing JavaThread. This adds a check that it's really exiting before proceeding and some new guarantees that ensure threads aren't started on top running threads. ------------- Commit messages: - 8324983: race in CompileBroker::possibly_add_compiler_threads Changes: https://git.openjdk.org/jdk/pull/17662/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17662&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324983 Stats: 12 lines in 2 files changed: 10 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17662.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17662/head:pull/17662 PR: https://git.openjdk.org/jdk/pull/17662 From dcubed at openjdk.org Wed Jan 31 23:04:05 2024 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Wed, 31 Jan 2024 23:04:05 GMT Subject: RFR: 8324881: ObjectSynchronizer::inflate(Thread* current...) is invoked for non-current thread [v3] In-Reply-To: <-hvTIFa2tVtGb-aHxc_Yyt7vozgICxBRtP6WDN038CU=.66ba0001-5cbe-43b3-b42f-93bb7680498a@github.com> References: <-hvTIFa2tVtGb-aHxc_Yyt7vozgICxBRtP6WDN038CU=.66ba0001-5cbe-43b3-b42f-93bb7680498a@github.com> Message-ID: On Wed, 31 Jan 2024 09:04:09 GMT, Axel Boldt-Christmas wrote: >> The `ObjectSynchronizer` has always assumed that the `current` parameters are both the current thread as well as the thread that is doing the locking. The only time that we are entering on behalf of another thread is when doing re-locking in deoptimization. This has worked because the deoptee thread is suspended. However ResourceMarks have been using the wrong thread when logging is enabled. >> >> This change instruments the relevant `ObjectSynchronizer` methods with both a `JavaThread* locking_thread` as well as `[Java]Thread* current` to be able to use the correct thread for ResourceMarks. >> >> Having the `inflate` care about a `locking_thread` is a little unpleasant in my opinion. But it is required for LM_LIGHTWEIGHT. >> Would probably be cleaner if the inflate for LM_LIGHTWEIGHT was its own thing, as it does not share the whole INFLATING protocol. But seems like a future RFE to refactor this code. >> >> Can reproduce a crash by modifying `test/jdk/com/sun/jdi/EATests.java` and using `-XX:DiagnoseSyncOnValueBasedClasses=2` with LM_LEGACY or running `test/jdk/com/sun/jdi/EATests.java` with LM_LIGHTWEIGHT/LM_MONITOR and `-Xlog:monitorinflation=trace`. >> >> Could extend this test to capture this regression in the future (or create a new test based on the same infrastructure). Will give it an attempt, so we have a regression test for this. But these tests get rather involved as they require a lot of jvmti setup. 
> > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > More restrictive API Thumbs up on the technical changes. I have a few comment nits and don't forget to update copyright years. src/hotspot/share/runtime/synchronizer.cpp line 1313: > 1311: } > 1312: > 1313: ObjectMonitor* ObjectSynchronizer::inflate_impl(LockStack* lock_stack, oop object, const InflateCause cause) { `inflate_impl` is the new name for the logic that was in `ObjectSynchronizer::inflate()` and I've crawled thru those changes. I'm happy with this new logic and I'm happy with passing a `lock_stack` value or nullptr because it allows for some nice refactoring in the new `inflate_impl` (like getting rid of the static `is_lock_owned` function). The logic in `ObjectSynchronizer::inflate()` that depended on the current thread being passed has been refactored: - ResourceMarks are now created without the thread parameter - The lock stack manipulations are now dependent on a non-nullptr LockStack being passed instead of a current_thread. Of course, the callers of `inflate_impl` have to pass a non-nullptr LockStack only when the caller is the current JavaThread and `LockingMode == LM_LIGHTWEIGHT`. src/hotspot/share/runtime/synchronizer.cpp line 1325: > 1323: // If using fast-locking and the ObjectMonitor owner > 1324: // is anonymous and the lock_stack contains the > 1325: // object, then we make the lock_stacks owner the Nit typo: s/lock_stacks owner/lock_stack's owner/ src/hotspot/share/runtime/synchronizer.cpp line 1327: > 1325: // object, then we make the lock_stacks owner the > 1326: // ObjectMonitor owner and remove the lock from the > 1327: // lock stack. nit consistency: s/lock stack/lock_stack/ Other places in the same comment paragraph use an `_`. ------------- Marked as reviewed by dcubed (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17626#pullrequestreview-1855004925 PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1473580684 PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1473556816 PR Review Comment: https://git.openjdk.org/jdk/pull/17626#discussion_r1473558282 From cjplummer at openjdk.org Wed Jan 31 23:39:00 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 31 Jan 2024 23:39:00 GMT Subject: RFR: JDK-8318566: Heap walking functions should not use FilteredFieldStream In-Reply-To: References: Message-ID: On Wed, 31 Jan 2024 21:28:34 GMT, Alex Menkov wrote: > FilteredFieldStream used by heap walking functions to iterate through klass/superclasses/interfaces fields are known to have poor performance (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). > Heap walking API implementation is the last user of the klasses. > The fix reworks iteration through klass/superclasses/interfaces fields and drops FilteredFieldStream-related code. > Additionally removed/updated includes of reflectionUtils.hpp. > > Testing: > - tier1..4; > - test/hotspot/jtreg/vmTestbase/nsk/jvmti (contains tests for different heap walking functions); > - new test from #17580 (now the test runs several times faster). src/hotspot/share/prims/jvmtiTagMap.cpp line 453: > 451: InstanceKlass* super_klass = ik->java_super(); > 452: if (super_klass != nullptr) { > 453: start_index += add_instance_fields(super_klass, start_index); Does hotspot have any rules against potentially very deep recursion that can overflow the stack? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17661#discussion_r1473607788
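Independently of whether HotSpot has such a rule, the recursion over the superclass chain in the quoted snippet could in principle be expressed as a loop. The following is a minimal sketch under stated assumptions: `ToyKlass` and `collect_fields` are hypothetical stand-ins rather than `InstanceKlass` or `add_instance_fields`, and the superclass-first field order is inferred from the quoted lines only.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a class descriptor (not HotSpot's InstanceKlass).
struct ToyKlass {
    const ToyKlass* super;            // nullptr for the root of the hierarchy
    std::vector<std::string> fields;  // fields declared by this class only
};

// Collects fields superclass-first without recursion: first walk up the
// chain and remember each class, then visit the chain in reverse order.
// Stack usage stays constant regardless of hierarchy depth; the chain
// itself lives on the heap inside the vector.
inline std::vector<std::string> collect_fields(const ToyKlass* klass) {
    std::vector<const ToyKlass*> chain;
    for (const ToyKlass* k = klass; k != nullptr; k = k->super) {
        chain.push_back(k);
    }
    std::vector<std::string> result;
    for (auto it = chain.rbegin(); it != chain.rend(); ++it) {
        result.insert(result.end(), (*it)->fields.begin(), (*it)->fields.end());
    }
    return result;
}
```

The trade-off is one temporary vector per call in exchange for a stack depth that no longer grows with the length of the superclass chain.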