From duke at openjdk.java.net Tue Mar 1 02:26:05 2022 From: duke at openjdk.java.net (Alan Hayward) Date: Tue, 1 Mar 2022 02:26:05 GMT Subject: Integrated: 8282392: [zero] Build broken on AArch64 In-Reply-To: References: Message-ID: On Mon, 28 Feb 2022 12:28:39 GMT, Alan Hayward wrote: > 8282392: [zero] Build broken on AArch64 This pull request has now been integrated. Changeset: c1a28aa0 Author: Alan Hayward Committer: Ningsheng Jian URL: https://git.openjdk.java.net/jdk/commit/c1a28aa04ada6c13031eaa85746e6b1d5945d10d Stats: 11 lines in 5 files changed: 6 ins; 0 del; 5 mod 8282392: [zero] Build broken on AArch64 Reviewed-by: aph, shade ------------- PR: https://git.openjdk.java.net/jdk/pull/7633 From dholmes at openjdk.java.net Tue Mar 1 02:57:04 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 1 Mar 2022 02:57:04 GMT Subject: RFR: 8281472: JVM options processing silently truncates large illegal options values [v2] In-Reply-To: References: Message-ID: On Fri, 25 Feb 2022 14:24:00 GMT, Harold Seigel wrote: >> Please review this change to fix JDK-8281472. The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type. For example, it rejects values of int options that are not between max_int and min_int. >> >> The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > add gtest, fix TestParallelGCThreads.java, and revise implementation Hi Harold, New gtest looks good (once you read it extremely carefully to get the types and sizes clear :) ). 
Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/7522 From iklam at openjdk.java.net Tue Mar 1 03:23:04 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 1 Mar 2022 03:23:04 GMT Subject: RFR: 8281472: JVM options processing silently truncates large illegal options values [v2] In-Reply-To: References: Message-ID: On Fri, 25 Feb 2022 14:24:00 GMT, Harold Seigel wrote: >> Please review this change to fix JDK-8281472. The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type. For example, it rejects values of int options that are not between max_int and min_int. >> >> The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > add gtest, fix TestParallelGCThreads.java, and revise implementation LGTM ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7522 From dholmes at openjdk.java.net Tue Mar 1 04:22:07 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 1 Mar 2022 04:22:07 GMT Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2] In-Reply-To: References: Message-ID: On Mon, 28 Feb 2022 16:48:35 GMT, Aleksey Shipilev wrote: >>> I think it is confusing to have `AARCH64_PORT_ONLY` defines, to be honest. In the similar cases for X86, we just additionally protect these blocks with !ZERO. Something like: >> >> That's what we looked at and it was more of a mess, IMO. In the end it's a judgment call which to have, and I've seen this kind of mistake, where a particular port is confused with a particular CPU, enough times that I think this is OK; YMMV. > >> That's what we looked at and it was more of a mess, IMO. 
In the end it's a judgment call which to have, and I've seen this kind of mistake, where a particular port is confused with a particular CPU, enough times that I think this is OK; YMMV. > > From the perspective of Zero maintenance, having the Zero-specific workarounds explicitly doing `!ZERO` is cleaner. This mess is mostly Zero's problem with identifying itself as a CPU. So, in my mind, there is little reason to accommodate that problem with "port" defines. Sorry I missed this but this is stylistically awful IMHO. What is AARCH64_PORT_ONLY supposed to mean? IIUC it really means AARCH64_NATIVE_PORT_ONLY or AARCH64_NOT_ZERO_ONLY. I would much rather have seen @shipilev request to use a combination of CPU and ZERO to get this right - as used everywhere else when dealing with ZERO. And we don't use TARGET_ARCH_x anywhere else in the code other than to define the macros. ------------- PR: https://git.openjdk.java.net/jdk/pull/7633 From dholmes at openjdk.java.net Tue Mar 1 04:41:23 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 1 Mar 2022 04:41:23 GMT Subject: RFR: 8281472: JVM options processing silently truncates large illegal options values [v2] In-Reply-To: References: Message-ID: On Fri, 25 Feb 2022 14:24:00 GMT, Harold Seigel wrote: >> Please review this change to fix JDK-8281472. The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type. For example, it rejects values of int options that are not between max_int and min_int. >> >> The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > add gtest, fix TestParallelGCThreads.java, and revise implementation Code changes look good too. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7522 From dholmes at openjdk.java.net Tue Mar 1 04:42:15 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 1 Mar 2022 04:42:15 GMT Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2] In-Reply-To: References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com> Message-ID: <3O8kqd5D-3i2uxJN8bFVUYug77Qxp7-f7aAeQ5hauqg=.7780620f-f18a-4456-8275-6dc9156f1c84@github.com> On Wed, 23 Feb 2022 05:38:34 GMT, David Holmes wrote: >> Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store. >> >> See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not. >> >> Platforms affected: >> - all x86 >> - Zero >> - Windows Aarch64 >> - PPC >> >> Testing: tiers 1-3 >> Additional builds: tiers 4 and 5 >> - builds covered: x86 and Zero >> >> GHA >> - builds covered: Windows-Aarch64 >> >> The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC. >> >> For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out.
>> >> Thanks, >> David > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Remove underscore from name Looks like it was just noise: Benchmark | MacOSX x64 SPECjvm2008-Serial-ParGC | -0.48% ------------- PR: https://git.openjdk.java.net/jdk/pull/7567 From darcy at openjdk.java.net Tue Mar 1 06:20:07 2022 From: darcy at openjdk.java.net (Joe Darcy) Date: Tue, 1 Mar 2022 06:20:07 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v9] In-Reply-To: References: Message-ID: On Fri, 25 Feb 2022 06:22:42 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. >> - Test creation using new IR testing framework. >> >> Following are the performance number of a JMH micro included with the patch >> >> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) >> >> >> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio >> -- | -- | -- | -- | -- | -- | -- | -- >> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 >> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 >> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 >> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 >> >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > 8279508: Adding descriptive comments. test/jdk/java/lang/Math/RoundTests.java line 32: > 30: public static void main(String... 
args) { > 31: int failures = 0; > 32: for (int i = 0; i < 100000; i++) { Is there an idiom to trigger the auto-vectorization, perhaps using command line arguments, that doesn't bloat the running time of this test? ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From shade at openjdk.java.net Tue Mar 1 08:18:04 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 1 Mar 2022 08:18:04 GMT Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2] In-Reply-To: <3O8kqd5D-3i2uxJN8bFVUYug77Qxp7-f7aAeQ5hauqg=.7780620f-f18a-4456-8275-6dc9156f1c84@github.com> References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com> <3O8kqd5D-3i2uxJN8bFVUYug77Qxp7-f7aAeQ5hauqg=.7780620f-f18a-4456-8275-6dc9156f1c84@github.com> Message-ID: On Tue, 1 Mar 2022 04:39:26 GMT, David Holmes wrote: > Looks like it was just noise: > > Benchmark | MacOSX x64 SPECjvm2008-Serial-ParGC | -0.48% Or intermittent something. Regardless, we can deal with that later. Good to go. ------------- PR: https://git.openjdk.java.net/jdk/pull/7567 From aph at openjdk.java.net Tue Mar 1 09:45:14 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 1 Mar 2022 09:45:14 GMT Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2] In-Reply-To: References: Message-ID: On Mon, 28 Feb 2022 16:48:35 GMT, Aleksey Shipilev wrote: >>> I think it is confusing to have `AARCH64_PORT_ONLY` defines, to be honest. In the similar cases for X86, we just additionally protect these blocks with !ZERO. Something like: >> >> That's what we looked at and it was more of a mess, IMO. In the end it's a judgment call which to have, and I've seen this kind of mistake, where a particular port is confused with a particular CPU, enough times that I think this is OK; YMMV. > >> That's what we looked at and it was more of a mess, IMO. 
In the end it's a judgment call which to have, and I've seen this kind of mistake, where a particular port is confused with a particular CPU, enough times that I think this is OK; YMMV. > > From the perspective of Zero maintenance, having the Zero-specific workarounds explicitly doing `!ZERO` is cleaner. This mess is mostly Zero's problem with identifying itself as a CPU. So, in my mind, there is little reason to accommodate that problem with "port" defines. > Sorry I missed this but this is stylistically awful IMHO. What is AARCH64_PORT_ONLY supposed to mean? IIUC it really means AARCH64_NATIVE_PORT_ONLY or AARCH64_NOT_ZERO_ONLY. I would much rather have seen @shipilev request to use a combination of CPU and ZERO to get this right - as used everywhere else when dealing with ZERO. And we don't use TARGET_ARCH_x anywhere else in the code other than to define the macros. I've seen this problem repeating over the years, caused by a failure to distinguish between an ifdef for the properties of a particular CPU and one for the implementation of a particular port. Courtesy of Zero, there is not a 1:1 mapping between these, and (judging by the problems we've seen) it is a frequent cause of confusion. Usually we catch the bugs before push, but not always. In my opinion this is not merely a matter of people making mistakes, but of a style that is confusing, and will continue to be.
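
To make the two guard styles under discussion concrete, here is a minimal sketch. `AARCH64`, `ZERO`, `AARCH64_ONLY` and `AARCH64_PORT_ONLY` are names taken from this thread; the helper functions and the hand-written `#define`s are assumptions for illustration only, since in a real build these symbols would be supplied by the build system.

```cpp
#include <cassert>

// Assumption for illustration: pretend this is a Zero build on an AArch64 host.
// In a real build these symbols come from the build system, not the source file.
#define AARCH64 1
#define ZERO 1

// CPU-property guard: expands whenever the target CPU is AArch64,
// whether the native port or Zero is being built.
#ifdef AARCH64
#define AARCH64_ONLY(code) code
#else
#define AARCH64_ONLY(code)
#endif

// Port guard in the spirit of the proposed AARCH64_PORT_ONLY:
// expands only for the native port, i.e. AArch64 CPU and not Zero.
#if defined(AARCH64) && !defined(ZERO)
#define AARCH64_PORT_ONLY(code) code
#else
#define AARCH64_PORT_ONLY(code)
#endif

// Hypothetical helper: true if code depending only on CPU properties is compiled in.
inline bool cpu_code_enabled() {
  AARCH64_ONLY(return true;)
  return false;
}

// Hypothetical helper: true if code depending on the native port is compiled in.
inline bool port_code_enabled() {
  AARCH64_PORT_ONLY(return true;)
  return false;
}

// The alternative style: spell out the !ZERO condition at the use site.
inline bool port_code_enabled_explicit() {
#if defined(AARCH64) && !defined(ZERO)
  return true;
#else
  return false;
#endif
}

// Both styles select the same code; they differ only in where the
// "not Zero" condition is written down.
inline void check_guard_styles_agree() {
  assert(port_code_enabled() == port_code_enabled_explicit());
}
```

With both symbols defined as above, the CPU guard is active while the port guard is not, which is the case where the two concepts diverge.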
------------- PR: https://git.openjdk.java.net/jdk/pull/7633 From dholmes at openjdk.java.net Tue Mar 1 12:07:08 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 1 Mar 2022 12:07:08 GMT Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2] In-Reply-To: References: Message-ID: On Mon, 28 Feb 2022 16:21:37 GMT, Alan Hayward wrote: >> 8282392: [zero] Build broken on AArch64 > > Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: > > Remove NOT_AARCH64_PORT_ONLY I prefer adding the ifndef ZERO to guard these cases than this singular attempt to do something completely different. ------------- PR: https://git.openjdk.java.net/jdk/pull/7633 From dholmes at openjdk.java.net Tue Mar 1 12:11:04 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 1 Mar 2022 12:11:04 GMT Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2] In-Reply-To: References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com> <3O8kqd5D-3i2uxJN8bFVUYug77Qxp7-f7aAeQ5hauqg=.7780620f-f18a-4456-8275-6dc9156f1c84@github.com> Message-ID: On Tue, 1 Mar 2022 08:14:51 GMT, Aleksey Shipilev wrote: >> Looks like it was just noise: >> >> Benchmark | MacOSX x64 >> SPECjvm2008-Serial-ParGC | -0.48% > >> Looks like it was just noise: >> >> Benchmark | MacOSX x64 SPECjvm2008-Serial-ParGC | -0.48% > > Or intermittent something. Regardless, we can deal with that later. Good to go. 
Thanks @shipilev ------------- PR: https://git.openjdk.java.net/jdk/pull/7567 From dholmes at openjdk.java.net Tue Mar 1 12:11:04 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 1 Mar 2022 12:11:04 GMT Subject: Integrated: 8227369: pd_disjoint_words_atomic() needs to be atomic In-Reply-To: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com> References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com> Message-ID: On Tue, 22 Feb 2022 05:45:12 GMT, David Holmes wrote: > Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store. > > See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not. > > Platforms affected: > - all x86 > - Zero > - Windows Aarch64 > - PPC > > Testing: tiers 1-3 > Additional builds: tiers 4 and 5 > - builds covered: x86 and Zero > > GHA > - builds covered: Windows-Aarch64 > > The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC. > > For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out. > > Thanks, > David This pull request has now been integrated.
Changeset: 44d599aa Author: David Holmes URL: https://git.openjdk.java.net/jdk/commit/44d599aad3994816997a61d9e36265dcefa52965 Stats: 88 lines in 5 files changed: 24 ins; 58 del; 6 mod 8227369: pd_disjoint_words_atomic() needs to be atomic Reviewed-by: eosterlund, mikael, shade, kbarrett, mdoerr ------------- PR: https://git.openjdk.java.net/jdk/pull/7567 From hseigel at openjdk.java.net Tue Mar 1 13:26:10 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 1 Mar 2022 13:26:10 GMT Subject: RFR: 8281472: JVM options processing silently truncates large illegal options values [v2] In-Reply-To: References: Message-ID: <-gM0C_5CjHB5qzw17hp-8MI3d_3qAecV-ZO1p_CcwdQ=.7629317b-2f0e-4f2f-91f7-daa4760adf65@github.com> On Fri, 25 Feb 2022 14:24:00 GMT, Harold Seigel wrote: >> Please review this change to fix JDK-8281472. The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type. For example, it rejects values of int options that are not between max_int and min_int. >> >> The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > add gtest, fix TestParallelGCThreads.java, and revise implementation Thanks Ioi and David for your reviews and assistance with this change! ------------- PR: https://git.openjdk.java.net/jdk/pull/7522 From hseigel at openjdk.java.net Tue Mar 1 13:26:11 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 1 Mar 2022 13:26:11 GMT Subject: Integrated: 8281472: JVM options processing silently truncates large illegal options values In-Reply-To: References: Message-ID: On Thu, 17 Feb 2022 19:09:26 GMT, Harold Seigel wrote: > Please review this change to fix JDK-8281472. 
The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type. For example, it rejects values of int options that are not between max_int and min_int. > > The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64. > > Thanks, Harold This pull request has now been integrated. Changeset: a95edee6 Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/a95edee634c6be52043b55d1a8f3df85a58f97c7 Stats: 138 lines in 5 files changed: 126 ins; 3 del; 9 mod 8281472: JVM options processing silently truncates large illegal options values Reviewed-by: dholmes, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/7522 From thartmann at openjdk.java.net Tue Mar 1 14:08:00 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 1 Mar 2022 14:08:00 GMT Subject: RFR: 8279573: compiler/codecache/CodeCacheFullCountTest.java fails with "RuntimeException: the value of full_count is wrong." [v2] In-Reply-To: <3Usf-CPfXE7q3-1QhVpOomY5LBNVtc9sr2iYbH1BWnQ=.6fdf0589-a203-41f7-8a31-119b8ad60edd@github.com> References: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com> <3Usf-CPfXE7q3-1QhVpOomY5LBNVtc9sr2iYbH1BWnQ=.6fdf0589-a203-41f7-8a31-119b8ad60edd@github.com> Message-ID: On Mon, 28 Feb 2022 16:26:35 GMT, Coleen Phillimore wrote: >> This change adds a conditional to make -XX:-UseCodeCacheFlushing not flush the code cache so that the test passes on loom. It also makes full_count atomic so that the test in codeCache for printing is correct. This change also fixes the test because the full_count field and the message printing are not synchronized, so you can get 2 or more depending on the number of compiler threads. >> Tested with tier1-3 on linux and windows x64. 
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > I misunderstood the UseCodeCacheFlushing flag and make it act like MethodFlushing, which is a whole different flag. Using MethodFlushing instead in the test makes it pass on loom and mainline. Right, I missed that as well. But then I wonder if it wouldn't make sense to include both flags? It seems that all other compiler tests that use `-XX:-MethodFlushing` also set `-XX:-UseCodeCacheFlushing`. ------------- PR: https://git.openjdk.java.net/jdk/pull/7629 From ihse at openjdk.java.net Tue Mar 1 14:20:04 2022 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Tue, 1 Mar 2022 14:20:04 GMT Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2] In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 09:42:02 GMT, Andrew Haley wrote: >>> That's what we looked at and it was more of a mess, IMO. In the end it's a judgment call which to have, and I've seen this kind of mistake, where a particular port is confused with a particular CPU, enough times that I think this is OK; YMMV. >> >> From the perspective of Zero maintenance, having the Zero-specific workarounds explicitly doing `!ZERO` is cleaner. This mess is mostly Zero's problem with identifying itself as a CPU. So, in my mind, there is little reason to accommodate that problem with "port" defines. > >> Sorry I missed this but this is stylistically awful IMHO. What is AARCH64_PORT_ONLY supposed to mean? IIUC it really means AARCH64_NATIVE_PORT_ONLY or AARCH64_NOT_ZERO_ONLY. I would much rather have seen @shipilev request to use a combination of CPU and ZERO to get this right - as used everywhere else when dealing with ZERO. And we don't use TARGET_ARCH_x anywhere else in the code other than to define the macros.
> > I've seen this problem repeating over the years, caused by a failure to distinguish between an ifdef for the properties of a particular CPU and one for the implementation of a particular port. Courtesy of Zero, there is not a 1:1 mapping between these, and (judging by the problems we've seen) it is a frequent cause of confusion. Usually we catch the bugs before push, but not always. > > In my opinion this is not merely a matter of people making mistakes, but of a style that is confusing, and will continue to be. @theRealAph Hear, hear! Zero is causing a conceptual mess the way it is currently "injected" into hotspot. This mess also spills over into the build system. The core problem is that Zero claims to be a "CPU", which it clearly is not. And then a lot of workarounds need to be installed everywhere to compensate for this confusion. A better solution would be to go back to CPU meaning just CPU, and introduce a second dimension along which to determine zero/non-zero. We might have to bikeshed for a while to find a good name for this second dimension (I can't come up with any ideal names straight away, anyway. INTERPRETED_ONLY vs COMPILED? ZERO vs NON_ZERO? Or ZERO vs ONE? :-)) Doing this would simplify the concepts and logic behind both the Hotspot code and the build system, and eliminate a class of errors that keep popping up all the time, due to this confusing design decision. ------------- PR: https://git.openjdk.java.net/jdk/pull/7633 From eosterlund at openjdk.java.net Tue Mar 1 14:27:59 2022 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Tue, 1 Mar 2022 14:27:59 GMT Subject: RFR: 8279573: compiler/codecache/CodeCacheFullCountTest.java fails with "RuntimeException: the value of full_count is wrong."
[v2] In-Reply-To: <3Usf-CPfXE7q3-1QhVpOomY5LBNVtc9sr2iYbH1BWnQ=.6fdf0589-a203-41f7-8a31-119b8ad60edd@github.com> References: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com> <3Usf-CPfXE7q3-1QhVpOomY5LBNVtc9sr2iYbH1BWnQ=.6fdf0589-a203-41f7-8a31-119b8ad60edd@github.com> Message-ID: <0_XVgNFs1aJxbgRs_Jn3O0ClPhYv5xipNBlaCOq84hM=.65dd0833-1847-4dfc-a4a1-176fa7d71c57@github.com> On Mon, 28 Feb 2022 16:26:35 GMT, Coleen Phillimore wrote: >> This change adds a conditional to make -XX:-UseCodeCacheFlushing not flush the code cache so that the test passes on loom. It also makes full_count atomic so that the test in codeCache for printing is correct. This change also fixes the test because the full_count field and the message printing are not synchronized, so you can get 2 or more depending on the number of compiler threads. >> Tested with tier1-3 on linux and windows x64. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > I misunderstood the UseCodeCacheFlushing flag and make it act like MethodFlushing, which is a whole different flag. Using MethodFlushing instead in the test makes it pass on loom and mainline. Looks good! ------------- Marked as reviewed by eosterlund (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7629 From coleenp at openjdk.java.net Tue Mar 1 15:24:04 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 1 Mar 2022 15:24:04 GMT Subject: RFR: 8279573: compiler/codecache/CodeCacheFullCountTest.java fails with "RuntimeException: the value of full_count is wrong." 
[v2] In-Reply-To: <3Usf-CPfXE7q3-1QhVpOomY5LBNVtc9sr2iYbH1BWnQ=.6fdf0589-a203-41f7-8a31-119b8ad60edd@github.com> References: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com> <3Usf-CPfXE7q3-1QhVpOomY5LBNVtc9sr2iYbH1BWnQ=.6fdf0589-a203-41f7-8a31-119b8ad60edd@github.com> Message-ID: On Mon, 28 Feb 2022 16:26:35 GMT, Coleen Phillimore wrote: >> This change adds a conditional to make -XX:-UseCodeCacheFlushing not flush the code cache so that the test passes on loom. It also makes full_count atomic so that the test in codeCache for printing is correct. This change also fixes the test because the full_count field and the message printing are not synchronized, so you can get 2 or more depending on the number of compiler threads. >> Tested with tier1-3 on linux and windows x64. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > I misunderstood the UseCodeCacheFlushing flag and make it act like MethodFlushing, which is a whole different flag. Using MethodFlushing instead in the test makes it pass on loom and mainline. Thanks Erik and Tobias. Yes, I could add the other flag and do a retest. ------------- PR: https://git.openjdk.java.net/jdk/pull/7629 From coleenp at openjdk.java.net Tue Mar 1 15:33:50 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 1 Mar 2022 15:33:50 GMT Subject: RFR: 8279573: compiler/codecache/CodeCacheFullCountTest.java fails with "RuntimeException: the value of full_count is wrong." [v3] In-Reply-To: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com> References: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com> Message-ID: > This change adds a conditional to make -XX:-UseCodeCacheFlushing not flush the code cache so that the test passes on loom. It also makes full_count atomic so that the test in codeCache for printing is correct. 
This change also fixes the test because the full_count field and the message printing are not synchronized, so you can get 2 or more depending on the number of compiler threads. > Tested with tier1-3 on linux and windows x64. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add back -XX:-UseCodeCacheFlushing also. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7629/files - new: https://git.openjdk.java.net/jdk/pull/7629/files/03950bf0..d7b88ffe Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7629&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7629&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7629.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7629/head:pull/7629 PR: https://git.openjdk.java.net/jdk/pull/7629 From coleenp at openjdk.java.net Tue Mar 1 15:33:51 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 1 Mar 2022 15:33:51 GMT Subject: Integrated: 8279573: compiler/codecache/CodeCacheFullCountTest.java fails with "RuntimeException: the value of full_count is wrong." In-Reply-To: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com> References: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com> Message-ID: On Sat, 26 Feb 2022 13:14:57 GMT, Coleen Phillimore wrote: > This change adds a conditional to make -XX:-UseCodeCacheFlushing not flush the code cache so that the test passes on loom. It also makes full_count atomic so that the test in codeCache for printing is correct. This change also fixes the test because the full_count field and the message printing are not synchronized, so you can get 2 or more depending on the number of compiler threads. > Tested with tier1-3 on linux and windows x64. This pull request has now been integrated. 
Changeset: 76398c84 Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/76398c84007862bdf07cea6be792eca50eec9edd Stats: 8 lines in 3 files changed: 1 ins; 0 del; 7 mod 8279573: compiler/codecache/CodeCacheFullCountTest.java fails with "RuntimeException: the value of full_count is wrong." Reviewed-by: thartmann, eosterlund ------------- PR: https://git.openjdk.java.net/jdk/pull/7629 From bulasevich at openjdk.java.net Tue Mar 1 16:47:04 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Tue, 1 Mar 2022 16:47:04 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v2] In-Reply-To: <6yR77yO0CGw6ciJPa97cS0O3PCsWznBy9x0x6ILWLZc=.43ad49ab-4ad0-49d9-9098-da4fef38dabf@github.com> References: <6yR77yO0CGw6ciJPa97cS0O3PCsWznBy9x0x6ILWLZc=.43ad49ab-4ad0-49d9-9098-da4fef38dabf@github.com> Message-ID: <5Gxh0VsPuENs_0XY0WfcWxDBmFxx3769_sm9HXwgCqI=.a08c1e02-866a-4127-925e-8044ce6444ee@github.com> On Mon, 28 Feb 2022 17:00:08 GMT, Evgeny Astigeevich wrote: >> Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: >> >> - fix name: is_non_nmethod, adding target_needs_far_branch func >> - change codecache segments order: nonprofiled-nonmethod-profiled >> increase far jump threshold: sizeof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 393: > >> 391: assert(CodeCache::find_blob(entry.target()) != NULL, >> 392: "destination of far call not found in code cache"); >> 393: assert(CodeCache::is_non_nmethod(entry.target()), "must be a call to the code stub"); > > This restricts far calls to be calls of non-nmethod code. Yes. In fact the function is used for non-method code calls only. I put an assert here to check this fact for future code updates.
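
As a rough illustration of why the segment order matters for far jumps, the sketch below computes worst-case branch distances for a cache laid out as non-profiled, non-nmethod, profiled. It is not the patch's actual computation: the struct, the distance helpers and the conservative `>=` comparison are assumptions, and only the two `*_needs_far_jump` names are borrowed from the patch discussion. The one hard fact used is that an AArch64 `B`/`BL` instruction reaches roughly +/-128 MiB.

```cpp
#include <algorithm>
#include <cstdint>

// Reach of an AArch64 B/BL instruction: signed 26-bit word offset, about +/-128 MiB.
constexpr std::uint64_t kMiB = 1024ull * 1024ull;
constexpr std::uint64_t kBranchRange = 128 * kMiB;

// Hypothetical description of a code cache laid out contiguously as
// [non-profiled | non-nmethod | profiled]; sizes are in bytes.
struct CodeCacheLayout {
  std::uint64_t non_profiled_size;
  std::uint64_t non_nmethod_size;
  std::uint64_t profiled_size;
};

// Worst case between two arbitrary points: one end of the cache to the other.
inline std::uint64_t max_distance_within_cache(const CodeCacheLayout& l) {
  return l.non_profiled_size + l.non_nmethod_size + l.profiled_size;
}

// Worst case from any point in the cache to a target inside the non-nmethod
// segment: from the far end of either neighbour across the middle segment.
inline std::uint64_t max_distance_to_non_nmethod(const CodeCacheLayout& l) {
  const std::uint64_t from_low  = l.non_profiled_size + l.non_nmethod_size;
  const std::uint64_t from_high = l.non_nmethod_size + l.profiled_size;
  return std::max(from_low, from_high);
}

// Conservative: treat a distance equal to the range as out of reach.
inline bool codecache_branch_needs_far_jump(const CodeCacheLayout& l) {
  return max_distance_within_cache(l) >= kBranchRange;
}

inline bool codestub_branch_needs_far_jump(const CodeCacheLayout& l) {
  return max_distance_to_non_nmethod(l) >= kBranchRange;
}
```

For example, with 100 MiB of non-profiled code, 8 MiB of non-nmethod code and 100 MiB of profiled code, an arbitrary intra-cache branch may span 208 MiB and needs a far jump, while any branch into the middle segment spans at most 108 MiB and does not.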
> src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp line 533: > >> 531: address stub = NULL; >> 532: >> 533: if (a.codecache_branch_needs_far_jump() > > I prefer it to be `a.target_needs_far_jump(dest)`. `codecache_branch` looks like code cache branches need far jumps. It is strange because the code cache is just a storage. It is the code generator that has to use far jumps. With this patch I do not change trampoline calls. I change far_jump and far_call procedures only. Instead of far_branches() function we have two functions: - codecache_branch_needs_far_jump to find if we need a far jump for intra-codecache branches - codestub_branch_needs_far_jump to find if we need a far branch for codecache-to-nonmethodEntrypoint branch So in this place I leave codecache_branch_needs_far_jump as exact equivalent of former far_branches() call. > max_distance_to_non_nmethod_heap? > As this is public API, it sounds strange without the start point. Start point is any point in the CodeCache. Will the comment below help? // maximum distance from any point in the CodeCache to any entry point in the non_nmethod CodeCache segment This is really too many words for a self-explanatory function name. > If someone changes positions of the heap, would it work as expected? Sure ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From jbhateja at openjdk.java.net Tue Mar 1 17:07:58 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Tue, 1 Mar 2022 17:07:58 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v10] In-Reply-To: References: Message-ID: > Summary of changes: > - Intrinsify Math.round(float) and Math.round(double) APIs. > - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. > - Test creation using new IR testing framework.
> > Following are the performance number of a JMH micro included with the patch > > Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) > > > Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio > -- | -- | -- | -- | -- | -- | -- | -- > FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 > FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 > FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8279508: Review comments resolved.` ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7094/files - new: https://git.openjdk.java.net/jdk/pull/7094/files/54d4ea36..3b90ae53 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=08-09 Stats: 12 lines in 2 files changed: 1 ins; 0 del; 11 mod Patch: https://git.openjdk.java.net/jdk/pull/7094.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094 PR: https://git.openjdk.java.net/jdk/pull/7094 From kvn at openjdk.java.net Tue Mar 1 18:35:11 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 1 Mar 2022 18:35:11 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v6] In-Reply-To: References: Message-ID: On Wed, 23 Feb 2022 14:19:20 GMT, Claes Redestad wrote: >> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. 
The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. >> >> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 >> >> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. >> >> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. >> >> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). > > Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 29 commits: > > - Resolve merge conflict > - Fix TestCountPositives to correctly allow 0 return when expected != len (for now) > - aarch64: fix issue with short inputs divisible by wordSize > - Switch aarch64 intrinsic to a variant of countPositives returning len or zero as a first step. > - Revert micro changes, split out to #7516 > - Merge branch 'master' of https://github.com/cl4es/jdk into count_positives > - Merge branch 'master' into count_positives > - Restore partial vector checks in AVX2 and SSE intrinsic variants > - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral > - Simplify changes to encodeUTF8 > - ... and 19 more: https://git.openjdk.java.net/jdk/compare/5035bf5e...685795ce Looks good. @theRealAph , @a74nh or someone familiar with aarch64 code, please review aarch64 changes. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7231 From lucy at openjdk.java.net Tue Mar 1 18:41:11 2022 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 1 Mar 2022 18:41:11 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v6] In-Reply-To: References: Message-ID: On Wed, 23 Feb 2022 14:19:20 GMT, Claes Redestad wrote: >> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. 
>> >> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 >> >> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. >> >> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. >> >> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). > > Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 29 commits: > > - Resolve merge conflict > - Fix TestCountPositives to correctly allow 0 return when expected != len (for now) > - aarch64: fix issue with short inputs divisible by wordSize > - Switch aarch64 intrinsic to a variant of countPositives returning len or zero as a first step. 
> - Revert micro changes, split out to #7516 > - Merge branch 'master' of https://github.com/cl4es/jdk into count_positives > - Merge branch 'master' into count_positives > - Restore partial vector checks in AVX2 and SSE intrinsic variants > - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral > - Simplify changes to encodeUTF8 > - ... and 19 more: https://git.openjdk.java.net/jdk/compare/5035bf5e...685795ce @cl4es How do you plan to proceed with the PPC (PR#7430) and s390 (PR#7604) intrinsic implementations? We (Martin and myself) would not object if you would just integrate the changes into this PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From redestad at openjdk.java.net Tue Mar 1 19:15:04 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Tue, 1 Mar 2022 19:15:04 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v6] In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 18:32:00 GMT, Vladimir Kozlov wrote: > @theRealAph , @a74nh or someone familiar with aarch64 code, please review aarch64 changes. Note that the aarch64 changes I've put in for now implements `countPositives` to return `0` if there's a negative value anywhere, otherwise `len`. This way we can remove the intrinsic scaffolding for `hasNegatives` once I integrate s390 and ppc changes, but since no precise calculation is implemented for aarch64 then there's no speed-up on micros such as `encodeLatin1LongEnd` for now. I would like to defer such optimization work to a follow-up. @RealLucy yes, I'd like to integrate your changes into this patch and drop the `HasNegativesNode` et.c. You mentioned in e-mail some issue with the code generation on s390 - has that been resolved or identified?
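For readers following the `countPositives` discussion, here is a minimal scalar sketch of the two behaviours described in this thread: a precise variant that counts the leading non-negative bytes of a range, and the interim variant described for aarch64 that returns either `len` or `0`. The function names, signatures, and the use of `std::int8_t` are illustrative assumptions only; this is not the JDK's Java method and not any of the intrinsics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model (not JDK code): returns how many leading bytes
// of ba[off, off + len) are non-negative, stopping at the first negative one.
std::size_t count_positives(const std::int8_t* ba, std::size_t off, std::size_t len) {
  assert(ba != nullptr || len == 0);
  for (std::size_t i = 0; i < len; i++) {
    if (ba[off + i] < 0) {
      return i;  // index of the first negative byte == count of leading positives
    }
  }
  return len;  // no negative byte in the range
}

// Hypothetical model of the interim behaviour described for aarch64:
// len if the range has no negative byte, otherwise 0.
std::size_t count_positives_coarse(const std::int8_t* ba, std::size_t off, std::size_t len) {
  return count_positives(ba, off, len) == len ? len : 0;
}

// hasNegatives can be derived from either variant: the range contains a
// negative byte exactly when the returned count differs from len.
bool has_negatives(const std::int8_t* ba, std::size_t off, std::size_t len) {
  return count_positives_coarse(ba, off, len) != len;
}
```

Both variants answer the `hasNegatives` question, which is presumably why the coarse one can serve as a first step; only the precise one lets a caller skip over an ASCII prefix.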
------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From coleenp at openjdk.java.net Tue Mar 1 20:58:26 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 1 Mar 2022 20:58:26 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint Message-ID: In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. Tested with tier1-4. ------------- Commit messages: - 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint Changes: https://git.openjdk.java.net/jdk/pull/7650/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7650&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8276711 Stats: 29 lines in 4 files changed: 11 ins; 11 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/7650.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7650/head:pull/7650 PR: https://git.openjdk.java.net/jdk/pull/7650 From dholmes at openjdk.java.net Tue Mar 1 21:29:56 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 1 Mar 2022 21:29:56 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 20:50:23 GMT, Coleen Phillimore wrote: > In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. > This change is @rickard 's loom change with the tests fixed so they pass for it. 
One tests for the new warning message and code cache usage, and the others remove the Xint tests. > Tested with tier1-4. src/hotspot/share/runtime/arguments.cpp line 4122: > 4120: FLAG_SET_DEFAULT(SegmentedCodeCache, false); > 4121: } > 4122: } Isn't this better placed in Arguments::set_mode_flags where the other compiler related flags get turned off? ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From lucy at openjdk.java.net Tue Mar 1 22:05:04 2022 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 1 Mar 2022 22:05:04 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v6] In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 19:12:17 GMT, Claes Redestad wrote: >> @theRealAph , @a74nh or someone familiar with aarch64 code, please review aarch64 changes. > >> @theRealAph , @a74nh or someone familiar with aarch64 code, please review aarch64 changes. > > Note that the aarch64 changes I've put in for now implements `countPositives` to return `0` if there's a negative value anywhere, otherwise `len`. This way we can remove the intrinsic scaffolding for `hasNegatives` once I integrate s390 and ppc changes, but since no precise calculation is implemented for aarch64 then there's no speed-up on micros such as `encodeLatin1LongEnd` for now. I would like to defer such optimization work to a follow-up. > > @RealLucy yes, I'd like to integrate your changes into this patch and drop the `HasNegativesNode` et.c. You mentioned in e-mail some issue with the code generation on s390 - has that been resolved or identified? @cl4es This issue in code generation (checking the intrinsic return value for < 0 where it only can be >= 0) is not specific to s390. It is as well visible on x86_64 and on ppc. It happens with the SPECjvm2008 sub-benchmark compiler.sunflow. What's specific to s390 is the fact that this obviously erroneous check causes the benchmark to fail. x86_64 and ppc seem to be immune. 
Where the immunity comes from is not yet understood. ------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From xliu at openjdk.java.net Tue Mar 1 22:10:09 2022 From: xliu at openjdk.java.net (Xin Liu) Date: Tue, 1 Mar 2022 22:10:09 GMT Subject: RFR: 8281472: JVM options processing silently truncates large illegal options values [v2] In-Reply-To: <-gM0C_5CjHB5qzw17hp-8MI3d_3qAecV-ZO1p_CcwdQ=.7629317b-2f0e-4f2f-91f7-daa4760adf65@github.com> References: <-gM0C_5CjHB5qzw17hp-8MI3d_3qAecV-ZO1p_CcwdQ=.7629317b-2f0e-4f2f-91f7-daa4760adf65@github.com> Message-ID: On Tue, 1 Mar 2022 13:20:31 GMT, Harold Seigel wrote: >> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: >> >> add gtest, fix TestParallelGCThreads.java, and revise implementation > > Thanks Ioi and David for your reviews and assistance with this change! hi, @hseigel, I run into build error on linux/i586. https://github.com/navyxliu/jdk/runs/5382424074?check_suite_focus=true GlobalDefinitions.hpp says that `intx` is 32bit wide on a 32-bit system. (intx)jint_max +1 should overflow on 32bit systems. could you take a look? --lx ------------- PR: https://git.openjdk.java.net/jdk/pull/7522 From kvn at openjdk.java.net Tue Mar 1 22:23:05 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 1 Mar 2022 22:23:05 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 21:27:18 GMT, David Holmes wrote: >> In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. >> This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. >> Tested with tier1-4. 
> > src/hotspot/share/runtime/arguments.cpp line 4122: > >> 4120: FLAG_SET_DEFAULT(SegmentedCodeCache, false); >> 4121: } >> 4122: } > > Isn't this better placed in Arguments::set_mode_flags where the other compiler related flags get turned off? `SegmentedCodeCache` should be set in `compiler/compilerDefinitions.cpp`: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/compiler/compilerDefinitions.cpp#L294 ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From coleen.phillimore at oracle.com Tue Mar 1 22:31:58 2022 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 1 Mar 2022 17:31:58 -0500 Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint In-Reply-To: References: Message-ID: <5eaf895d-d653-b684-1145-abde9dd2b497@oracle.com> On 3/1/22 4:29 PM, David Holmes wrote: > On Tue, 1 Mar 2022 20:50:23 GMT, Coleen Phillimore wrote: > >> In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. >> This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. >> Tested with tier1-4. > src/hotspot/share/runtime/arguments.cpp line 4122: > >> 4120: FLAG_SET_DEFAULT(SegmentedCodeCache, false); >> 4121: } >> 4122: } > Isn't this better placed in Arguments::set_mode_flags where the other compiler related flags get turned off? Well, the default for SegmentedCodeCache is off. It's set here because it's after that option is found on the command line. Maybe the line should be if (FLAG_IS_CMDLINE(SegmentedCodeCache)) { } to make it more clear.
Coleen > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/7650 From coleenp at openjdk.java.net Tue Mar 1 22:59:59 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 1 Mar 2022 22:59:59 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 22:20:07 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/runtime/arguments.cpp line 4122: >> >>> 4120: FLAG_SET_DEFAULT(SegmentedCodeCache, false); >>> 4121: } >>> 4122: } >> >> Isn't this better placed in Arguments::set_mode_flags where the other compiler related flags get turned off? > > `SegmentedCodeCache` should be set in `compiler/compilerDefinitions.cpp`: > https://github.com/openjdk/jdk/blob/master/src/hotspot/share/compiler/compilerDefinitions.cpp#L294 I replied on email but I don't see it and this is better here. I should really change this to be: if (FLAG_IS_CMDLINE(SegmentedCodeCache)) { } Since this code is checking for someone setting it on the command line with -Xint. The code in compilerDefinitions.cpp sets the flag ergonomically if the code cache is big and if tiered compilation is on. I believe this is after the code in arguments.cpp. So this code won't reset it to true, if I'm following it right. ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From kvn at openjdk.java.net Tue Mar 1 23:15:54 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 1 Mar 2022 23:15:54 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 22:57:14 GMT, Coleen Phillimore wrote: >> `SegmentedCodeCache` should be set in `compiler/compilerDefinitions.cpp`: >> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/compiler/compilerDefinitions.cpp#L294 > > I replied on email but I don't see it and this is better here. 
I should really change this to be: > > if (FLAG_IS_CMDLINE(SegmentedCodeCache)) { > } > Since this code is checking for someone setting it on the command line with -Xint. The code in compilerDefinitions.cpp sets the flag ergonomically if the code cache is big and if tiered compilation is on. I believe this is after the code in arguments.cpp. So this code won't reset it to true, if I'm following it right. > edit: yes, I'll check the flag in compilerDefinitions.cpp. I agree that the warning should be issued if flag is set on command line together with `-Xint`. In this case you need add this check to `CompilerConfig::check_args_consistency()` where different compiler flags are set to false when we run with Interpreter: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/compiler/compilerDefinitions.cpp#L493 ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From dholmes at openjdk.java.net Tue Mar 1 23:23:12 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 1 Mar 2022 23:23:12 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 23:12:29 GMT, Vladimir Kozlov wrote: >> I replied on email but I don't see it and this is better here. I should really change this to be: >> >> if (FLAG_IS_CMDLINE(SegmentedCodeCache)) { >> } >> Since this code is checking for someone setting it on the command line with -Xint. The code in compilerDefinitions.cpp sets the flag ergonomically if the code cache is big and if tiered compilation is on. I believe this is after the code in arguments.cpp. So this code won't reset it to true, if I'm following it right. >> edit: yes, I'll check the flag in compilerDefinitions.cpp. > > I agree that the warning should be issued if flag is set on command line together with `-Xint`. 
> In this case you need add this check to `CompilerConfig::check_args_consistency()` where different compiler flags are set to false when we run with Interpreter: > https://github.com/openjdk/jdk/blob/master/src/hotspot/share/compiler/compilerDefinitions.cpp#L493 Thanks Coleen, seems the use of `set_mode_flags` is somewhat flawed as you can change any of the flags that it tries to force on/off simply by having them later on the command-line. You could do as @vnkozlov suggests and move this check into `compilerDefinitions.cpp` as `CompilerConfig::ergo_initialize()` is called just a few lines above this. ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From coleenp at openjdk.java.net Tue Mar 1 23:38:06 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 1 Mar 2022 23:38:06 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 23:20:24 GMT, David Holmes wrote: >> I agree that the warning should be issued if flag is set on command line together with `-Xint`. >> In this case you need add this check to `CompilerConfig::check_args_consistency()` where different compiler flags are set to false when we run with Interpreter: >> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/compiler/compilerDefinitions.cpp#L493 > > Thanks Coleen, seems the use of `set_mode_flags` is somewhat flawed as you can change any of the flags that it tries to force on/off simply by having them later on the command-line. > > You could do as @vnkozlov suggests and move this check into `compilerDefinitions.cpp` as `CompilerConfig::ergo_initialize()` is called just a few lines above this. Thanks Vladimir, that's a better place for it. There are other checks for inconsistent flags with -Xint in check_args_consistency(). One of the tests still fails with that change so I'll have to figure out why. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From dholmes at openjdk.java.net Tue Mar 1 23:39:10 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 1 Mar 2022 23:39:10 GMT Subject: RFR: 8281472: JVM options processing silently truncates large illegal options values [v2] In-Reply-To: References: <-gM0C_5CjHB5qzw17hp-8MI3d_3qAecV-ZO1p_CcwdQ=.7629317b-2f0e-4f2f-91f7-daa4760adf65@github.com> Message-ID: On Tue, 1 Mar 2022 22:06:45 GMT, Xin Liu wrote: >> Thanks Ioi and David for your reviews and assistance with this change! > > hi, @hseigel, > > I run into build error on linux/i586. > https://github.com/navyxliu/jdk/runs/5382424074?check_suite_focus=true > > GlobalDefinitions.hpp says that `intx` is 32bit wide on a 32-bit system. (intx)jint_max +1 should overflow on 32bit systems. could you take a look? > > --lx @navyxliu I will file a new bug for that. ------------- PR: https://git.openjdk.java.net/jdk/pull/7522 From dholmes at openjdk.java.net Wed Mar 2 00:52:01 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 2 Mar 2022 00:52:01 GMT Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2] In-Reply-To: References: Message-ID: On Mon, 28 Feb 2022 16:21:37 GMT, Alan Hayward wrote: >> 8282392: [zero] Build broken on AArch64 > > Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: > > Remove NOT_AARCH64_PORT_ONLY Zero is a CPU-agnostic interpreter build, but our builds are inherently CPU-based, so Zero has to represent that "don't care" CPU because it has to replace some CPU specific code with Zero's C code. But then you have to build other CPU-specific parts of the JDK even when using Zero. Hence the approach of using CPU ifdefs combined with a check for Zero. Yes it is awkward and confusing and mistakes can creep in. But I don't think you will come up with anything better for this situation. And the AARCH64_PORT_ONLY is certainly not better IMO. 
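As a small illustration of the "CPU ifdef combined with a check for Zero" approach discussed here, the sketch below guards a block on both conditions. It assumes `AARCH64` and `ZERO` are preprocessor symbols supplied by the build for an AArch64 target and for a Zero build respectively; in this standalone form neither is defined unless passed with `-D`. The function names are hypothetical and exist only for the example.

```cpp
#include <cassert>

// Reports whether this translation unit was built as the native AArch64
// port: the CPU symbol must be defined and the Zero symbol must not be.
constexpr bool this_build_uses_aarch64_port_code() {
#if defined(AARCH64) && !defined(ZERO)
  return true;   // native AArch64 port: CPU-specific code is available
#else
  return false;  // Zero on any CPU, or a different CPU port
#endif
}

// The same decision written as an ordinary function, so the combinations
// of CPU and port can be checked without rebuilding with other -D flags.
constexpr bool use_aarch64_port_code(bool cpu_is_aarch64, bool is_zero) {
  return cpu_is_aarch64 && !is_zero;
}

static_assert(use_aarch64_port_code(true, false), "native AArch64 port");
static_assert(!use_aarch64_port_code(true, true), "Zero running on AArch64 hardware");
static_assert(!use_aarch64_port_code(false, false), "some other CPU port");
```

The middle case is the one the thread is about: the CPU is AArch64, yet the port-specific code must stay out of the build because Zero replaces it.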
------------- PR: https://git.openjdk.java.net/jdk/pull/7633 From coleenp at openjdk.java.net Wed Mar 2 02:03:46 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 2 Mar 2022 02:03:46 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v2] In-Reply-To: References: Message-ID: > In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. > This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Move check for -Xint. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7650/files - new: https://git.openjdk.java.net/jdk/pull/7650/files/c369bc4d..462c49c2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7650&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7650&range=00-01 Stats: 16 lines in 2 files changed: 7 ins; 8 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7650.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7650/head:pull/7650 PR: https://git.openjdk.java.net/jdk/pull/7650 From coleenp at openjdk.java.net Wed Mar 2 02:10:01 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 2 Mar 2022 02:10:01 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v2] In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 23:34:38 GMT, Coleen Phillimore wrote: >> Thanks Coleen, seems the use of `set_mode_flags` is somewhat flawed as you can change any of the flags that it tries to force on/off simply by having them later on the command-line. 
>> >> You could do as @vnkozlov suggests and move this check into `compilerDefinitions.cpp` as `CompilerConfig::ergo_initialize()` is called just a few lines above this. > > Thanks Vladimir, that's a better place for it. There are other checks for inconsistent flags with -Xint in check_args_consistency(). One of the tests still fails with that change so I'll have to figure out why. I moved the check but TieredStopAtLevel=0 still allocates code heaps for nmethods, so the check only checks for the -Xint in arguments. I don't know if this makes sense or not in the code heap allocation code, but the codecache/cli tests test for sizes of code heaps pass with this change. ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From dholmes at openjdk.java.net Wed Mar 2 02:27:06 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 2 Mar 2022 02:27:06 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v2] In-Reply-To: References: Message-ID: <9i_DW5kmOKWLi5DPzI5PwX4snTVPpw-Qd9VKpX_kswU=.5c85aaea-7b24-44c3-959b-673ebad12d6d@github.com> On Wed, 2 Mar 2022 02:03:46 GMT, Coleen Phillimore wrote: >> In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. >> This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Move check for -Xint. src/hotspot/share/compiler/compilerDefinitions.cpp line 528: > 526: // TieredStopAtLevel==0 allocates nmethod space in the code heap with > 527: // SegmentedCodeCache so only disallow the option for -Xint. 
> 528: if (Arguments::is_interpreter_only() && FLAG_IS_CMDLINE(SegmentedCodeCache)) { You need to check `SegmentedCodeCache==true` else you will generate the warning if someone is explicitly turning it off in combination with `-Xint`. ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From jbhateja at openjdk.java.net Wed Mar 2 02:44:41 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 2 Mar 2022 02:44:41 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v11] In-Reply-To: References: Message-ID: > Summary of changes: > - Intrinsify Math.round(float) and Math.round(double) APIs. > - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. > - Test creation using new IR testing framework. > > Following are the performance number of a JMH micro included with the patch > > Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) > > > Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio > -- | -- | -- | -- | -- | -- | -- | -- > FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 > FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 > FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8279508: Removing +LogCompilation flag. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7094/files - new: https://git.openjdk.java.net/jdk/pull/7094/files/3b90ae53..57b1b13a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=09-10 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7094.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094 PR: https://git.openjdk.java.net/jdk/pull/7094 From kvn at openjdk.java.net Wed Mar 2 02:50:05 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 2 Mar 2022 02:50:05 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v2] In-Reply-To: <9i_DW5kmOKWLi5DPzI5PwX4snTVPpw-Qd9VKpX_kswU=.5c85aaea-7b24-44c3-959b-673ebad12d6d@github.com> References: <9i_DW5kmOKWLi5DPzI5PwX4snTVPpw-Qd9VKpX_kswU=.5c85aaea-7b24-44c3-959b-673ebad12d6d@github.com> Message-ID: On Wed, 2 Mar 2022 02:24:12 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Move check for -Xint. > > src/hotspot/share/compiler/compilerDefinitions.cpp line 528: > >> 526: // TieredStopAtLevel==0 allocates nmethod space in the code heap with >> 527: // SegmentedCodeCache so only disallow the option for -Xint. >> 528: if (Arguments::is_interpreter_only() && FLAG_IS_CMDLINE(SegmentedCodeCache)) { > > You need to check `SegmentedCodeCache==true` else you will generate the warning if someone is explicitly turning it off in combination with `-Xint`. 
Yes, code should follow pattern in previous lines: if (SegmentedCodeCache) { if (!FLAG_IS_DEFAULT(SegmentedCodeCache)) { warning("SegmentedCodeCache has no meaningful effect with -Xint"); } FLAG_SET_CMDLINE(SegmentedCodeCache, false); } ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From kvn at openjdk.java.net Wed Mar 2 02:50:04 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 2 Mar 2022 02:50:04 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v2] In-Reply-To: References: Message-ID: <11s6TR_F_hrI8FgHuuHdH5TtKY7FSG-709rG9WAC_z4=.d85ea284-e973-4cb3-8e6d-badfdb6ff41b@github.com> On Wed, 2 Mar 2022 02:03:46 GMT, Coleen Phillimore wrote: >> In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. >> This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Move check for -Xint. src/hotspot/share/compiler/compilerDefinitions.cpp line 526: > 524: } > 525: > 526: // TieredStopAtLevel==0 allocates nmethod space in the code heap with I would prefer to not have special case `TieredStopAtLevel==0`. Unless you want to test loom when JIT compilers are disabled. Currently we don't run any tests with such setting. 
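The review comments above (warn only when the flag is actually on and was set explicitly, and force it off under `-Xint`) can be modelled outside HotSpot roughly as follows. This is a sketch under stated assumptions: `BoolFlag` and `check_xint_consistency` are hypothetical stand-ins, the `is_default` field only approximates what `FLAG_IS_DEFAULT` reports, and updating it is a rough model of `FLAG_SET_CMDLINE`; none of this is HotSpot's real flag machinery.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a boolean VM flag: its current value plus
// whether it still carries its default origin.
struct BoolFlag {
  bool value;
  bool is_default;
};

// Sketch of the suggested pattern. In interpreter-only mode the flag is
// forced off; a warning is recorded only if the flag was on and had been
// set explicitly. Returns the warnings that would have been printed.
std::vector<std::string> check_xint_consistency(bool interpreter_only,
                                                BoolFlag& segmented_code_cache) {
  std::vector<std::string> warnings;
  if (interpreter_only) {
    if (segmented_code_cache.value) {
      if (!segmented_code_cache.is_default) {
        warnings.push_back("SegmentedCodeCache has no meaningful effect with -Xint");
      }
      // Assumption: rough model of FLAG_SET_CMDLINE(SegmentedCodeCache, false).
      segmented_code_cache.value = false;
      segmented_code_cache.is_default = false;
    }
  }
  return warnings;
}
```

Checking the value first is what avoids the spurious warning for an explicit `-XX:-SegmentedCodeCache` combined with `-Xint`, which a test on the flag's origin alone would trigger.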
------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From kvn at openjdk.java.net Wed Mar 2 02:50:05 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 2 Mar 2022 02:50:05 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v2] In-Reply-To: References: <9i_DW5kmOKWLi5DPzI5PwX4snTVPpw-Qd9VKpX_kswU=.5c85aaea-7b24-44c3-959b-673ebad12d6d@github.com> Message-ID: On Wed, 2 Mar 2022 02:46:25 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/compiler/compilerDefinitions.cpp line 528: >> >>> 526: // TieredStopAtLevel==0 allocates nmethod space in the code heap with >>> 527: // SegmentedCodeCache so only disallow the option for -Xint. >>> 528: if (Arguments::is_interpreter_only() && FLAG_IS_CMDLINE(SegmentedCodeCache)) { >> >> You need to check `SegmentedCodeCache==true` else you will generate the warning if someone is explicitly turning it off in combination with `-Xint`. > > Yes, code should follow pattern in previous lines: > > if (SegmentedCodeCache) { > if (!FLAG_IS_DEFAULT(SegmentedCodeCache)) { > warning("SegmentedCodeCache has no meaningful effect with -Xint"); > } > FLAG_SET_CMDLINE(SegmentedCodeCache, false); > } And you should include it in previous scope which checks `if (Arguments::is_interpreter_only())` ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From aturbanov at openjdk.java.net Wed Mar 2 06:28:38 2022 From: aturbanov at openjdk.java.net (Andrey Turbanov) Date: Wed, 2 Mar 2022 06:28:38 GMT Subject: RFR: 8282523: Fix 'hierachy' typo Message-ID: Fix multiple 'hierachy' -> 'hierarchy' typos ------------- Commit messages: - [PATCH] Fix multiple 'hierachy' -> 'hierarchy' typos Changes: https://git.openjdk.java.net/jdk/pull/7474/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7474&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282523 Stats: 20 lines in 5 files changed: 0 ins; 0 del; 20 mod Patch: https://git.openjdk.java.net/jdk/pull/7474.diff Fetch: 
git fetch https://git.openjdk.java.net/jdk pull/7474/head:pull/7474 PR: https://git.openjdk.java.net/jdk/pull/7474 From jiefu at openjdk.java.net Wed Mar 2 06:49:05 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 2 Mar 2022 06:49:05 GMT Subject: RFR: 8282523: Fix 'hierachy' typo In-Reply-To: References: Message-ID: On Tue, 15 Feb 2022 09:32:52 GMT, Andrey Turbanov wrote: > Fix multiple 'hierachy' -> 'hierarchy' typos LGTM ------------- Marked as reviewed by jiefu (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7474 From duke at openjdk.java.net Wed Mar 2 06:57:08 2022 From: duke at openjdk.java.net (duke) Date: Wed, 2 Mar 2022 06:57:08 GMT Subject: Withdrawn: 8279143: Undefined behaviours in globalDefinitions.hpp In-Reply-To: References: Message-ID: On Thu, 23 Dec 2021 15:58:53 GMT, Quan Anh Mai wrote: > Hi, > > This patch replaces undefined behaviours in globalDefinitions.hpp by proper well-defined ones. > > Thank you very much. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/6930 From aph at openjdk.java.net Wed Mar 2 10:26:05 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 2 Mar 2022 10:26:05 GMT Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 00:48:23 GMT, David Holmes wrote: > Zero is a CPU-agnostic interpreter build, but our builds are inherently CPU-based, so Zero has to represent that "don't care" CPU because it has to replace some CPU specific code with Zero's C code. But then you have to build other CPU-specific parts of the JDK even when using Zero. Hence the approach of using CPU ifdefs combined with a check for Zero. Yes it is awkward and confusing and mistakes can creep in. But I don't think you will come up with anything better for this situation. And the AARCH64_PORT_ONLY is certainly not better IMO. Can you explain why it's not better? 
I don't want to waste your time by prolonging discussions unnecessarily, but it seems to me that this nomenclature conveys exactly what we mean: not so much running on a particular CPU, but a particular port. There's a matrix of possible combinations of instruction set architecture and port, with the ports ranging from highly-tuned and customized to completely portable, with space in between. ------------- PR: https://git.openjdk.java.net/jdk/pull/7633 From kbarrett at openjdk.java.net Wed Mar 2 12:34:59 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 2 Mar 2022 12:34:59 GMT Subject: RFR: 8282523: Fix 'hierachy' typo In-Reply-To: References: Message-ID: <46N_cMZ1S7gBpxZTu9RyQ_ntJ04HYEo5yXe59R5MPug=.b4986201-8d24-49bb-abe3-b36ee24592a5@github.com> On Tue, 15 Feb 2022 09:32:52 GMT, Andrey Turbanov wrote: > Fix multiple 'hierachy' -> 'hierarchy' typos Looks good. Some of these are ancient. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7474 From dholmes at openjdk.java.net Wed Mar 2 12:55:57 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 2 Mar 2022 12:55:57 GMT Subject: RFR: 8282523: Fix 'hierachy' typo In-Reply-To: References: Message-ID: On Tue, 15 Feb 2022 09:32:52 GMT, Andrey Turbanov wrote: > Fix multiple 'hierachy' -> 'hierarchy' typos Looks good and trivial. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7474 From aturbanov at openjdk.java.net Wed Mar 2 13:09:05 2022 From: aturbanov at openjdk.java.net (Andrey Turbanov) Date: Wed, 2 Mar 2022 13:09:05 GMT Subject: Integrated: 8282523: Fix 'hierachy' typo In-Reply-To: References: Message-ID: On Tue, 15 Feb 2022 09:32:52 GMT, Andrey Turbanov wrote: > Fix multiple 'hierachy' -> 'hierarchy' typos This pull request has now been integrated. 
Changeset: d80f6971 Author: Andrey Turbanov URL: https://git.openjdk.java.net/jdk/commit/d80f69718233c484e3c1536ffb793116c1adc058 Stats: 19 lines in 5 files changed: 0 ins; 0 del; 19 mod 8282523: Fix 'hierachy' typo Reviewed-by: jiefu, kbarrett, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7474 From coleenp at openjdk.java.net Wed Mar 2 13:22:01 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 2 Mar 2022 13:22:01 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v2] In-Reply-To: References: <9i_DW5kmOKWLi5DPzI5PwX4snTVPpw-Qd9VKpX_kswU=.5c85aaea-7b24-44c3-959b-673ebad12d6d@github.com> Message-ID: On Wed, 2 Mar 2022 02:48:00 GMT, Vladimir Kozlov wrote: >> Yes, code should follow pattern in previous lines: >> >> if (SegmentedCodeCache) { >> if (!FLAG_IS_DEFAULT(SegmentedCodeCache)) { >> warning("SegmentedCodeCache has no meaningful effect with -Xint"); >> } >> FLAG_SET_CMDLINE(SegmentedCodeCache, false); >> } > > And you should include it in previous scope which checks `if (Arguments::is_interpreter_only())` The previous scope checks CompilerConfig::is_interpreter_only() which includes TieredStopAtLevel == 0, so it's not the same there. ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From coleenp at openjdk.java.net Wed Mar 2 13:26:00 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 2 Mar 2022 13:26:00 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v2] In-Reply-To: <11s6TR_F_hrI8FgHuuHdH5TtKY7FSG-709rG9WAC_z4=.d85ea284-e973-4cb3-8e6d-badfdb6ff41b@github.com> References: <11s6TR_F_hrI8FgHuuHdH5TtKY7FSG-709rG9WAC_z4=.d85ea284-e973-4cb3-8e6d-badfdb6ff41b@github.com> Message-ID: On Wed, 2 Mar 2022 02:44:18 GMT, Vladimir Kozlov wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Move check for -Xint. 
> > src/hotspot/share/compiler/compilerDefinitions.cpp line 526: > >> 524: } >> 525: >> 526: // TieredStopAtLevel==0 allocates nmethod space in the code heap with > > I would prefer to not have special case `TieredStopAtLevel==0`. Unless you want to test loom when JIT compilers are disabled. Currently we don't run any tests with such setting. I tested with this setting and it works fine with -XX:+SegmentedCodeCache because the code for segmented code cache only checks Arguments::is_interpreter_only(). This doesn't seem right to me, but the testing also expects this to work. I could fix the tests to not do this. Now I don't know what anyone wants! ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From coleenp at openjdk.java.net Wed Mar 2 13:55:42 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 2 Mar 2022 13:55:42 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: References: Message-ID: > In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. > This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Make TieredStopAtLevel=0 also ignore SegmentedCodeHeap and not create a nmethod heaps, and fix tests accordingly. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7650/files - new: https://git.openjdk.java.net/jdk/pull/7650/files/462c49c2..c362ffa5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7650&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7650&range=01-02 Stats: 17 lines in 4 files changed: 4 ins; 10 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/7650.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7650/head:pull/7650 PR: https://git.openjdk.java.net/jdk/pull/7650 From coleenp at openjdk.java.net Wed Mar 2 14:01:02 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 2 Mar 2022 14:01:02 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 13:55:42 GMT, Coleen Phillimore wrote: >> In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. >> This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make TieredStopAtLevel=0 also ignore SegmentedCodeHeap and not create a nmethod heaps, and fix tests accordingly. I pushed a change to make TieredStopAtLevel=0 also disable SegmentedCodeCache and fixed the tests accordingly. I am rerunning tiers 1-4, then 5-6.
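The flag handling agreed on in this thread can be modelled in a few lines. This is a minimal standalone sketch, not the HotSpot code: `Flag`, `VmConfig`, and `apply_interpreter_only_ergonomics` are hypothetical stand-ins for the real flag machinery (`FLAG_IS_DEFAULT`, `FLAG_SET_CMDLINE`, `warning`), and it assumes, as stated in the review, that interpreter-only mode covers both `-Xint` and `TieredStopAtLevel=0`.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a VM flag: its value and whether the user left it at the default.
struct Flag {
  bool value;
  bool is_default;
};

// Hypothetical stand-in for the parts of the VM configuration discussed in the thread.
struct VmConfig {
  bool xint;                          // -Xint given
  int tiered_stop_at_level;           // -XX:TieredStopAtLevel=<n>
  Flag segmented_code_cache;          // -XX:[+-]SegmentedCodeCache
  std::vector<std::string> warnings;  // collected instead of printed
};

// Models the check described in the review: -Xint or TieredStopAtLevel=0.
bool is_interpreter_only(const VmConfig& vm) {
  return vm.xint || vm.tiered_stop_at_level == 0;
}

// Follows the pattern quoted in the review: warn only if the user asked for
// segmentation explicitly, then switch it off.
void apply_interpreter_only_ergonomics(VmConfig& vm) {
  if (!is_interpreter_only(vm)) {
    return;
  }
  if (vm.segmented_code_cache.value) {
    if (!vm.segmented_code_cache.is_default) {
      vm.warnings.push_back("SegmentedCodeCache has no meaningful effect with -Xint");
    }
    vm.segmented_code_cache.value = false;
  }
}
```

Testing `value` before `is_default` means an explicit `-XX:-SegmentedCodeCache` combined with `-Xint` produces no warning, which is the case raised earlier in the review.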
------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From redestad at openjdk.java.net Wed Mar 2 14:12:31 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 2 Mar 2022 14:12:31 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v7] In-Reply-To: References: Message-ID: > I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. > > Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 > > - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. > > - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. > > - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. 
Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 31 commits: - Narrow the bottom_type of CountPositivesNode (always results in a positive int value) - Merge master - Resolve merge conflict - Fix TestCountPositives to correctly allow 0 return when expected != len (for now) - aarch64: fix issue with short inputs divisible by wordSize - Switch aarch64 intrinsic to a variant of countPositives returning len or zero as a first step. - Revert micro changes, split out to #7516 - Merge branch 'master' of https://github.com/cl4es/jdk into count_positives - Merge branch 'master' into count_positives - Restore partial vector checks in AVX2 and SSE intrinsic variants - ... and 21 more: https://git.openjdk.java.net/jdk/compare/d4d12ad1...7789349a ------------- Changes: https://git.openjdk.java.net/jdk/pull/7231/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=06 Stats: 527 lines in 29 files changed: 309 ins; 49 del; 169 mod Patch: https://git.openjdk.java.net/jdk/pull/7231.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231 PR: https://git.openjdk.java.net/jdk/pull/7231 From redestad at openjdk.java.net Wed Mar 2 14:12:33 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 2 Mar 2022 14:12:33 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v6] In-Reply-To: References: Message-ID: On Wed, 23 Feb 2022 14:19:20 GMT, Claes Redestad wrote: >> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. 
The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. >> >> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 >> >> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. >> >> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. >> >> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). > > Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 29 commits: > > - Resolve merge conflict > - Fix TestCountPositives to correctly allow 0 return when expected != len (for now) > - aarch64: fix issue with short inputs divisible by wordSize > - Switch aarch64 intrinsic to a variant of countPositives returning len or zero as a first step. > - Revert micro changes, split out to #7516 > - Merge branch 'master' of https://github.com/cl4es/jdk into count_positives > - Merge branch 'master' into count_positives > - Restore partial vector checks in AVX2 and SSE intrinsic variants > - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral > - Simplify changes to encodeUTF8 > - ... and 19 more: https://git.openjdk.java.net/jdk/compare/5035bf5e...685795ce > Making the `bottom_type()` of `CountPositivesNode` more precise (`TypeInt::INT` -> `TypeInt::POS`) might help, then. Seems like something we want to do regardless. ------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From aph at openjdk.java.net Wed Mar 2 14:28:06 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 2 Mar 2022 14:28:06 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v6] In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 19:12:17 GMT, Claes Redestad wrote: > > @theRealAph , @a74nh or someone familiar with aarch64 code, please review aarch64 changes. > > Note that the aarch64 changes I've put in for now implements `countPositives` to return `0` if there's a negative value anywhere, otherwise `len`. This way we can remove the intrinsic scaffolding for `hasNegatives` once I integrate s390 and ppc Sure. And we don't have to check which of the 8 bytes in a word has its top bit set, so we don't have to do any more work, just return the count when we stopped searching. So there's no possible performance regression on AArch64 as far as I can see. 
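The point about not having to locate the offending byte can be illustrated outside the VM. The sketch below is plain C++, not the AArch64 intrinsic; the function names are hypothetical. It assumes the contract as I read it from this thread: the result must equal `len` when no byte is negative, and otherwise only needs to be a lower bound on the index of the first negative byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Exact variant: number of leading bytes whose top bit is clear.
std::size_t count_positives_exact(const std::int8_t* bytes, std::size_t len) {
  for (std::size_t i = 0; i < len; i++) {
    if (bytes[i] < 0) {
      return i;
    }
  }
  return len;
}

// Word-at-a-time variant: when an 8-byte word contains a negative byte, stop and
// return the count reached so far, without finding which byte it was.
// The result is a lower bound on the exact count, and equals len if no byte is negative.
std::size_t count_positives_lower_bound(const std::int8_t* bytes, std::size_t len) {
  const std::uint64_t top_bits = 0x8080808080808080ULL;
  std::size_t i = 0;
  for (; i + 8 <= len; i += 8) {
    std::uint64_t word;
    std::memcpy(&word, bytes + i, sizeof(word));  // safe unaligned load
    if ((word & top_bits) != 0) {
      return i;
    }
  }
  // Tail of fewer than 8 bytes: check byte by byte.
  for (; i < len; i++) {
    if (bytes[i] < 0) {
      return i;
    }
  }
  return len;
}

// hasNegatives can be derived from either variant.
bool has_negatives(const std::int8_t* bytes, std::size_t len) {
  return count_positives_lower_bound(bytes, len) != len;
}
```

A caller that uses the lower bound to copy an ASCII prefix simply falls back to its slow path a few bytes earlier than strictly necessary, so correctness does not depend on the exact count.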
------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From duke at openjdk.java.net Wed Mar 2 15:35:06 2022 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Wed, 2 Mar 2022 15:35:06 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v2] In-Reply-To: <5Gxh0VsPuENs_0XY0WfcWxDBmFxx3769_sm9HXwgCqI=.a08c1e02-866a-4127-925e-8044ce6444ee@github.com> References: <6yR77yO0CGw6ciJPa97cS0O3PCsWznBy9x0x6ILWLZc=.43ad49ab-4ad0-49d9-9098-da4fef38dabf@github.com> <5Gxh0VsPuENs_0XY0WfcWxDBmFxx3769_sm9HXwgCqI=.a08c1e02-866a-4127-925e-8044ce6444ee@github.com> Message-ID: On Tue, 1 Mar 2022 16:43:32 GMT, Boris Ulasevich wrote: >> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 393: >> >>> 391: assert(CodeCache::find_blob(entry.target()) != NULL, >>> 392: "destination of far call not found in code cache"); >>> 393: assert(CodeCache::is_non_nmethod(entry.target()), "must be a call to the code stub"); >> >> This restricts far calls to be calls of non-nmethod code. > > Yes. In fact the function is used for non-method code calls only. I put an assert here to be check this fact for future code updates. I don't understand why we should restrict uses of `far_call` to calls of non-nmethod code. Could you please explain this? >> src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp line 533: >> >>> 531: address stub = NULL; >>> 532: >>> 533: if (a.codecache_branch_needs_far_jump() >> >> I prefer it to be `a.target_needs_far_jump(dest)`. `codecache_branch` looks like code cache branches need far jumps. It is strange because the code cache is just a storage. It is the code generator has to use far jumps. > > With this patch I do not change trampoline calls. I change far_jump and far_call procedures only. 
> Instead of far_branches() function we have two functions: > - codecache_branch_needs_far_jump to find if we need a far jump for intra-codecache branches > - codestub_branch_needs_far_jump to find if we need a far branch for codecache-to-nonmethodEntrypoint branch > So in this place I leave codecache_branch_needs_far_jump as exact equivalent of former far_branches() call. I understand the changes. My comment is about names. `MacroAssembler` only needs to know if it needs a far jump. The details of why are not needed here. We ask `MacroAssembler`. `MacroAssembler` gets `CodeCache` configuration info and checks whether a far jump is needed. >> src/hotspot/share/code/codeCache.cpp line 898: >> >>> 896: } >>> 897: >>> 898: size_t CodeCache::max_distance_to_codestub() { >> >> `max_distance_to_non_nmethod_heap`? >> As this is public API, it sounds strange without the start point. >> If someone changes positions of the heap, would it work as expected? > >> max_distance_to_non_nmethod_heap? >> As this is public API, it sounds strange without the start point. > > Start point is any point in the CodeCache. Will the comment below help? > // maximum distance from any point in the CodeCache to any entry point in the non_nmethod CodeCache segment > This is really too many words for a self-explanatory function name. > >> If someone changes positions of the heap, would it work as expected? > > Sure Can we move the code to `codestub_branch_needs_far_jump`? It is the only place where the code is used. We might need to make either `get_code_heap` public or `MacroAssembler` a friend of `CodeCache`.
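To make the "maximum distance from any point in the CodeCache to the non_nmethod segment" idea concrete, here is a small standalone model. It is only a sketch: `Range`, `max_distance_to`, and `branch_needs_far_jump` are hypothetical names, the layouts in the usage note are invented, and the 128 MiB threshold is the approximate reach of an AArch64 `B`/`BL` instruction rather than a value taken from the patch.

```cpp
#include <algorithm>
#include <cstdint>

// Half-open address range [low, high). Hypothetical stand-in for a code heap.
struct Range {
  std::uint64_t low;
  std::uint64_t high;
};

constexpr std::uint64_t kMiB = 1024 * 1024;
// Approximate reach of a direct AArch64 branch (B/BL): +/-128 MiB.
constexpr std::uint64_t kBranchRange = 128 * kMiB;

// Largest distance between any point in code_cache and any point in target.
// Assumes target lies inside code_cache.
std::uint64_t max_distance_to(const Range& code_cache, const Range& target) {
  return std::max(target.high - code_cache.low, code_cache.high - target.low);
}

// A branch into target needs a far jump if some caller could be out of direct reach.
bool branch_needs_far_jump(const Range& code_cache, const Range& target) {
  return max_distance_to(code_cache, target) > kBranchRange;
}
```

With a 240 MiB code cache, an 8 MiB target segment at the very start gives a maximum distance of 240 MiB and so needs far jumps, while the same segment placed at 116 to 124 MiB gives 124 MiB and stays within direct branch range. Folding the distance computation into the one predicate that uses it, as suggested above, would keep this detail out of the public `CodeCache` interface.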
------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From kvn at openjdk.java.net Wed Mar 2 17:36:01 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 2 Mar 2022 17:36:01 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 13:55:42 GMT, Coleen Phillimore wrote: >> In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. >> This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make TieredStopAtLevel=0 also ignore SegmentedCodeHeap and not create a nmethod heaps, and fix tests accordingly. Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7650 From coleenp at openjdk.java.net Wed Mar 2 18:26:01 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 2 Mar 2022 18:26:01 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 13:55:42 GMT, Coleen Phillimore wrote: >> In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. >> This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. >> Tested with tier1-4. 
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make TieredStopAtLevel=0 also ignore SegmentedCodeHeap and not create a nmethod heaps, and fix tests accordingly. Thanks Vladimir. Tiers 1-4 passed. ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From iklam at openjdk.java.net Wed Mar 2 21:10:24 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 2 Mar 2022 21:10:24 GMT Subject: RFR: 8281181: Do not use CPU Shares to compute active processor count Message-ID: This is a simple change (Linux-only) that removes the consideration of Cgroups CPU Shares from the active processor count calculation. Of note, this fixes CPU underutilization when Java is executed by Kubernetes without CPU resource limits. Please see the CSR [JDK-8281571](https://bugs.openjdk.java.net/browse/JDK-8281571) for a detailed discussion of the reasons to make this change. To err on the side of caution, we added a temporary (and deprecated) VM flag `-XX:+UseContainerCpuShares` to enable the old behavior. We believe the old behavior is wrong and unnecessary. The plan is to remove the old behavior in JDK 20. The associated flag, `PreferContainerQuotaForCPUCount`, is also deprecated. Both flags will be obsoleted in JDK 20. Tested with tiers 1-4, as well as container tests in tier5.
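The effect of dropping CPU Shares can be shown with a simplified model of the calculation. This is a sketch under assumptions, not the HotSpot implementation: the struct, the function names, and the rule that 1024 shares correspond to one CPU are my reading of the old behavior, and the two boolean parameters merely mirror the `UseContainerCpuShares` and `PreferContainerQuotaForCPUCount` flags named above.

```cpp
#include <algorithm>

// Hypothetical snapshot of the cgroup CPU settings seen by the JVM.
struct ContainerCpuConfig {
  int host_cpus;  // CPUs visible to the process
  long quota;     // CFS quota in microseconds, <= 0 means no limit
  long period;    // CFS period in microseconds
  long shares;    // CPU shares, <= 0 means not set
};

// Integer division rounding up, for positive operands.
static int ceil_div(long a, long b) {
  return static_cast<int>((a + b - 1) / b);
}

// Simplified model: shares only contribute when use_container_cpu_shares is true
// (the old behavior); the quota always contributes when set.
int active_processor_count(const ContainerCpuConfig& c,
                           bool use_container_cpu_shares,
                           bool prefer_quota) {
  int quota_count = (c.quota > 0 && c.period > 0) ? ceil_div(c.quota, c.period) : 0;
  int share_count = (use_container_cpu_shares && c.shares > 0) ? ceil_div(c.shares, 1024) : 0;

  int limit = c.host_cpus;
  if (quota_count != 0 && share_count != 0) {
    limit = prefer_quota ? quota_count : std::min(quota_count, share_count);
  } else if (quota_count != 0) {
    limit = quota_count;
  } else if (share_count != 0) {
    limit = share_count;
  }
  return std::min(c.host_cpus, limit);
}
```

As an illustration with made-up numbers: on a 16-CPU host, a container with no quota and 102 shares (roughly what a small Kubernetes CPU request without a limit might translate to) would be sized at 1 processor under the old rule and at 16 once shares are ignored.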
------------- Commit messages: - 8281181: Do not use CPU Shares to compute active processor count Changes: https://git.openjdk.java.net/jdk/pull/7666/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7666&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8281181 Stats: 50 lines in 5 files changed: 32 ins; 0 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/7666.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7666/head:pull/7666 PR: https://git.openjdk.java.net/jdk/pull/7666 From dholmes at openjdk.java.net Wed Mar 2 22:38:01 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 2 Mar 2022 22:38:01 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 13:55:42 GMT, Coleen Phillimore wrote: >> In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. >> This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make TieredStopAtLevel=0 also ignore SegmentedCodeHeap and not create a nmethod heaps, and fix tests accordingly. src/hotspot/share/code/codeCache.cpp line 371: > 369: // No segmentation: use a single code heap > 370: return (code_blob_type == CodeBlobType::All); > 371: } else if (CompilerConfig::is_interpreter_only()) { Doesn't this introduce a very subtle change in behaviour in relation to the handling of `MethodNonProfiled`? @vnkozlov please confirm this is what you intended - I don't know the implications. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From lucy at openjdk.java.net Wed Mar 2 22:44:24 2022 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Wed, 2 Mar 2022 22:44:24 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v6] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 14:06:10 GMT, Claes Redestad wrote: >> Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 29 commits: >> >> - Resolve merge conflict >> - Fix TestCountPositives to correctly allow 0 return when expected != len (for now) >> - aarch64: fix issue with short inputs divisible by wordSize >> - Switch aarch64 intrinsic to a variant of countPositives returning len or zero as a first step. >> - Revert micro changes, split out to #7516 >> - Merge branch 'master' of https://github.com/cl4es/jdk into count_positives >> - Merge branch 'master' into count_positives >> - Restore partial vector checks in AVX2 and SSE intrinsic variants >> - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral >> - Simplify changes to encodeUTF8 >> - ... and 19 more: https://git.openjdk.java.net/jdk/compare/5035bf5e...685795ce > >> > > Making the `bottom_type()` of `CountPositivesNode` more precise (`TypeInt::INT` -> `TypeInt::POS`) might help, then. Seems like something we want to do regardless. @cl4es Looks like you forgot to remove the "@IntrinsicCandidate" annotation for has_negatives. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From dholmes at openjdk.java.net Wed Mar 2 22:48:06 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 2 Mar 2022 22:48:06 GMT Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 10:23:25 GMT, Andrew Haley wrote: >> Zero is a CPU-agnostic interpreter build, but our builds are inherently CPU-based, so Zero has to represent that "don't care" CPU because it has to replace some CPU specific code with Zero's C code. But then you have to build other CPU-specific parts of the JDK even when using Zero. Hence the approach of using CPU ifdefs combined with a check for Zero. >> Yes it is awkward and confusing and mistakes can creep in. But I don't think you will come up with anything better for this situation. And the AARCH64_PORT_ONLY is certainly not better IMO. > >> Zero is a CPU-agnostic interpreter build, but our builds are inherently CPU-based, so Zero has to represent that "don't care" CPU because it has to replace some CPU specific code with Zero's C code. But then you have to build other CPU-specific parts of the JDK even when using Zero. Hence the approach of using CPU ifdefs combined with a check for Zero. Yes it is awkward and confusing and mistakes can creep in. But I don't think you will come up with anything better for this situation. And the AARCH64_PORT_ONLY is certainly not better IMO. > > Can you explain why it's not better? I don't want to waste your time by prolonging discussions unnecessarily, but it seems to me that this nomenclature conveys exactly what we mean: not so much running on a particular CPU, but a particular port. > There's a matrix of possible combinations of instruction set architecture and port, with the ports ranging from highly-tuned and customized to completely portable, with space in between. 
@theRealAph the word "port" may have a very specific meaning to you such that this macro conveys what you expect, but it doesn't to me. You have to know what is not considered part of a "port" when building AArch64 - and that is simply when building Zero. So to me Zero should be part of any conditionalisation, e.g. (if macros work this way) `NOT_ZERO(AARCH64_ONLY(ret_pc = pauth_strip_pointer(ret_pc);))` or the ifdefs @shipilev suggested. ------------- PR: https://git.openjdk.java.net/jdk/pull/7633 From coleenp at openjdk.java.net Wed Mar 2 22:54:11 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 2 Mar 2022 22:54:11 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 22:34:24 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Make TieredStopAtLevel=0 also ignore SegmentedCodeHeap and not create a nmethod heaps, and fix tests accordingly. > > src/hotspot/share/code/codeCache.cpp line 371: > >> 369: // No segmentation: use a single code heap >> 370: return (code_blob_type == CodeBlobType::All); >> 371: } else if (CompilerConfig::is_interpreter_only()) { > > Doesn't this introduce a very subtle change in behaviour in relation to the handling of `MethodNonProfiled`? > @vnkozlov please confirm this is what you intended - I don't know the implications. According to my conversation with Rickard, TieredStopAtLevel=0 is the same as -Xint. If TieredStopAtLevel=0, the compiler won't allocate nmethods in the MethodProfiled and MethodNonProfiled areas anyway, so this won't allocate them now.
------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From sviswanathan at openjdk.java.net Wed Mar 2 23:27:08 2022 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Wed, 2 Mar 2022 23:27:08 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v9] In-Reply-To: References: <1K0c0y8K8bVNJEFMyTQSxwdgJlx9E2N8uhHC7O9sfyM=.c4ead8b5-abe0-42f4-ae10-aa24425eb75d@github.com> <8mhsd-DL1IccFiqrRigKdck8OJg79sjKgaYXrHc4zwY=.c92cb7f5-8e54-42ab-84f1-9cfa1ce76779@github.com> Message-ID: On Sat, 26 Feb 2022 03:38:32 GMT, Quan Anh Mai wrote: >> I believe the indefinite value should be 2^(w - 1) (a.k.a 0x80000000) and the documentation is typoed. If you look at `cvtss2si`, the indefinite value is also written as 2^w - 1 but yet in `MacroAssembler::convert_f2i` we compare it with 0x80000000. In addition, choosing -1 as an indefinite value is weird enough and to complicate it as 2^w - 1 is really unusual. > > `MacroAssembler::convert_f2i` > > https://github.com/openjdk/jdk/blob/c5c6058fd57d4b594012035eaf18a57257f4ad85/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L8919 @jatin-bhateja @merykitty You are right, on overflow we observe 2^(w - 1) i.e. 0x8000 0000 so using vector_float_signflip() is correct. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From sviswanathan at openjdk.java.net Wed Mar 2 23:32:05 2022 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Wed, 2 Mar 2022 23:32:05 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v11] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 02:44:41 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. >> - Test creation using new IR testing framework. 
>> >> Following are the performance number of a JMH micro included with the patch >> >> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) >> >> >> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio >> -- | -- | -- | -- | -- | -- | -- | -- >> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 >> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 >> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 >> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 >> >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > 8279508: Removing +LogCompilation flag. Marked as reviewed by sviswanathan (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From sviswanathan at openjdk.java.net Wed Mar 2 23:32:06 2022 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Wed, 2 Mar 2022 23:32:06 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v9] In-Reply-To: References: Message-ID: On Sat, 26 Feb 2022 04:55:08 GMT, Jatin Bhateja wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> 8279508: Adding descriptive comments. > > As per SDM, if post conversion a floating point number is non-representable in destination format e.g. a floating point value 3.4028235E10 post integer conversion will overflow the value range of integer primitive type, hence a -0.0 value or 0x80000000 is returned here. Similarly for +/- NaN and +/-Inf post conversion value returns is -0.0. All these cases i.e. 
post conversion non-representable floating point values and NaN/Inf values are handled in a special manner where algorithm first performs an unordered comparison between original source value and returns a 0 in case of NaN, this weeds out the NaN case and for rest of the special values we check the MSB bit of the source and either return an Integer.MAX_VALUE for +ve numbers or an Integer.MIN_VALUE to adhere to the semantics of Math.round API. > > Existing tests were enhanced to cover various special cases (NaN/Inf/+ve/-ve value/values which may be inexact after adding 0.5/ values which post conversion overflow integer value range). @jatin-bhateja The patch looks good to me. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From sviswanathan at openjdk.java.net Wed Mar 2 23:32:07 2022 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Wed, 2 Mar 2022 23:32:07 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v9] In-Reply-To: References: Message-ID: On Sat, 26 Feb 2022 01:07:47 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> 8279508: Adding descriptive comments. > > src/hotspot/cpu/x86/x86.ad line 7295: > >> 7293: __ vector_round_double_evex($dst$$XMMRegister, $src$$XMMRegister, $xtmp1$$XMMRegister, >> 7294: $xtmp2$$XMMRegister, $ktmp1$$KRegister, $ktmp2$$KRegister, >> 7295: ExternalAddress(vector_double_signflip()), new_mxcsr, $scratch$$Register, vlen_enc); > > The vector_double_signflip() here should be replaced by vector_all_bits_set(). > vcvtpd2qq description: > If a converted result cannot be represented in the destination > format, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value > (2^w-1, where w represents the number of bits in the destination format) is returned.
The overflow value observed is 2^(w-1), so using vector_double_signflip() is correct; please ignore this comment. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From kvn at openjdk.java.net Thu Mar 3 00:19:05 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 3 Mar 2022 00:19:05 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 22:50:22 GMT, Coleen Phillimore wrote: >> src/hotspot/share/code/codeCache.cpp line 371: >> >>> 369: // No segmentation: use a single code heap >>> 370: return (code_blob_type == CodeBlobType::All); >>> 371: } else if (CompilerConfig::is_interpreter_only()) { >> >> Doesn't this introduce a very subtle change in behaviour in relation to the handling of `MethodNonProfiled`? >> @vnkozlov please confirm this is what you intended - I don't know the implications. > > According to my conversation with Rickard, TieredStopAtLevel=0 is the same as -Xint. If TieredStopAtLevel=0, the compiler won't allocate nmethods in the MethodProfiled and MethodNonProfiled areas anyway, so this won't allocate them now. I think this code was missed during Igor V.'s changes which introduced `CompilerConfig::is_interpreter_only()`. Coleen is correct: with `TieredStopAtLevel=0` all JIT compilers are disabled. `MethodNonProfiled` is the code heap for tier1 (C1) and tier4 (C2) compiled nmethods. It does not make sense to have such a code heap when compilation is disabled.
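
For readers following the code heap discussion, here is a minimal standalone sketch of the kind of heap-selection check under review. It is not the HotSpot source: `BlobType`, `VmConfig` and `heap_available` are hypothetical stand-ins, and the return values of the interpreter-only and tiered branches are assumptions. The no-segmentation branch and the final non-tiered branch follow code quoted in this thread.

```cpp
#include <cassert>

// Hypothetical stand-ins for HotSpot's CodeBlobType values (not the real enum).
enum class BlobType { MethodNonProfiled, MethodProfiled, NonNMethod, All };

// Hypothetical snapshot of the relevant VM settings.
struct VmConfig {
  bool segmented_code_cache;  // models -XX:+SegmentedCodeCache
  bool interpreter_only;      // models -Xint or -XX:TieredStopAtLevel=0
  bool profiled_tiers;        // models tiered compilation with profiling tiers
};

// Returns true if a code heap of the given type would exist in this model.
inline bool heap_available(const VmConfig& cfg, BlobType type) {
  if (!cfg.segmented_code_cache) {
    // No segmentation: use a single code heap.
    return type == BlobType::All;
  } else if (cfg.interpreter_only) {
    // Assumption: no JIT compilers, so no nmethod heaps are needed.
    return type == BlobType::NonNMethod;
  } else if (cfg.profiled_tiers) {
    // Assumption: tiered compilation uses every segmented heap.
    return type != BlobType::All;
  } else {
    // Non-tiered: only the non-nmethod and non-profiled heaps.
    return type == BlobType::NonNMethod ||
           type == BlobType::MethodNonProfiled;
  }
}
```

In this model, `heap_available(VmConfig{true, true, false}, BlobType::MethodNonProfiled)` is false, which matches the point that a non-profiled heap serves no purpose once compilation is disabled.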
------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From dholmes at openjdk.java.net Thu Mar 3 01:59:05 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 3 Mar 2022 01:59:05 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: References: Message-ID: <8GAjDf1kxFdAgUNgadLvW-g40kT4vg9H_hPCpoUNVE4=.112d18c0-033c-407f-b4fd-a0edba3515a1@github.com> On Thu, 3 Mar 2022 00:15:24 GMT, Vladimir Kozlov wrote: >> According to my conversation with Rickard, TieredStopAtLevel=0 is the same as -Xint. If TieredStopAtLevel=0, the compiler won't allocate nmethods in the MethodProfiled and MethodNonProfiled areas anyway, so this won't allocate them now. > > I think this code was missed during Igor V.'s changes which introduced `CompilerConfig::is_interpreter_only()`. > Coleen is correct: with `TieredStopAtLevel=0` all JIT compilers are disabled. `MethodNonProfiled` is the code heap for tier1 (C1) and tier4 (C2) compiled nmethods. It does not make sense to have such a code heap when compilation is disabled. Okay, so does that mean this block:

    } else {
      // No TieredCompilation: we only need the non-nmethod and non-profiled code heap
      return (code_blob_type == CodeBlobType::NonNMethod) ||
             (code_blob_type == CodeBlobType::MethodNonProfiled);
    }

is actually unreachable? ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From dholmes at openjdk.java.net Thu Mar 3 02:35:59 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 3 Mar 2022 02:35:59 GMT Subject: RFR: 8281181: Do not use CPU Shares to compute active processor count In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 20:01:46 GMT, Ioi Lam wrote: > This is a simple change (Linux-only) that removes the consideration of Cgroups CPU Shares from the active process count calculation. Of note, this fixes CPU underutilization when Java is executed by Kubernetes without CPU resources limits.
> > Please see the CSR [JDK-8281571](https://bugs.openjdk.java.net/browse/JDK-8281571) for a detailed discussion of the reasons to make this change. > > To err on the side of caution, we added a temporary (and deprecated) VM flag `-XX:+UseContainerCpuShares` to enable the old behavior. We believe the old behavior is wrong and unnecessary. The plan is to remove the old behavior in JDK 20. > > The associated flag, `PreferContainerQuotaForCPUCount` is also deprecated. Both flags will be obsoleted in JDK 20. > > Testing with tiers 1-4, as well as container tests in tier5. Hi Ioi, Generally looks good. A couple of nits below. The deprecated flags should be added to the VMDeprecatedOptions test. Thanks, David src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 500: > 498: > 499: // It's not a good idea to use cpu_shares() to limit the number > 500: // of CPUs used by the JVM. See JDK-8281571. I suggest using JDK-8281181 rather than the CSR issue. test/hotspot/jtreg/containers/docker/TestCPUAwareness.java line 104: > 102: > 103: // OLD = use the deprecated -XX:+UseContainerCpuShares flag, which > 104: // will be removed in the next JDK release. See JDK-8281571. I suggest using JDK-8281181 rather than the CSR issue. ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7666 From kvn at openjdk.java.net Thu Mar 3 03:42:59 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 3 Mar 2022 03:42:59 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: <8GAjDf1kxFdAgUNgadLvW-g40kT4vg9H_hPCpoUNVE4=.112d18c0-033c-407f-b4fd-a0edba3515a1@github.com> References: <8GAjDf1kxFdAgUNgadLvW-g40kT4vg9H_hPCpoUNVE4=.112d18c0-033c-407f-b4fd-a0edba3515a1@github.com> Message-ID: On Thu, 3 Mar 2022 01:55:30 GMT, David Holmes wrote: >> I think this code was missed during Igor's V. changes which introduced `CompilerConfig::is_interpreter_only()`. 
>> Coleen is correct, with `TieredStopAtLevel=0` all JIT compilers are disabled. `MethodNonProfiled` is codeheap for tier1 (C1) and tier4 (C2) compiled nmethods. It does not make sense to have such codeheap when compilation disabled. > > Okay, so does that mean this block: > > } else { > // No TieredCompilation: we only need the non-nmethod and non-profiled code heap > return (code_blob_type == CodeBlobType::NonNMethod) || > (code_blob_type == CodeBlobType::MethodNonProfiled); > } > > is actually unreachable? It is reachable for `-XX:-TieredCompilation` when only C2 is used. Or when only tier1 C1 is used with `-XX:TieredStopAtLevel=1`. ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From dholmes at openjdk.java.net Thu Mar 3 03:55:06 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 3 Mar 2022 03:55:06 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: References: <8GAjDf1kxFdAgUNgadLvW-g40kT4vg9H_hPCpoUNVE4=.112d18c0-033c-407f-b4fd-a0edba3515a1@github.com> Message-ID: On Thu, 3 Mar 2022 03:40:03 GMT, Vladimir Kozlov wrote: >> Okay, so does that mean this block: >> >> } else { >> // No TieredCompilation: we only need the non-nmethod and non-profiled code heap >> return (code_blob_type == CodeBlobType::NonNMethod) || >> (code_blob_type == CodeBlobType::MethodNonProfiled); >> } >> >> is actually unreachable? > > It is reachable for `-XX:-TieredCompilation` when only C2 is used. Or when only tier1 C1 is used with `-XX:TieredStopAtLevel=1`. Ah I see. Thanks for clarifying @vnkozlov ! 
------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From dholmes at openjdk.java.net Thu Mar 3 03:55:05 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 3 Mar 2022 03:55:05 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 13:55:42 GMT, Coleen Phillimore wrote: >> In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. >> This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make TieredStopAtLevel=0 also ignore SegmentedCodeHeap and not create a nmethod heaps, and fix tests accordingly. Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From jbhateja at openjdk.java.net Thu Mar 3 05:46:09 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Thu, 3 Mar 2022 05:46:09 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v2] In-Reply-To: References: Message-ID: <3JoM4khNMz85gwfyxZeBNxJCZ_B7826cc-iO4pHtTJM=.5b21a96e-b2f5-4093-a763-eec2b6d77a2e@github.com> On Wed, 19 Jan 2022 22:09:26 GMT, Joe Darcy wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> 8279508: Adding a test for scalar intrinsification. > > The testing for this PR doesn't look adequate to me. I don't see any testing for the values where the behavior of round has been redefined at points in the last decade. See JDK-8010430 and JDK-6430675, both of which have regression tests in the core libs area. 
Thanks. Hi @jddarcy , can you kindly validate your feedback, it has been incorporated. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From iklam at openjdk.java.net Thu Mar 3 07:09:34 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 3 Mar 2022 07:09:34 GMT Subject: RFR: 8281181: Do not use CPU Shares to compute active processor count [v2] In-Reply-To: References: Message-ID: > This is a simple change (Linux-only) that removes the consideration of Cgroups CPU Shares from the active process count calculation. Of note, this fixes CPU underutilization when Java is executed by Kubernetes without CPU resources limits. > > Please see the CSR [JDK-8281571](https://bugs.openjdk.java.net/browse/JDK-8281571) for a detailed discussion of the reasons to make this change. > > To err on the side of caution, we added a temporary (and deprecated) VM flag `-XX:+UseContainerCpuShares` to enable the old behavior. We believe the old behavior is wrong and unnecessary. The plan is to remove the old behavior in JDK 20. > > The associated flag, `PreferContainerQuotaForCPUCount` is also deprecated. Both flags will be obsoleted in JDK 20. > > Testing with tiers 1-4, as well as container tests in tier5. 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @dholmes-ora comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7666/files - new: https://git.openjdk.java.net/jdk/pull/7666/files/9ffd57d0..1e0f289f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7666&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7666&range=00-01 Stats: 4 lines in 3 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7666.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7666/head:pull/7666 PR: https://git.openjdk.java.net/jdk/pull/7666 From iklam at openjdk.java.net Thu Mar 3 07:09:34 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 3 Mar 2022 07:09:34 GMT Subject: RFR: 8281181: Do not use CPU Shares to compute active processor count [v2] In-Reply-To: References: Message-ID: On Thu, 3 Mar 2022 02:33:00 GMT, David Holmes wrote: > Hi Ioi, > > Generally looks good. A couple of nits below. > > The deprecated flags should be added to the VMDeprecatedOptions test. > > Thanks, David Hi David, thanks for the review. I've updated the patch as you suggested. ------------- PR: https://git.openjdk.java.net/jdk/pull/7666 From sgehwolf at openjdk.java.net Thu Mar 3 09:30:07 2022 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Thu, 3 Mar 2022 09:30:07 GMT Subject: RFR: 8281181: Do not use CPU Shares to compute active processor count [v2] In-Reply-To: References: Message-ID: On Thu, 3 Mar 2022 07:09:34 GMT, Ioi Lam wrote: >> This is a simple change (Linux-only) that removes the consideration of Cgroups CPU Shares from the active process count calculation. Of note, this fixes CPU underutilization when Java is executed by Kubernetes without CPU resources limits. >> >> Please see the CSR [JDK-8281571](https://bugs.openjdk.java.net/browse/JDK-8281571) for a detailed discussion of the reasons to make this change. 
>> >> To err on the side of caution, we added a temporary (and deprecated) VM flag `-XX:+UseContainerCpuShares` to enable the old behavior. We believe the old behavior is wrong and unnecessary. The plan is to remove the old behavior in JDK 20. >> >> The associated flag, `PreferContainerQuotaForCPUCount` is also deprecated. Both flags will be obsoleted in JDK 20. >> >> Testing with tiers 1-4, as well as container tests in tier5. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @dholmes-ora comments LGTM ------------- Marked as reviewed by sgehwolf (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7666 From aph at openjdk.java.net Thu Mar 3 12:01:09 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 3 Mar 2022 12:01:09 GMT Subject: RFR: 8279143: Undefined behaviours in globalDefinitions.hpp [v4] In-Reply-To: <0UxP6wtOa77IpnfiL70N8Q4lwCX4KWI2GCXUS43NLYg=.aa1e3f25-d831-4364-b1a1-424b0629802c@github.com> References: <4Aa44MvoFLOsvLI2sNGwnosaW8wvlotamHdjw8FKwL4=.1d9f2ccb-454a-40b2-8232-4311d41c0831@github.com> <0UxP6wtOa77IpnfiL70N8Q4lwCX4KWI2GCXUS43NLYg=.aa1e3f25-d831-4364-b1a1-424b0629802c@github.com> Message-ID: On Wed, 5 Jan 2022 01:17:53 GMT, Kim Barrett wrote: >> Quan Anh Mai has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: >> >> - Merge branch 'master' into undefinedBehaviour >> - update copyright >> - typo >> - clean >> - Merge branch 'master' into undefinedBehaviour >> - Merge branch 'master' of github.com:MeryKitty/jdk into undefinedBehaviour >> - implementation limits >> - const reference >> - words not need to be initialized >> - undefined behaviour in globalDefinitions.hpp > > src/hotspot/share/utilities/globalDefinitions.hpp line 617: > >> 615: std::is_trivially_copy_assignable(), "implementation limits"); >> 616: T to; >> 617: memcpy(&to, &from, sizeof(T)); > > During the review of JDK-8145096 it was found that some compilers produce > wretched code for these kinds of memcpy uses, even at fairly high optimization > levels. (I don't know if we still care about those compilers. Unfortunately I > don't remember which ones they were, other than gcc/clang/VS all being good.) > > While using the so-called "union trick" is technically undefined behavior, it > is a technique that is known to be widely and well supported and produces good > code, at least for the cases where it is being used in HotSpot. In some cases, > such as gcc (and I think Visual Studio, though can't find a reference right > now), this behavior is documented. > > Rather than adding a partial bit_cast (or moving it from elsewhere), we should > be using our existing PrimitiveConversions::cast > (metaprogramming/primitiveConversions.hpp). That has the small difficulty of a > circular include dependency with globalDefintions.hpp. That can be fixed by > moving the various jfoo_cast functions elsewhere (either to > primitiveConversions.hpp or to a new file; I might prefer the latter (along > with the Translate specializations in primitiveConversions), > moving these relatively infrequently used utilities to their own dedicated > location). That also reduces the content of globalDefinitions.hpp, which IMO > is too much of a random dumping ground. 
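
To make the two conversion techniques in the quoted review comment concrete, here is a small sketch that puts a memcpy-based bit cast next to the "union trick". The names `bit_cast_memcpy`, `DoubleLongConv` (as laid out here) and `jlong_cast_union` are illustrative assumptions and not the HotSpot definitions; `std::int64_t` and `double` stand in for `jlong` and `jdouble`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// memcpy-based bit cast: well defined in ISO C++ for trivially copyable
// types of equal size. Illustrative helper, not the HotSpot implementation.
template <typename To, typename From>
inline To bit_cast_memcpy(const From& from) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<To>::value &&
                std::is_trivially_copyable<From>::value,
                "types must be trivially copyable");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

// "Union trick": reading a member other than the one last written is
// undefined behaviour in ISO C++, but some compilers (GCC, for one)
// document it as supported. Hypothetical layout for illustration only.
union DoubleLongConv {
  double       d;
  std::int64_t l;
};

inline std::int64_t jlong_cast_union(double x) {
  DoubleLongConv u;
  u.d = x;      // write one member
  return u.l;   // read the other
}
```

On compilers that support union punning, both functions return the same bit pattern for a given input, so the choice between them comes down to strict conformance versus the quality of the generated code.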
From replies to what must by now be a GCC FAQ, type punning through a union like this is explicitly permitted:

    inline jlong jlong_cast (jdouble x) {
      DoubleLongConv u;
      u.x = x;
      return u.l;
    }

But this isn't:

    inline jlong jlong_cast (jdouble x) {
      return ((DoubleLongConv*)&x)->l;
    }

Not that it matters to HotSpot, because we use `-fno-strict-aliasing`. ------------- PR: https://git.openjdk.java.net/jdk/pull/6930 From redestad at openjdk.java.net Thu Mar 3 12:04:47 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 3 Mar 2022 12:04:47 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v9] In-Reply-To: References: Message-ID: > I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. > > Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 > > - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either be contributed to this PR or as dependent follow-ups. > > - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented.
One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. > > - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: Document that it's allowed for implementations to return values less than the exact count (iff there are negative bytes) ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7231/files - new: https://git.openjdk.java.net/jdk/pull/7231/files/3207c098..85be36ae Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=07-08 Stats: 11 lines in 2 files changed: 5 ins; 1 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/7231.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231 PR: https://git.openjdk.java.net/jdk/pull/7231 From bulasevich at openjdk.java.net Thu Mar 3 12:12:02 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Thu, 3 Mar 2022 12:12:02 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v2] In-Reply-To: <6yR77yO0CGw6ciJPa97cS0O3PCsWznBy9x0x6ILWLZc=.43ad49ab-4ad0-49d9-9098-da4fef38dabf@github.com> References: <6yR77yO0CGw6ciJPa97cS0O3PCsWznBy9x0x6ILWLZc=.43ad49ab-4ad0-49d9-9098-da4fef38dabf@github.com> Message-ID: On Mon, 28 Feb 2022 18:37:11 GMT, Evgeny Astigeevich wrote: >> Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: >> >> - fix name: is_non_nmethod, adding target_needs_far_branch func >> - change codecache segments order: nonprofiled-nonmethod-profiled >> increase far jump threshold: sizeof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M > > src/hotspot/cpu/aarch64/icBuffer_aarch64.cpp line 55: > >> 53: Label l; >> 54: __ ldr(rscratch2, l); >> 55: __ far_jump(ExternalAddress(entry_point), NULL, rscratch1, true); > > This complicates `assemble_ic_buffer_code`. You need to know `far_jump` implementation, especially the generation of NOPs. I understand why we need those NOPs. > Do we have calls of non-nmethod code here? Yes, there are entry points from both non_method and method segments. > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4379: > >> 4377: postcond(pc() == badAddress); >> 4378: return NULL; >> 4379: } > > I believe replacing `trampoline_call` by `far_call` should be a separate PR. OK, I will remove this part of the change. ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From bulasevich at openjdk.java.net Thu Mar 3 12:12:02 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Thu, 3 Mar 2022 12:12:02 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v2] In-Reply-To: References: <6yR77yO0CGw6ciJPa97cS0O3PCsWznBy9x0x6ILWLZc=.43ad49ab-4ad0-49d9-9098-da4fef38dabf@github.com> <5Gxh0VsPuENs_0XY0WfcWxDBmFxx3769_sm9HXwgCqI=.a08c1e02-866a-4127-925e-8044ce6444ee@github.com> Message-ID: On Wed, 2 Mar 2022 15:14:57 GMT, Evgeny Astigeevich wrote: >> Yes. In fact the function is used for non-method code calls only. I put an assert here to check this fact for future code updates. > > I don't understand why we should restrict uses of `far_call` to calls of non-nmethod code. Could you please explain this? I wanted to avoid the untested code paths. You are right. Let me change it to work the same way as the far_jump implementation: if (target_needs_far_branch)..
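
As a rough illustration of the far-jump threshold mentioned in the commit message above, the sketch below models why placing the non-nmethod segment between the two method segments can keep stub branches within the 128M near-branch reach. Everything here is hypothetical: `Segment`, `max_distance_to`, `stub_branch_needs_far_jump` and the segment sizes in the usage note are illustrative assumptions, not HotSpot code or measured values, and the exact reach of a real branch instruction is approximated by a simple 128M bound.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr std::uint64_t MB = 1024 * 1024;
// Approximate reach of a near branch, per the 128M figure in this thread.
constexpr std::uint64_t kNearBranchRange = 128 * MB;

// Hypothetical description of one code cache segment.
struct Segment {
  std::uint64_t start;
  std::uint64_t size;
  std::uint64_t end() const { return start + size; }
};

// Largest distance between any address in [cache_start, cache_end) and any
// address inside `target`, assuming `target` lies within the cache.
inline std::uint64_t max_distance_to(const Segment& target,
                                     std::uint64_t cache_start,
                                     std::uint64_t cache_end) {
  std::uint64_t from_lowest  = target.end() - cache_start;  // lowest source, highest target
  std::uint64_t from_highest = cache_end - target.start;    // highest source, lowest target
  return std::max(from_lowest, from_highest);
}

// A branch into `target` needs the long (far) form if some source address
// in the cache could be out of near-branch range.
inline bool stub_branch_needs_far_jump(const Segment& target,
                                       std::uint64_t cache_start,
                                       std::uint64_t cache_end) {
  return max_distance_to(target, cache_start, cache_end) > kNearBranchRange;
}
```

With an assumed 240M cache and an 8M non-nmethod segment, placing that segment first (`Segment{0, 8 * MB}`) gives a worst-case distance of 240M, so far jumps are needed. Placing it in the middle (`Segment{116 * MB, 8 * MB}`) gives 124M, which stays within the near range.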
------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From redestad at openjdk.java.net Thu Mar 3 12:14:10 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 3 Mar 2022 12:14:10 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v6] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 14:06:10 GMT, Claes Redestad wrote: >> Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 29 commits: >> >> - Resolve merge conflict >> - Fix TestCountPositives to correctly allow 0 return when expected != len (for now) >> - aarch64: fix issue with short inputs divisible by wordSize >> - Switch aarch64 intrinsic to a variant of countPositives returning len or zero as a first step. >> - Revert micro changes, split out to #7516 >> - Merge branch 'master' of https://github.com/cl4es/jdk into count_positives >> - Merge branch 'master' into count_positives >> - Restore partial vector checks in AVX2 and SSE intrinsic variants >> - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral >> - Simplify changes to encodeUTF8 >> - ... and 19 more: https://git.openjdk.java.net/jdk/compare/5035bf5e...685795ce > >> > > Making the `bottom_type()` of `CountPositivesNode` more precise (`TypeInt::INT` -> `TypeInt::POS`) might help, then. Seems like something we want to do regardless. > @cl4es Looks like you forgot to remove the "@IntrinsicCandidate" annotation for has_negatives. I was going back and forth on this since the annotation might be used as a useful marker by other JVM implementors, so wasn't sure if removing the annotation would need to go through a deprecation cycle. I've removed it now; let's see if there are any objections to that. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From bulasevich at openjdk.java.net Thu Mar 3 12:19:59 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Thu, 3 Mar 2022 12:19:59 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v2] In-Reply-To: References: <6yR77yO0CGw6ciJPa97cS0O3PCsWznBy9x0x6ILWLZc=.43ad49ab-4ad0-49d9-9098-da4fef38dabf@github.com> <5Gxh0VsPuENs_0XY0WfcWxDBmFxx3769_sm9HXwgCqI=.a08c1e02-866a-4127-925e-8044ce6444ee@github.com> Message-ID: On Wed, 2 Mar 2022 15:28:29 GMT, Evgeny Astigeevich wrote: >> With this patch I do not change trampoline calls. I change far_jump and far_call procedures only. >> Instead of far_branches() function we have two functions: >> - codecache_branch_needs_far_jump to find if we need a far jump for intra-codecache branches >> - codestub_branch_needs_far_jump to find if we need a far branch for codecache-to-nonmethodEntrypoint branch >> So in this place I leave codecache_branch_needs_far_jump as exact equivalent of former far_branches() call. > > I understand the changes. My comment is about names. `MacroAssembler` only needs to know if it needs a far jump. Details "why" are not needed here. > We ask `MacroAssembler`. `MacroAssembler` gets `CodeCache` configuration info and checks whether a far jump is needed. Ok. For the sake of formal logic here, let me introduce trampoline_needs_far_jump() function: if (a.trampoline_needs_far_jump() && ..) { stub = a.emit_trampoline_stub(dest); } if (stub == NULL) { // If we generated no stub, patch this call directly to dest. set_destination(dest); >>> max_distance_to_non_nmethod_heap? >>> As this is public API, it sounds strange without the start point. >> >> Start point is any point in the CodeCache. Will the comment below help? >> // maximum distance from any point in the CodeCache to any entry point in the non_nmethod CodeCache segment >> This is really too many words for a self-explanatory function name. 
>> >>> If someone changes positions of the heap, would it work as expected? >> >> Sure > > Can we moved the code to `codestub_branch_needs_far_jump`? It is the only place where the code is used. > We might need to make either `get_code_heap` public or `MacroAssembler` a friend of `CodeCache`. Yes, it is the only place. Though I don't do this for encapsulation reasons: I think it's the internal details of the code cache if it is segmented and what the segment boundaries are. Right? ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From bulasevich at openjdk.java.net Thu Mar 3 12:31:39 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Thu, 3 Mar 2022 12:31:39 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v3] In-Reply-To: References: Message-ID: > Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH. > > In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. > > As a side effect, the performance of some tests is slightly improved: > ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` > > Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: review comments. remove far_call limit. undo trampoline-to-farcall. 
add trampoline_needs_far_jump func ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7517/files - new: https://git.openjdk.java.net/jdk/pull/7517/files/8f1a8c90..5f0fe37c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=01-02 Stats: 75 lines in 3 files changed: 54 ins; 14 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/7517.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7517/head:pull/7517 PR: https://git.openjdk.java.net/jdk/pull/7517 From aph at openjdk.java.net Thu Mar 3 12:34:07 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 3 Mar 2022 12:34:07 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v9] In-Reply-To: References: Message-ID: On Thu, 3 Mar 2022 12:04:47 GMT, Claes Redestad wrote: >> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. >> >> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 >> >> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. 
>> >> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. >> >> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). > > Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: > > Document that it's allowed for implementations to return values less than the exact count (iff there are negative bytes) > * There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). I'm not sure that we can disregard such cases. You're probably right, but it'd be interesting to know the actual cause of the problem, perhaps with perfasm. Or do you know already? 
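
For readers following this thread, the contract under discussion can be sketched in plain Java. This is only an illustrative scalar version, not the JDK's `StringCoding` source and not the intrinsic: the class name `CountPositivesSketch` is hypothetical, and the `(byte[], offset, length)` signatures are an assumption made for the example. It shows how a `hasNegatives`-style answer can be derived from a `countPositives`-style count.

```java
// Hypothetical sketch; names and signatures are assumptions, not JDK code.
final class CountPositivesSketch {

    // Number of leading non-negative bytes in ba[off, off + len).
    static int countPositives(byte[] ba, int off, int len) {
        int limit = off + len;
        for (int i = off; i < limit; i++) {
            if (ba[i] < 0) {
                return i - off; // stop at the first negative byte
            }
        }
        return len; // every byte in the range is non-negative
    }

    // A range contains a negative byte exactly when the count falls short of len.
    static boolean hasNegatives(byte[] ba, int off, int len) {
        return countPositives(ba, off, len) != len;
    }

    public static void main(String[] args) {
        byte[] ascii = {97, 98, 99};
        byte[] mixed = {97, 98, (byte) 0xE9, 100};
        System.out.println(countPositives(ascii, 0, ascii.length));
        System.out.println(countPositives(mixed, 0, mixed.length));
        System.out.println(hasNegatives(mixed, 0, mixed.length));
    }
}
```

The `!= len` comparison would also stay correct for an implementation that returns less than the exact count when negative bytes are present, which is the relaxation documented in the latest commit of the PR.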
------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From redestad at openjdk.java.net Thu Mar 3 12:49:06 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 3 Mar 2022 12:49:06 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v9] In-Reply-To: References: Message-ID: <-aKM_9WIEXh3vMlugSMg2Sprs5KMCZKh_3zoD5uaBvw=.7a096135-f215-4a52-8660-73dddef19c38@github.com> On Thu, 3 Mar 2022 12:30:29 GMT, Andrew Haley wrote: > > ``` > > * There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). > > ``` > > I'm not sure that we can disregard such cases. You're probably right, but it'd be interesting to know the actual cause of the problem, perhaps with perfasm. Or do you know already? Still looking into it in detail, but the shape of the code at the bytecode level is different, so generated code looks quite different, which means the logical branches can be laid out differently, or just placed in a different order. The 5% in this particular test could easily be due to getting a different order of the tests for determining if a byte is ascii or latin1. I'm also in the process of measuring in detail after narrowing down the `bottom_type()` of the intrinsic and a few other recent fixes that could affect code generation in C2. Will also benchmark on aarch64 and report back. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From dholmes at openjdk.java.net Thu Mar 3 12:54:03 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 3 Mar 2022 12:54:03 GMT Subject: RFR: 8281181: Do not use CPU Shares to compute active processor count [v2] In-Reply-To: References: Message-ID: On Thu, 3 Mar 2022 07:09:34 GMT, Ioi Lam wrote: >> This is a simple change (Linux-only) that removes the consideration of Cgroups CPU Shares from the active process count calculation. Of note, this fixes CPU underutilization when Java is executed by Kubernetes without CPU resources limits. >> >> Please see the CSR [JDK-8281571](https://bugs.openjdk.java.net/browse/JDK-8281571) for a detailed discussion of the reasons to make this change. >> >> To err on the side of caution, we added a temporary (and deprecated) VM flag `-XX:+UseContainerCpuShares` to enable the old behavior. We believe the old behavior is wrong and unnecessary. The plan is to remove the old behavior in JDK 20. >> >> The associated flag, `PreferContainerQuotaForCPUCount` is also deprecated. Both flags will be obsoleted in JDK 20. >> >> Testing with tiers 1-4, as well as container tests in tier5. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @dholmes-ora comments Updates look good. Please file an issue to obsolete these flags in JDK 20 (they will be expired en-masse in JDK 21). Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7666 From lucy at openjdk.java.net Thu Mar 3 13:25:03 2022 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Thu, 3 Mar 2022 13:25:03 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v9] In-Reply-To: <-aKM_9WIEXh3vMlugSMg2Sprs5KMCZKh_3zoD5uaBvw=.7a096135-f215-4a52-8660-73dddef19c38@github.com> References: <-aKM_9WIEXh3vMlugSMg2Sprs5KMCZKh_3zoD5uaBvw=.7a096135-f215-4a52-8660-73dddef19c38@github.com> Message-ID: On Thu, 3 Mar 2022 12:45:51 GMT, Claes Redestad wrote: >>> * There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). >> >> I'm not sure that we can disregard such cases. You're probably right, but it'd be interesting to know the actual cause of the problem, perhaps with perfasm. Or do you know already? > >> > ``` >> > * There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). >> > ``` >> >> I'm not sure that we can disregard such cases. You're probably right, but it'd be interesting to know the actual cause of the problem, perhaps with perfasm. Or do you know already? > > Still looking into it in detail, but the shape of the code at the bytecode level is different, so generated code looks quite different, which means the logical branches can be laid out differently, or just placed in a different order. 
The 5% in this particular test could easily be due to getting a different order of the tests for determining if a byte is ascii or latin1. > > I'm also in the process of measuring in detail after narrowing down the `bottom_type()` of the intrinsic and a few other recent fixes that could affect code generation in C2. Will also benchmark on aarch64 and report back. > > @cl4es Looks like you forgot to remove the "@IntrinsicCandidate" annotation for has_negatives. > > I was going back and forth ... Well, it just didn't build. With the annotation being present, you also need an intrinsic implementation. That's what the error message is saying... ------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From coleenp at openjdk.java.net Thu Mar 3 13:26:08 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 3 Mar 2022 13:26:08 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: References: <8GAjDf1kxFdAgUNgadLvW-g40kT4vg9H_hPCpoUNVE4=.112d18c0-033c-407f-b4fd-a0edba3515a1@github.com> Message-ID: On Thu, 3 Mar 2022 03:52:04 GMT, David Holmes wrote: >> It is reachable for `-XX:-TieredCompilation` when only C2 is used. Or when only tier1 C1 is used with `-XX:TieredStopAtLevel=1`. > > Ah I see. Thanks for clarifying @vnkozlov ! -XX:TieredStopAtLevel=1 is the old -client switch. Thanks Vladimir for confirming. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From coleenp at openjdk.java.net Thu Mar 3 13:26:08 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 3 Mar 2022 13:26:08 GMT Subject: RFR: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint [v3] In-Reply-To: References: Message-ID: <2AYDq3e7A_na-i_59oLgzw012Ex29KVEWS8bmx-VEKw=.bc7c817d-c40e-463c-8075-c95889d1a27c@github.com> On Wed, 2 Mar 2022 13:55:42 GMT, Coleen Phillimore wrote: >> In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. >> This change is @rickard 's loom change with the tests fixed so they pass for it. One tests for the new warning message and code cache usage, and the others remove the Xint tests. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make TieredStopAtLevel=0 also ignore SegmentedCodeHeap and not create a nmethod heaps, and fix tests accordingly. Thanks David. ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From coleenp at openjdk.java.net Thu Mar 3 13:26:09 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 3 Mar 2022 13:26:09 GMT Subject: Integrated: 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 20:50:23 GMT, Coleen Phillimore wrote: > In Loom, when using -Xint, the +SegmentedCodeCache option cannot be used because it doesn't generate a code heap for nmethods, and in loom the compiler needs to generate an nmethod for Continuation.enterSpecial even with -Xint. > This change is @rickard 's loom change with the tests fixed so they pass for it. 
One tests for the new warning message and code cache usage, and the others remove the Xint tests. > Tested with tier1-4. This pull request has now been integrated. Changeset: 7822cbce Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/7822cbce10e0c0c6f9bf521faebc89a0af20734e Stats: 34 lines in 5 files changed: 8 ins; 15 del; 11 mod 8276711: compiler/codecache/cli tests failing when SegmentedCodeCache used with -Xint Reviewed-by: kvn, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7650 From redestad at openjdk.java.net Thu Mar 3 13:35:02 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 3 Mar 2022 13:35:02 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v9] In-Reply-To: References: <-aKM_9WIEXh3vMlugSMg2Sprs5KMCZKh_3zoD5uaBvw=.7a096135-f215-4a52-8660-73dddef19c38@github.com> Message-ID: On Thu, 3 Mar 2022 13:21:36 GMT, Lutz Schmidt wrote: > Well, it just didn't build. With the annotation being present, you also need an intrinsic implementation. That's what the error message is saying... Doh, I had no idea the presence of `@IntrinsicCandidate` was mandating the VM has an intrinsic implementation these days (Why?! Not much of a _candidate_ if it's required..). I also seem to have built-and-tested locally without that patch properly applied. Sorry for the noise, should be fixed now. ------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From aph at openjdk.java.net Thu Mar 3 14:40:58 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 3 Mar 2022 14:40:58 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v9] In-Reply-To: References: <-aKM_9WIEXh3vMlugSMg2Sprs5KMCZKh_3zoD5uaBvw=.7a096135-f215-4a52-8660-73dddef19c38@github.com> Message-ID: On Thu, 3 Mar 2022 13:31:35 GMT, Claes Redestad wrote: > > Well, it just didn't build. With the annotation being present, you also need an intrinsic implementation. 
That's what the error message is saying... > > Doh, I had no idea the presence of `@IntrinsicCandidate` was mandating the VM has an intrinsic implementation these days (Why?! Not much of a _candidate_ if it's required..). I don't think the intrinsic has to be implemented on every target, but AFAICR it does have to be declared as an intrinsic in HotSpot. ------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From lucy at openjdk.java.net Thu Mar 3 14:47:02 2022 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Thu, 3 Mar 2022 14:47:02 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v9] In-Reply-To: References: <-aKM_9WIEXh3vMlugSMg2Sprs5KMCZKh_3zoD5uaBvw=.7a096135-f215-4a52-8660-73dddef19c38@github.com> Message-ID: On Thu, 3 Mar 2022 14:37:34 GMT, Andrew Haley wrote: > > > Well, it just didn't build. With the annotation being present, you also need an intrinsic implementation. That's what the error message is saying... > > > > > > Doh, I had no idea the presence of `@IntrinsicCandidate` was mandating the VM has an intrinsic implementation these days (Why?! Not much of a _candidate_ if it's required..). > > I don't think the intrinsic has to be implemented on every target, but AFAICR it does have to be declared as an intrinsic in HotSpot. Yes, sorry for the imprecise wording: the declaration must be there. Use of the intrinsic is then controlled via has_match_rule(opcode) and match_rule_supported(opcode). 
------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From redestad at openjdk.java.net Thu Mar 3 14:51:01 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 3 Mar 2022 14:51:01 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v9] In-Reply-To: References: <-aKM_9WIEXh3vMlugSMg2Sprs5KMCZKh_3zoD5uaBvw=.7a096135-f215-4a52-8660-73dddef19c38@github.com> Message-ID: On Thu, 3 Mar 2022 14:43:52 GMT, Lutz Schmidt wrote: > I don't think the intrinsic has to be implemented on every target, but AFAICR it does have to be declared as an intrinsic in HotSpot. Yeah, I got confused. To me it looks like a declaration of intent, and thought the only strict requirement was that HotSpot can't define an intrinsic if there's no annotation. It also seems odd to hardwire it to what HotSpot does - wasn't the purpose of renaming from `@HotSpotIntrinsicCandidate` to decouple things a bit better? Either way.. I'm re-running tests and benchmarks and seeing if I can improve the aarch64 impl to return a decent approximation rather than always-0 when there are negative bytes. ------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From amenkov at openjdk.java.net Thu Mar 3 15:14:38 2022 From: amenkov at openjdk.java.net (Alex Menkov) Date: Thu, 3 Mar 2022 15:14:38 GMT Subject: RFR: 8282241: Invalid generic signature for redefined classes Message-ID: JDK-8238048 (fixed in jdk15) moved major_version, minor_version, generic_signature_index and source_file_name_index from InstanceKlass to ConstantPool. We still have some incorrect code in CP merge during class redefinition. rewrite_cp_refs(scratch_class) updates generic_signature_index and source_file_name_index in the scratch_cp, so we need to copy the attributes (merge_cp->copy_fields(scratch_cp())) after rewrite_cp_refs. In redefine_single_class we don't need to copy source_file_name_index because it's a CP property and we swap CPs. 
So this copying actually sets the value from old class.

tested:
- test/jdk/java/lang/instrument
- test/hotspot/jtreg/serviceability/jvmti/RedefineClasses
- test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses
- test/hotspot/jtreg/vmTestbase/nsk/jvmti/RetransformClasses

-------------

Commit messages:
 - JDK-8282241

Changes: https://git.openjdk.java.net/jdk/pull/7676/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7676&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8282241
  Stats: 249 lines in 2 files changed: 239 ins; 7 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7676.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7676/head:pull/7676

PR: https://git.openjdk.java.net/jdk/pull/7676

From duke at openjdk.java.net Thu Mar 3 15:22:07 2022
From: duke at openjdk.java.net (Evgeny Astigeevich)
Date: Thu, 3 Mar 2022 15:22:07 GMT
Subject: RFR: 8280872: Reorder code cache segments to improve code density [v2]
In-Reply-To:
References: <6yR77yO0CGw6ciJPa97cS0O3PCsWznBy9x0x6ILWLZc=.43ad49ab-4ad0-49d9-9098-da4fef38dabf@github.com> <5Gxh0VsPuENs_0XY0WfcWxDBmFxx3769_sm9HXwgCqI=.a08c1e02-866a-4127-925e-8044ce6444ee@github.com>
Message-ID:

On Thu, 3 Mar 2022 12:16:21 GMT, Boris Ulasevich wrote:

>> Can we move the code to `codestub_branch_needs_far_jump`? It is the only place where the code is used.
>> We might need to make either `get_code_heap` public or `MacroAssembler` a friend of `CodeCache`.
>
> Yes, it is the only place. Though I don't do this for encapsulation reasons: I think it's the internal details of the code cache if it is segmented and what the segment boundaries are. Right?

Ok, let's keep it here. Could we change the name from `max_distance_to_codestub` to `max_distance_to_non_nmethod` to make clear the function covers all non_nmethod code?
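
The naming question is easier to follow with the underlying check in mind. Below is a rough Java model of it, offered as a sketch only: HotSpot itself is C++, and the class `BranchRangeSketch`, the helpers `maxDistanceToNonNmethod` and `needsFarJump`, the segment sizes, and the exact order of the two nmethod segments are all assumptions made for illustration. The only figure taken from the thread is the 128MB branch reach quoted in the PR description.

```java
// Hypothetical model; not HotSpot code. Addresses are byte offsets into an
// imaginary code cache, and all segment sizes are made up.
final class BranchRangeSketch {
    static final long MB = 1024L * 1024L;
    // Simplification: treat the 128MB figure from the PR description as the limit.
    static final long BRANCH_RANGE = 128 * MB;

    // Largest distance between any address in the code segment [codeLow, codeHigh)
    // and any address in the non-nmethod segment [stubLow, stubHigh).
    static long maxDistanceToNonNmethod(long codeLow, long codeHigh,
                                        long stubLow, long stubHigh) {
        return Math.max(stubHigh - codeLow, codeHigh - stubLow);
    }

    // A far jump is needed when some branch could exceed the direct reach.
    static boolean needsFarJump(long codeLow, long codeHigh,
                                long stubLow, long stubHigh) {
        return maxDistanceToNonNmethod(codeLow, codeHigh, stubLow, stubHigh) > BRANCH_RANGE;
    }

    public static void main(String[] args) {
        // Old order: [non-nmethod 0..8MB, non-profiled 8..108MB, profiled 108..208MB]
        System.out.println(needsFarJump(108 * MB, 208 * MB, 0, 8 * MB));
        // New order: [non-profiled 0..100MB, non-nmethod 100..108MB, profiled 108..208MB]
        System.out.println(needsFarJump(108 * MB, 208 * MB, 100 * MB, 108 * MB));
        System.out.println(needsFarJump(0, 100 * MB, 100 * MB, 108 * MB));
    }
}
```

With these made-up sizes, the profiled segment sits up to 208MB away from the non-nmethod segment in the old order, while in the new order both code segments stay within 108MB of it, which is the effect the PR description attributes to the reordering.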

-------------

PR: https://git.openjdk.java.net/jdk/pull/7517

From duke at openjdk.java.net Thu Mar 3 15:37:58 2022
From: duke at openjdk.java.net (Evgeny Astigeevich)
Date: Thu, 3 Mar 2022 15:37:58 GMT
Subject: RFR: 8280872: Reorder code cache segments to improve code density [v2]
In-Reply-To:
References: <6yR77yO0CGw6ciJPa97cS0O3PCsWznBy9x0x6ILWLZc=.43ad49ab-4ad0-49d9-9098-da4fef38dabf@github.com> <5Gxh0VsPuENs_0XY0WfcWxDBmFxx3769_sm9HXwgCqI=.a08c1e02-866a-4127-925e-8044ce6444ee@github.com>
Message-ID:

On Thu, 3 Mar 2022 12:14:51 GMT, Boris Ulasevich wrote:

>> I understand the changes. My comment is about names. `MacroAssembler` only needs to know if it needs a far jump. Details "why" are not needed here.
>> We ask `MacroAssembler`. `MacroAssembler` gets `CodeCache` configuration info and checks whether a far jump is needed.
>
> Ok. For the sake of formal logic here, let me introduce trampoline_needs_far_jump() function:
>
>     if (a.trampoline_needs_far_jump() && ..) {
>       stub = a.emit_trampoline_stub(dest);
>     }
>     if (stub == NULL) {
>       // If we generated no stub, patch this call directly to dest.
>       set_destination(dest);

According to the comments of `trampoline_jump`:

    // Generate a trampoline for a branch to dest. If there's no need for a
    // trampoline, simply patch the call directly to dest.

`trampoline_jump` is a jump or a far jump via a trampoline. It is not that a trampoline needs either a jump or a far jump.
What about:

    if (a.jump_needs_trampoline() && ..) {
      stub = a.emit_trampoline_stub(dest);
    }

Another variant:

    if (a.is_trampoline_needed() && ..) {
      stub = a.emit_trampoline_stub(dest);
    }

-------------

PR: https://git.openjdk.java.net/jdk/pull/7517

From coleenp at openjdk.java.net Thu Mar 3 23:09:59 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Thu, 3 Mar 2022 23:09:59 GMT
Subject: RFR: 8282241: Invalid generic signature for redefined classes
In-Reply-To:
References:
Message-ID:

On Thu, 3 Mar 2022 15:07:05 GMT, Alex Menkov wrote:

> JDK-8238048 (fixed in jdk15) moved major_version, minor_version, generic_signature_index and source_file_name_index from InstanceKlass to ConstantPool.
> We still have some incorrect code in CP merge during class redefinition.
>
> rewrite_cp_refs(scratch_class) updates generic_signature_index and source_file_name_index in the scratch_cp, so we need to copy the attributes (merge_cp->copy_fields(scratch_cp())) after rewrite_cp_refs.
>
> In redefine_single_class we don't need to copy source_file_name_index because it's a CP property and we swap CPs. So this copying actually sets the value from old class.
>
> tested:
> - test/jdk/java/lang/instrument
> - test/hotspot/jtreg/serviceability/jvmti/RedefineClasses
> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses
> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/RetransformClasses

Thank you for fixing this. It took me a while to figure this out again and it looks correct. Can you use the new redefine class test framework for the test instead?

test/jdk/java/lang/instrument/RetransformGenericSignatureTest.java line 1:

> 1: /*

Can you write this test in the framework where the newer RedefineClasses tests are, in test/hotspot/jtreg/serviceability/jvmti/RedefineClasses? You can just write the new class as a string that the inMemory compiler compiles for you. It's a lot simpler and doesn't use a shell script at all.

-------------

Changes requested by coleenp (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7676

From yyang at openjdk.java.net Fri Mar 4 02:51:07 2022
From: yyang at openjdk.java.net (Yi Yang)
Date: Fri, 4 Mar 2022 02:51:07 GMT
Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v6]
In-Reply-To:
References:
Message-ID:

On Fri, 11 Feb 2022 06:49:25 GMT, David Holmes wrote:

>> src/hotspot/share/oops/instanceKlass.cpp line 2081:
>>
>>> 2079: _st->print(INTPTR_FORMAT " ", p2i(k));
>>> 2080: // klass size
>>> 2081: _st->print("%-4d ", k->size());
>>
>> Should be `%4d` so that the numbers are aligned correctly.
>
> This issue seems still outstanding.

Current:

$./jcmd 83908 VM.classes|head -10
83908:
KlassAddr Size State Flags ClassName
0x0000000800df8400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8400
0x0000000800df8000 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8000
0x0000000800de4400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4400
0x0000000800de4000 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4000
0x0000000800dc8800 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800dc8800
0x0000000800dc8400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800dc8400
0x0000000800dc8000 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800dc8000
0x0000000800db9800 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800db9800

After using "%4d":

$./jcmd 75481 VM.classes|head
75481:
KlassAddr Size State Flags ClassName
0x0000000800df8400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8400
0x0000000800df8000 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8000
0x0000000800de4400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4400
0x0000000800de4000 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4000

So we do not need to change this.
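
Since the column spacing of the output above did not survive archiving, here is a small Java illustration of what the two conversions do. It is a sketch only: the class name `WidthFlagSketch` is hypothetical, and it uses Java's `String.format`, whose `-` flag and width pad a decimal integer the same way as the C-style format string in the quoted `print` call. The brackets are added only to make the padding visible.

```java
// Hypothetical demo class; shows padding for a width-4 decimal field.
final class WidthFlagSketch {

    // "%-4d": left-justified, padded with spaces on the right.
    static String leftAligned(int value) {
        return String.format("[%-4d]", value);
    }

    // "%4d": right-justified, padded with spaces on the left.
    static String rightAligned(int value) {
        return String.format("[%4d]", value);
    }

    public static void main(String[] args) {
        System.out.println(leftAligned(62));   // [62  ]
        System.out.println(rightAligned(62));  // [  62]
        System.out.println(leftAligned(1234)); // [1234]
    }
}
```

Both forms occupy four columns, so the columns that follow line up either way; only the position of the digits inside the field changes, which matters mostly when sizes with different digit counts appear in the same listing.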
------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Fri Mar 4 03:07:26 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 4 Mar 2022 03:07:26 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v7] In-Reply-To: References: Message-ID: <_lRe5-6b9-D-LXmA4sRXN1spqBJrsMpHxCCKwkD6hzA=.669540b8-b6d2-447c-b216-3281791d6b8f@github.com> > Add VM.classes to print details of all classes, output looks like: > > 1. jcmd VM.classes > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 > ... > > 2. jcmd VM.classes verbose > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - super: 'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841f210) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. 
interfaces: Array(0x00000008005af748) > - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 > - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder > - source file: 'LambdaForm$MH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - vtable length 5 (start addr: 0x0000000800c0b5b8) > - itable length 2 (start addr: 0x0000000800c0b5e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > - non-static oop maps: > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - super: 'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841ea68) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. 
interfaces: Array(0x00000008005af748) > - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 > - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder > - source file: 'LambdaForm$DMH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - vtable length 5 (start addr: 0x0000000800c0b1b8) > - itable length 2 (start addr: 0x0000000800c0b1e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > ... Yi Yang has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains eight commits: - Merge branch 'master' into jcmd_classes - typo - fix - fix test - -verbose and help doc - -verbose - review - 8275775 Add VM.classes to print details of all classes ------------- Changes: https://git.openjdk.java.net/jdk/pull/7105/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=06 Stats: 172 lines in 6 files changed: 171 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7105.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7105/head:pull/7105 PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Fri Mar 4 03:07:27 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 4 Mar 2022 03:07:27 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: References: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> Message-ID: On Wed, 19 Jan 2022 02:50:16 GMT, Chris Plummer wrote: >>> It seems it would be useful to support the verbose output with just a single class that is specified, although that would suggest that the dcmd name should then be something other than `VM.classes`. >> >> This is a good idea, but `jcmd VM.classes verbose=XX` looks strange, `jcmd VM.class XX` is also not much proper, because we desire to print all classes in default(`jcmd VM.class`). an alternative is to use `jcmd VM.classes verbose | grep XX` currently. > >> > It seems it would be useful to support the verbose output with just a single class that is specified, although that would suggest that the dcmd name should then be something other than `VM.classes`. >> >> This is a good idea, but `jcmd VM.classes verbose=XX` looks strange, `jcmd VM.class XX` is also not much proper, because we desire to print all classes in default(`jcmd VM.class`). an alternative is to use `jcmd VM.classes verbose | grep XX` currently. 
> > I was thinking the syntax would look like: `jcmd VM.classes [verbose [classname]]` > > Your grep solution doesn't work because each class has multiple lines of output. @plummercj Can you please help review this from the serviceability point of view? Thanks in advance! ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Fri Mar 4 03:07:29 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 4 Mar 2022 03:07:29 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v6] In-Reply-To: References: Message-ID: On Fri, 11 Feb 2022 06:53:23 GMT, David Holmes wrote: >> Yi Yang has updated the pull request incrementally with one additional commit since the last revision: >> >> fix > > src/hotspot/share/services/diagnosticCommand.cpp line 964: > >> 962: "Dump the detail content of Java class. " >> 963: "Some classes are annotated with flags: " >> 964: "F = has finializer method, " > > typo finializer - but should be finalize > > Is this actually only present for "non-trivial finalize" method? I'm not sure what's the meaning of "non-trivial finalize" method, can you elaborate more for it? (P.S. All comments are addressed) ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From dholmes at openjdk.java.net Fri Mar 4 03:35:01 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 4 Mar 2022 03:35:01 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v7] In-Reply-To: <_lRe5-6b9-D-LXmA4sRXN1spqBJrsMpHxCCKwkD6hzA=.669540b8-b6d2-447c-b216-3281791d6b8f@github.com> References: <_lRe5-6b9-D-LXmA4sRXN1spqBJrsMpHxCCKwkD6hzA=.669540b8-b6d2-447c-b216-3281791d6b8f@github.com> Message-ID: On Fri, 4 Mar 2022 03:07:26 GMT, Yi Yang wrote: >> Add VM.classes to print details of all classes, output looks like: >> >> 1. 
jcmd VM.classes >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 >> ... >> >> 2. jcmd VM.classes verbose >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841f210) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 >> - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$MH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - vtable length 5 (start addr: 0x0000000800c0b5b8) >> - itable length 2 (start addr: 0x0000000800c0b5e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> - non-static oop maps: >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841ea68) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 >> - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$DMH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - vtable length 5 (start addr: 0x0000000800c0b1b8) >> - itable length 2 (start addr: 0x0000000800c0b1e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> ... > > Yi Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: > > - Merge branch 'master' into jcmd_classes > - typo > - fix > - fix test > - -verbose and help doc > - -verbose > - review > - 8275775 Add VM.classes to print details of all classes Changes requested by dholmes (Reviewer). src/hotspot/share/services/diagnosticCommand.cpp line 964: > 962: "Dump the detailed content of a Java class. 
" > 963: "Some classes are annotated with flags: " > 964: "F = has finialize method, " This is still spelt incorrectly: finalize ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Fri Mar 4 03:40:36 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 4 Mar 2022 03:40:36 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v8] In-Reply-To: References: Message-ID: <6LZNPSXgScyoJ3Igvj4ZzdLMiezx4-6-LnMK-rIt3fE=.a4817835-9927-4b98-8218-dd2d0d788b5a@github.com> > Add VM.classes to print details of all classes, output looks like: > > 1. jcmd VM.classes > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 > ... > > 2. jcmd VM.classes verbose > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - super: 'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841f210) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. 
interfaces: Array(0x00000008005af748) > - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 > - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder > - source file: 'LambdaForm$MH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - vtable length 5 (start addr: 0x0000000800c0b5b8) > - itable length 2 (start addr: 0x0000000800c0b5e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > - non-static oop maps: > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - super: 'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841ea68) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. 
interfaces: Array(0x00000008005af748) > - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 > - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder > - source file: 'LambdaForm$DMH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - vtable length 5 (start addr: 0x0000000800c0b1b8) > - itable length 2 (start addr: 0x0000000800c0b1e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > ... 
Yi Yang has updated the pull request incrementally with one additional commit since the last revision: typo ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7105/files - new: https://git.openjdk.java.net/jdk/pull/7105/files/aab2c333..26b4d124 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7105.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7105/head:pull/7105 PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Fri Mar 4 03:40:40 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 4 Mar 2022 03:40:40 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v7] In-Reply-To: References: <_lRe5-6b9-D-LXmA4sRXN1spqBJrsMpHxCCKwkD6hzA=.669540b8-b6d2-447c-b216-3281791d6b8f@github.com> Message-ID: On Fri, 4 Mar 2022 03:29:51 GMT, David Holmes wrote: >> Yi Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: >> >> - Merge branch 'master' into jcmd_classes >> - typo >> - fix >> - fix test >> - -verbose and help doc >> - -verbose >> - review >> - 8275775 Add VM.classes to print details of all classes > > src/hotspot/share/services/diagnosticCommand.cpp line 964: > >> 962: "Dump the detailed content of a Java class. " >> 963: "Some classes are annotated with flags: " >> 964: "F = has finialize method, " > > This is still spelt incorrectly: finalize Sorry... changed. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From dholmes at openjdk.java.net Fri Mar 4 03:47:07 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 4 Mar 2022 03:47:07 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v6] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 03:03:10 GMT, Yi Yang wrote: >> src/hotspot/share/services/diagnosticCommand.cpp line 964: >> >>> 962: "Dump the detail content of Java class. " >>> 963: "Some classes are annotated with flags: " >>> 964: "F = has finializer method, " >> >> typo finializer - but should be finalize >> >> Is this actually only present for "non-trivial finalize" method? > > I'm not sure what's the meaning of "non-trivial finalize" method, can you elaborate more for it? > (P.S. All comments are addressed) I mean a finalize() method that actually does something. I checked the code and you will print F if the current class has a non-empty finalize() method, or it has a superclass with a non-empty finalize method. I would suggest updating the text to: `F = has, or inherits, a non-empty finalize method` Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Fri Mar 4 03:53:52 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 4 Mar 2022 03:53:52 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v9] In-Reply-To: References: Message-ID: > Add VM.classes to print details of all classes, output looks like: > > 1. jcmd VM.classes > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 > ... > > 2. 
jcmd VM.classes verbose > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - super: 'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841f210) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. interfaces: Array(0x00000008005af748) > - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 > - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder > - source file: 'LambdaForm$MH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - vtable length 5 (start addr: 0x0000000800c0b5b8) > - itable length 2 (start addr: 0x0000000800c0b5e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > - non-static oop maps: > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - super: 
'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841ea68) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. interfaces: Array(0x00000008005af748) > - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 > - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder > - source file: 'LambdaForm$DMH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - vtable length 5 (start addr: 0x0000000800c0b1b8) > - itable length 2 (start addr: 0x0000000800c0b1e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > ... 
Yi Yang has updated the pull request incrementally with one additional commit since the last revision: finalize desc change ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7105/files - new: https://git.openjdk.java.net/jdk/pull/7105/files/26b4d124..ba399fb5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=07-08 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7105.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7105/head:pull/7105 PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Fri Mar 4 03:53:53 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 4 Mar 2022 03:53:53 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v6] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 03:43:59 GMT, David Holmes wrote: >> I'm not sure what's the meaning of "non-trivial finalize" method, can you elaborate more for it? >> (P.S. All comments are addressed) > > I mean a finalize() method that actually does something. I checked the code and you will print F if the current class has a non-empty finalize() method, or it has a superclass with a non-empty finalize method. I would suggest updating the text to: > > `F = has, or inherits, a non-empty finalize method` > > Thanks, > David Done. This description is clearer than "non-trivial finalize". ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From dholmes at openjdk.java.net Fri Mar 4 05:09:59 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 4 Mar 2022 05:09:59 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v9] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 03:53:52 GMT, Yi Yang wrote: >> Add VM.classes to print details of all classes, output looks like: >> >> 1. 
jcmd VM.classes >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 >> ... >> >> 2. jcmd VM.classes verbose >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841f210) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 >> - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$MH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - vtable length 5 (start addr: 0x0000000800c0b5b8) >> - itable length 2 (start addr: 0x0000000800c0b5e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> - non-static oop maps: >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841ea68) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 >> - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$DMH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - vtable length 5 (start addr: 0x0000000800c0b1b8) >> - itable length 2 (start addr: 0x0000000800c0b1e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> ... > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > finalize desc change Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From iklam at openjdk.java.net Fri Mar 4 05:17:05 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 4 Mar 2022 05:17:05 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v6] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 02:47:28 GMT, Yi Yang wrote: >> This issue still seems outstanding. 
> > Current: > > $./jcmd 83908 VM.classes|head -10 > 83908: > KlassAddr Size State Flags ClassName > 0x0000000800df8400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8400 > 0x0000000800df8000 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8000 > 0x0000000800de4400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4400 > 0x0000000800de4000 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4000 > 0x0000000800dc8800 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800dc8800 > 0x0000000800dc8400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800dc8400 > 0x0000000800dc8000 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800dc8000 > 0x0000000800db9800 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800db9800 > > After using "%4d": > > $./jcmd 75481 VM.classes|head > 75481: > KlassAddr Size State Flags ClassName > 0x0000000800df8400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8400 > 0x0000000800df8000 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8000 > 0x0000000800de4400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4400 > 0x0000000800de4000 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4000 > > So we do not need to change this. You should change it to `%4d`. 
Otherwise, when the numbers are changed in the future (e.g., to 3 or 4 digits) they will be misaligned: KlassAddr Size State Flags ClassName 0x0000000800df8400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8400 0x0000000800df8000 123 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8000 0x0000000800de4400 4567 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4400 ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From darcy at openjdk.java.net Fri Mar 4 06:10:09 2022 From: darcy at openjdk.java.net (Joe Darcy) Date: Fri, 4 Mar 2022 06:10:09 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v9] In-Reply-To: References: Message-ID: On Tue, 1 Mar 2022 06:17:06 GMT, Joe Darcy wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> 8279508: Adding descriptive comments. > > test/jdk/java/lang/Math/RoundTests.java line 32: > >> 30: public static void main(String... args) { >> 31: int failures = 0; >> 32: for (int i = 0; i < 100000; i++) { > > Is there an idiom to trigger the auto-vectorization, perhaps using command line arguments, that doesn't bloat the running time of this test? IMO RoundTests should have an explicit @run tag without any VM options as well. Do the added VM options run on all platforms in question? What is the approximate running time of the test compared to before? ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From lzang at openjdk.java.net Fri Mar 4 07:15:05 2022 From: lzang at openjdk.java.net (Lin Zang) Date: Fri, 4 Mar 2022 07:15:05 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v9] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 03:53:52 GMT, Yi Yang wrote: >> Add VM.classes to print details of all classes, output looks like: >> >> 1. 
jcmd VM.classes >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 >> ... >> >> 2. jcmd VM.classes verbose >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841f210) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 >> - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$MH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - vtable length 5 (start addr: 0x0000000800c0b5b8) >> - itable length 2 (start addr: 0x0000000800c0b5e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> - non-static oop maps: >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841ea68) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 >> - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$DMH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - vtable length 5 (start addr: 0x0000000800c0b1b8) >> - itable length 2 (start addr: 0x0000000800c0b1e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> ... > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > finalize desc change Sorry that I just chime in. It seems this change adds new command options, so it seems that `csr` is required? ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Fri Mar 4 07:23:05 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 4 Mar 2022 07:23:05 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v9] In-Reply-To: References: Message-ID: <1RjDE6l30UfBicSZglJxMSECo3R9L1Rr5uScSlDAj00=.70bcfd73-da06-4883-915d-23873ea9a6de@github.com> On Fri, 4 Mar 2022 07:12:03 GMT, Lin Zang wrote: > Sorry that I just chime in. It seems this change adds new command options, so it seems that `csr` is required? 
Hi @linzang, according to [previous discussion](https://github.com/openjdk/jdk/pull/6075) and [comments in JBS](https://bugs.openjdk.java.net/browse/JDK-8275775), it's not necessary to create a csr for it. ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Fri Mar 4 07:43:00 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 4 Mar 2022 07:43:00 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v6] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 05:13:58 GMT, Ioi Lam wrote: > You should change it to `%4d`. Otherwise, when the numbers are changed in the future (e.g., to 3 or 4 digits) they will be misaligned: > > ``` > KlassAddr Size State Flags ClassName > 0x0000000800df8400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8400 > 0x0000000800df8000 123 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8000 > 0x0000000800de4400 4567 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4400 > ``` This format looks pretty good to me, they are all aligned to left. 
If you still think it's more proper to have a format like this: KlassAddr Size State Flags ClassName 0x0000000800df8400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8400 0x0000000800df8000 123 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8000 0x0000000800de4400 4567 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4400 Then I'm glad to do so ;) ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From lzang at openjdk.java.net Fri Mar 4 07:42:59 2022 From: lzang at openjdk.java.net (Lin Zang) Date: Fri, 4 Mar 2022 07:42:59 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v9] In-Reply-To: <1RjDE6l30UfBicSZglJxMSECo3R9L1Rr5uScSlDAj00=.70bcfd73-da06-4883-915d-23873ea9a6de@github.com> References: <1RjDE6l30UfBicSZglJxMSECo3R9L1Rr5uScSlDAj00=.70bcfd73-da06-4883-915d-23873ea9a6de@github.com> Message-ID: On Fri, 4 Mar 2022 07:20:14 GMT, Yi Yang wrote: > > Sorry that I just chime in. It seems this change adds new command options, so it seems that `csr` is required? > > Hi @linzang, according to [previous discussion](https://github.com/openjdk/jdk/pull/6075) and [comments in JBS](https://bugs.openjdk.java.net/browse/JDK-8275775), it's not necessary to create a csr for it. Ah, I missed that. Thanks for pointing it out! ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From iklam at openjdk.java.net Fri Mar 4 08:18:03 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 4 Mar 2022 08:18:03 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v6] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 07:24:51 GMT, Yi Yang wrote: >> You should change it to `%4d`. 
Otherwise, when the numbers are changed in the future (e.g., to 3 or 4 digits) they will be misaligned: >> >> >> KlassAddr Size State Flags ClassName >> 0x0000000800df8400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8400 >> 0x0000000800df8000 123 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8000 >> 0x0000000800de4400 4567 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4400 > >> You should change it to `%4d`. Otherwise, when the numbers are changed in the future (e.g., to 3 or 4 digits) they will be misaligned: >> >> ``` >> KlassAddr Size State Flags ClassName >> 0x0000000800df8400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8400 >> 0x0000000800df8000 123 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8000 >> 0x0000000800de4400 4567 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4400 >> ``` > > This format looks pretty good to me, they are all aligned to left. If you still think it's more proper to have a format like this: > > KlassAddr Size State Flags ClassName > 0x0000000800df8400 62 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8400 > 0x0000000800df8000 123 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800df8000 > 0x0000000800de4400 4567 fully_initialized W java.lang.invoke.LambdaForm$DMH/0x0000000800de4400 > > Then I'm glad to do so ;) Numbers should be aligned to the right. The following is what I want: 62 123 4567 ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Fri Mar 4 09:05:37 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 4 Mar 2022 09:05:37 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v6] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 08:14:39 GMT, Ioi Lam wrote: > Numbers should be aligned to the right. The following is what I want: > > ``` > 62 > 123 > 4567 > ``` Done. 
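For illustration only, the alignment point discussed above can be sketched in stand-alone C++. This is not the code in the patch: `size_right_aligned`, `size_left_aligned`, and `check_column_width` are hypothetical helper names, and plain `snprintf` is assumed here instead of whatever stream the diagnostic command actually prints to. The sketch only shows the difference between the `%4d` and `%-4d` conversions for the Size column.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helpers for illustration; these are not HotSpot functions.

// Right-aligns the size in a 4-character column ("%4d").
std::string size_right_aligned(int size) {
  char buf[16];
  std::snprintf(buf, sizeof(buf), "%4d", size);
  return std::string(buf);
}

// Left-aligns the size in a 4-character column ("%-4d").
std::string size_left_aligned(int size) {
  char buf[16];
  std::snprintf(buf, sizeof(buf), "%-4d", size);
  return std::string(buf);
}

// Both variants keep the column width fixed for values of up to four digits,
// so the columns that follow stay aligned; only the padding side differs.
void check_column_width() {
  const int sizes[] = {62, 123, 4567};
  for (int size : sizes) {
    assert(size_right_aligned(size).size() == 4);
    assert(size_left_aligned(size).size() == 4);
  }
}
```

With `%4d` the values 62, 123 and 4567 are padded on the left, so their last digits end in the same column, which is the right-aligned layout requested in the review. A value wider than four digits would still widen the column, since the field width is only a minimum.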
------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Fri Mar 4 09:05:36 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Fri, 4 Mar 2022 09:05:36 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v10] In-Reply-To: References: Message-ID: > Add VM.classes to print details of all classes, output looks like: > > 1. jcmd VM.classes > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 > ... > > 2. jcmd VM.classes verbose > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - super: 'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841f210) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. 
interfaces: Array(0x00000008005af748) > - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 > - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder > - source file: 'LambdaForm$MH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - vtable length 5 (start addr: 0x0000000800c0b5b8) > - itable length 2 (start addr: 0x0000000800c0b5e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > - non-static oop maps: > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - super: 'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841ea68) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. 
interfaces: Array(0x00000008005af748) > - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 > - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder > - source file: 'LambdaForm$DMH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - vtable length 5 (start addr: 0x0000000800c0b1b8) > - itable length 2 (start addr: 0x0000000800c0b1e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > ... Yi Yang has updated the pull request incrementally with one additional commit since the last revision: use %4d ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7105/files - new: https://git.openjdk.java.net/jdk/pull/7105/files/ba399fb5..a07d4bf8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=08-09 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7105.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7105/head:pull/7105 PR: https://git.openjdk.java.net/jdk/pull/7105 From kbarrett at openjdk.java.net Fri Mar 4 13:57:28 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 4 Mar 2022 13:57:28 GMT Subject: RFR: 8252577: HotSpot Style Guide should link to One-True-Brace-Style description Message-ID: Please review this change to provide a link to the Wikipedia description of One-True-Brace-Style. 
As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot Group Members, though comments from others are welcome. ------------- Commit messages: - update html - add OTBS link Changes: https://git.openjdk.java.net/jdk/pull/7692/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7692&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8252577 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7692.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7692/head:pull/7692 PR: https://git.openjdk.java.net/jdk/pull/7692 From kbarrett at openjdk.java.net Fri Mar 4 14:11:48 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 4 Mar 2022 14:11:48 GMT Subject: RFR: 8257589: HotSpot Style Guide should link to rfc7282 Message-ID: <_JGRK3zhaaLCIUkkCULNFeHCmm26owJP6zNQXJ1VG_c=.b77d24ea-2a75-4f06-af15-afa5f203e51f@github.com> Please review this change to the link for the definition of "rough consensus". The current link is to a Wikipedia article that references rfc7282. We should instead link directly to the RFC. This change was requested during the review of JDK-8247976, but not made at that time. (I'm not sure whether it was intentionally deferred or missed/forgotten.) As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot Group Members, though comments from others are welcome.
------------- Commit messages: - update html - update rough consensus link Changes: https://git.openjdk.java.net/jdk/pull/7693/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7693&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257589 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7693.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7693/head:pull/7693 PR: https://git.openjdk.java.net/jdk/pull/7693 From kbarrett at openjdk.java.net Fri Mar 4 14:18:27 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 4 Mar 2022 14:18:27 GMT Subject: RFR: 8272691: Fix HotSpot style guide terminology for "non-local variables" Message-ID: Please review this fix to incorrect terminology used in one place. The correct terminology (per C++14 3.6.2) is "non-local variables". As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot Group Members, though comments from others are welcome. ------------- Commit messages: - update html - fix non-local variable term Changes: https://git.openjdk.java.net/jdk/pull/7695/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7695&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8272691 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7695.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7695/head:pull/7695 PR: https://git.openjdk.java.net/jdk/pull/7695 From kbarrett at openjdk.java.net Fri Mar 4 15:12:37 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 4 Mar 2022 15:12:37 GMT Subject: RFR: 8263134: HotSpot Style Guide should disallow inheriting constructors Message-ID: <0ydC7yufiVJTFlvJU6SXM5Gq5vTGdo2FPCJ4XOXpF5U=.2f611e00-b3d4-4ccb-9658-6eaa1d6cae5d@github.com> Please review this change to explicitly disallow the use of inheriting constructors: (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2540.htm). 
The C++11/14 specification has a lot of problems. These were addressed in C++17 (and as a DR that affects C++11/14): (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0136r1.html). Use of inheriting constructors now runs the risk of encountering those bugs, inconsistent behavior between different compilers or compiler versions, and behavior changes for future support of C++17. This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. ------------- Commit messages: - update html - forbid inherited ctors Changes: https://git.openjdk.java.net/jdk/pull/7698/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7698&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263134 Stats: 29 lines in 2 files changed: 29 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7698.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7698/head:pull/7698 PR: https://git.openjdk.java.net/jdk/pull/7698 From stuefe at openjdk.java.net Fri Mar 4 16:54:06 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 4 Mar 2022 16:54:06 GMT Subject: RFR: 8252577: HotSpot Style Guide should link to One-True-Brace-Style description In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 13:51:03 GMT, Kim Barrett wrote: > Please review this change to provide a link to the Wikipedia description of > One-True-Brace-Style. > > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. LGTM. 
I did not know that this style had a name :) ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7692 From duke at openjdk.java.net Fri Mar 4 16:56:25 2022 From: duke at openjdk.java.net (Matteo Baccan) Date: Fri, 4 Mar 2022 16:56:25 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines Message-ID: Hi I have reviewed the code for removing double semicolons at the end of lines all the best matteo ------------- Commit messages: - Removed double semicolon at the end of lines Changes: https://git.openjdk.java.net/jdk/pull/7268/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7268&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282657 Stats: 93 lines in 82 files changed: 0 ins; 0 del; 93 mod Patch: https://git.openjdk.java.net/jdk/pull/7268.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7268/head:pull/7268 PR: https://git.openjdk.java.net/jdk/pull/7268 From duke at openjdk.java.net Fri Mar 4 16:56:25 2022 From: duke at openjdk.java.net (Matteo Baccan) Date: Fri, 4 Mar 2022 16:56:25 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo Hi I have pushed this PR about 1 month ago. Only 3 days ago OCA was accepted. Now: what is the next step? This is a cleanup PR that removes some double semicolons at the end of some lines inside the JDK code. 
ciao matteo ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From jwaters at openjdk.java.net Fri Mar 4 16:56:26 2022 From: jwaters at openjdk.java.net (Julian Waters) Date: Fri, 4 Mar 2022 16:56:26 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 25 Feb 2022 15:40:09 GMT, Matteo Baccan wrote: >> Hi >> >> I have reviewed the code for removing double semicolons at the end of lines >> >> all the best >> matteo > > Hi > > I have pushed this PR about 1 month ago. Only 3 days ago OCA was accepted. > Now: what is the next step? > > This is a cleanup PR that removes some double semicolons at the end of some lines inside the JDK code. > > ciao > matteo Hi @matteobaccan The next step would be to create a relevant issue on the tracker and set it to track this PR. Since you don't have the ability to create new issues on the JBS yet I'll help you create one, please rename your PR title to 8282657 and the system should take care of the rest and automatically mark your PR as ready for review after setting it to track the corresponding JBS entry. Cheers, Julian ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From lancea at openjdk.java.net Fri Mar 4 17:07:59 2022 From: lancea at openjdk.java.net (Lance Andersen) Date: Fri, 4 Mar 2022 17:07:59 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo The changes look OK. The copyright year probably should be updated as part of this PR ------------- Marked as reviewed by lancea (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7268 From amenkov at openjdk.java.net Fri Mar 4 17:12:51 2022 From: amenkov at openjdk.java.net (Alex Menkov) Date: Fri, 4 Mar 2022 17:12:51 GMT Subject: RFR: 8282241: Invalid generic signature for redefined classes [v2] In-Reply-To: References: Message-ID: > JDK-8238048 (fixed in jdk15) moved major_version, minor_version, generic_signature_index and source_file_name_index from InstanceKlass to ConstantPool. > We still have some incorrect code in CP merge during class redefinition. > > rewrite_cp_refs(scratch_class) updates generic_signature_index and source_file_name_index in the scratch_cp, so we need to copy the attributes (merge_cp->copy_fields(scratch_cp())) after rewrite_cp_refs. > > In redefine_single_class we don't need to copy source_file_name_index because it's a CP property and we swap CPs. So this copying actually sets the value from old class. > > tested: > - test/jdk/java/lang/instrument > - test/hotspot/jtreg/serviceability/jvmti/RedefineClasses > - test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses > - test/hotspot/jtreg/vmTestbase/nsk/jvmti/RetransformClasses Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: Reworked the test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7676/files - new: https://git.openjdk.java.net/jdk/pull/7676/files/8c6d55c5..c51b82d7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7676&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7676&range=00-01 Stats: 429 lines in 2 files changed: 196 ins; 233 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7676.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7676/head:pull/7676 PR: https://git.openjdk.java.net/jdk/pull/7676 From amenkov at openjdk.java.net Fri Mar 4 17:12:52 2022 From: amenkov at openjdk.java.net (Alex Menkov) Date: Fri, 4 Mar 2022 17:12:52 GMT Subject: RFR: 8282241: Invalid generic signature for 
redefined classes [v2] In-Reply-To: References: Message-ID: On Thu, 3 Mar 2022 22:51:18 GMT, Coleen Phillimore wrote: >> Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: >> >> Reworked the test > > test/jdk/java/lang/instrument/RetransformGenericSignatureTest.java line 1: > >> 1: /* > > Can you write this test in the framework where the newer RedefineClasses tests are in test/hotspot/jtreg/serviceability/jvmti/RedefineClasses ? You can just write the new class as a string that the inMemory compiler compiles for you. It's a lot simpler and doesn't use a shell script at all. Reworked the test: - used the in-memory compiler and asm to prepare the new version of the class; - used redefineClass instead of retransformClasses (ClassFileTransformer is not needed anymore); - used the RedefineClassHelper agent instead of the agent from the ATransformerManagementTestCase framework; - moved the test to test/hotspot/jtreg/serviceability/jvmti/RedefineClasses, as it actually tests JVMTI functionality.
------------- PR: https://git.openjdk.java.net/jdk/pull/7676 From duke at openjdk.java.net Fri Mar 4 17:14:05 2022 From: duke at openjdk.java.net (Matteo Baccan) Date: Fri, 4 Mar 2022 17:14:05 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo Hi Lance, I can make a second commit updating the copyright year. Tell me if this is necessary. ciao matteo ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From rriggs at openjdk.java.net Fri Mar 4 17:20:05 2022 From: rriggs at openjdk.java.net (Roger Riggs) Date: Fri, 4 Mar 2022 17:20:05 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo We usually request that these be broken up by area to attract the appropriate reviewers and avoid eye-strain. The client modules are usually separated out, as is hotspot. And corresponding tests. This kind of change is pretty low value for the code base and requires the involvement of quite a few reviewers. If you had asked ahead of time, I would have suggested finding something with more value. ------------- Changes requested by rriggs (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7268 From dcubed at openjdk.java.net Fri Mar 4 17:31:11 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 4 Mar 2022 17:31:11 GMT Subject: RFR: 8252577: HotSpot Style Guide should link to One-True-Brace-Style description In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 13:51:03 GMT, Kim Barrett wrote: > Please review this change to provide a link to the Wikipedia description of > One-True-Brace-Style. > > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. Thumbs up. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7692 From dcubed at openjdk.java.net Fri Mar 4 17:33:11 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 4 Mar 2022 17:33:11 GMT Subject: RFR: 8257589: HotSpot Style Guide should link to rfc7282 In-Reply-To: <_JGRK3zhaaLCIUkkCULNFeHCmm26owJP6zNQXJ1VG_c=.b77d24ea-2a75-4f06-af15-afa5f203e51f@github.com> References: <_JGRK3zhaaLCIUkkCULNFeHCmm26owJP6zNQXJ1VG_c=.b77d24ea-2a75-4f06-af15-afa5f203e51f@github.com> Message-ID: On Fri, 4 Mar 2022 14:04:10 GMT, Kim Barrett wrote: > Please review this change to the link for the definition of "rough consensus". > The current link is to a Wikipedia article that references rfc7282. We should > instead link directly to the RFC. This change was requested during the > review of JDK-8247976, but not made at that time. (I'm not sure whether it was > intentionally deferred or missed/forgotten.) > > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. Thumbs up. ------------- Marked as reviewed by dcubed (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7693 From dcubed at openjdk.java.net Fri Mar 4 17:34:06 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 4 Mar 2022 17:34:06 GMT Subject: RFR: 8272691: Fix HotSpot style guide terminology for "non-local variables" In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 14:13:24 GMT, Kim Barrett wrote: > Please review this fix to incorrect terminology used in one place. The > correct terminology (per C++14 3.6.2) is "non-local variables". > > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. Thumbs up. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7695 From ihse at openjdk.java.net Fri Mar 4 17:50:03 2022 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Fri, 4 Mar 2022 17:50:03 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 17:17:12 GMT, Roger Riggs wrote: >> Hi >> >> I have reviewed the code for removing double semicolons at the end of lines >> >> all the best >> matteo > > We usually request that these be broken up by area to attract the appropriate reviewers and avoid eye-strain. The client modules are usually separated out, as is hotspot. > And corresponding tests. > This kind of change is pretty low value for the code base and requires the involvement of quite a few reviewers. > If you had asked ahead of time, I would have suggested finding something with more value. @RogerRiggs OTOH, this change is really trivial. I have verified that all changes are replacing trailing ";;" in Java code. These are all clearly typos. I think it's nice that we try to strive for high quality, and while you are correct that this is maybe not the most pressing issue, I think it's nice to get a cleanup like this done. I'd argue that this is a trivial change.
If you are worried, let's get a couple of more reviewers. I can't see the need to split this up per area. ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From coleenp at openjdk.java.net Fri Mar 4 17:56:06 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 4 Mar 2022 17:56:06 GMT Subject: RFR: 8282241: Invalid generic signature for redefined classes [v2] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 17:12:51 GMT, Alex Menkov wrote: >> JDK-8238048 (fixed in jdk15) moved major_version, minor_version, generic_signature_index and source_file_name_index from InstanceKlass to ConstantPool. >> We still have some incorrect code in CP merge during class redefinition. >> >> rewrite_cp_refs(scratch_class) updates generic_signature_index and source_file_name_index in the scratch_cp, so we need to copy the attributes (merge_cp->copy_fields(scratch_cp())) after rewrite_cp_refs. >> >> In redefine_single_class we don't need to copy source_file_name_index because it's a CP property and we swap CPs. So this copying actually sets the value from old class. >> >> tested: >> - test/jdk/java/lang/instrument >> - test/hotspot/jtreg/serviceability/jvmti/RedefineClasses >> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses >> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/RetransformClasses > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > Reworked the test Thanks, the test is much more understandable now. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7676 From duke at openjdk.java.net Fri Mar 4 18:20:13 2022 From: duke at openjdk.java.net (Danila Malyutin) Date: Fri, 4 Mar 2022 18:20:13 GMT Subject: RFR: 8259316: [REDO] C1/C2 compiler support for blackholes [v5] In-Reply-To: References: Message-ID: On Wed, 5 May 2021 12:00:21 GMT, Aleksey Shipilev wrote: >> This reworks the compiler support for blackholes. 
The key difference against the last version (#1203) is that blackholes are only acceptable as empty static methods, which both simplifies the implementation and eliminates a few compatibility questions. >> >> JMH uses the `Blackhole::consume` methods to avoid dead-code elimination of the code that produces benchmark values. It now relies on producing opaque side-effects and breaking inlining. While it was proved useful for many years, it unfortunately comes with several major drawbacks. >> >> Instead of introducing public APIs or special-casing JMH methods in JVM, we can hook a new command to compiler control, and let JMH sign up its `Blackhole` methods for it with `-XX:CompileCommand=blackhole,org.openjdk.jmh.infra.Blackhole::consume`. See CSR and related discussion for alternatives and future plans. >> >> C1 code is platform-independent, and it handles blackhole via the intrinsics paths, lowering it to nothing. >> >> C2 makes the `Blackhole` the subclass of `MemBar`, and use the same `Matcher` path as `Op_MemCPUOrder`: it does not match to anything, but it survives until matching, and keeps arguments alive. Additionally, C1 and C2 hooks are now using the synthetic `_blackhole` intrinsic, similarly to existing `_compiledLambdaForm`. It avoids introducing new nodes in C1. It also seems to require the least fiddling with C2 internals. > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 27 commits: > > - Merge branch 'master' into JDK-8259316-blackholes-redo > - Remove @build in favor of @library > - Update copyrights where needed > - Add vm.flagless requirement to tests > - Stray whitespace > - Merge branch 'master' into JDK-8259316-blackholes-redo > - Merge branch 'master' into JDK-8259316-blackholes-redo > - Redo BlackholeIntrinsicTest to see if target blackhole methods were indeed intrinsified > - Rename BlackholeStaticTest to BlackholeIntrinsicTest > - BlackholeStaticTest should unlock blackholes > - ... and 17 more: https://git.openjdk.java.net/jdk/compare/65ce4d20...65e865aa src/hotspot/share/ci/ciMethod.cpp line 158: > 156: #endif > 157: > 158: CompilerOracle::tag_blackhole_if_possible(h_m); Is there a reason for why not updating `_intrinsic_id = h_m->intrinsic_id();` with the potentially updated id after calling this is fine? Wouldn't it still contain the old id set here (at least for the first instance): https://github.com/openjdk/jdk/pull/2024/files#diff-c08baec11bce860df907b1edd944889e3778519fdd7c7a5b9d00e10eb267667fR84 ? ------------- PR: https://git.openjdk.java.net/jdk/pull/2024 From kbarrett at openjdk.java.net Fri Mar 4 18:45:21 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 4 Mar 2022 18:45:21 GMT Subject: RFR: 8282668: HotSpot Style Guide should permit unrestricted unions Message-ID: Please review this change to permit the use of "unrestricted unions" (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf) in HotSpot code. This permits any non-reference type to be used as a union data member, as well as permitting static data members in named unions. There are various classes in HotSpot that might be able to take advantage of this new feature. An example is the aarch64-specific Address class. It presently contains a collection of data members. For any given instance, only some of these data members are initialized and used. The `_mode` member indicates which. 
So it's effectively a kind of discriminated union with the data unpacked and not overlapping, with `_mode` being the discriminant. A consequence of the current implementation is that some compilers may generate warnings under some circumstances because of uninitialized data members. (I ran into this problem with gcc when making an otherwise unrelated change to one of the member types.) This Address class could be made smaller (so cheaper to copy, which happens often as Address objects are frequently passed by value) and usage made clearer, by making it an actual union. But that isn't possible with the C++03 restrictions. Another example is the RelocationHolder class, which is effectively a union over the various concrete Relocation types, but implemented in a way that has some issues (JDK-8160404). Testing: I've tried some examples without running into any problems. This included some experiments with RelocationHolder for JDK-8160404. This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV.
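To make the proposal above concrete, here is a minimal sketch of a discriminated union that relies on the C++11 unrestricted-union rules. Everything in it is an assumption made for illustration: `Operand`, its members and its modes are hypothetical and are not the aarch64 Address class or RelocationHolder, and `std::string` is used only as a stand-in for a member type with a non-trivial constructor and destructor.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical illustration only; this is not the aarch64 Address class.
// std::string stands in for any member type with a non-trivial
// constructor and destructor (HotSpot itself would use its own types).
class Operand {
 public:
  enum class Mode { kOffset, kLabel };

  explicit Operand(long offset) : _mode(Mode::kOffset), _offset(offset) {}
  explicit Operand(std::string label)
      : _mode(Mode::kLabel), _label(std::move(label)) {}

  // Copying would need a switch on _mode; omitted to keep the sketch short.
  Operand(const Operand&) = delete;
  Operand& operator=(const Operand&) = delete;

  ~Operand() {
    // The union does not know which member is live, so the enclosing
    // class must destroy the active non-trivial member explicitly.
    if (_mode == Mode::kLabel) {
      _label.~Label();
    }
  }

  Mode mode() const { return _mode; }
  long offset() const {
    assert(_mode == Mode::kOffset);
    return _offset;
  }
  const std::string& label() const {
    assert(_mode == Mode::kLabel);
    return _label;
  }

 private:
  typedef std::string Label;

  Mode _mode;  // the discriminant
  union {      // members overlap; only the one selected by _mode is live
    long _offset;
    Label _label;  // not permitted as a union member under C++03 rules
  };
};
```

Because the members share storage, the object is roughly the size of its largest member plus the discriminant, and each constructor initializes exactly the member it uses. The cost is that the enclosing class has to manage the lifetime of the active member by hand.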
------------- Commit messages: - update html - unrestricted unions Changes: https://git.openjdk.java.net/jdk/pull/7704/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7704&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282668 Stats: 4 lines in 2 files changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7704.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7704/head:pull/7704 PR: https://git.openjdk.java.net/jdk/pull/7704 From jbhateja at openjdk.java.net Fri Mar 4 19:08:09 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 4 Mar 2022 19:08:09 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v9] In-Reply-To: References: Message-ID: <2jFjnftd7VluGsxgp8BK0vgHA68VrgGREj0fk7F6Dhk=.e40ddcaa-5a31-4115-976d-5f43e94b8ccf@github.com> On Fri, 4 Mar 2022 06:06:52 GMT, Joe Darcy wrote: >> test/jdk/java/lang/Math/RoundTests.java line 32: >> >>> 30: public static void main(String... args) { >>> 31: int failures = 0; >>> 32: for (int i = 0; i < 100000; i++) { >> >> Is there an idiom to trigger the auto-vectorization, perhaps using command line arguments, that doesn't bloat the running time of this test? > > IMO RoundTests should have an explicit @run tag without any VM options as well. > > Do the added VM options run on all platforms in question? What is the approximate time to run the test compared to before? Hi @jddarcy, The test has been modified along the same lines, using generic options which manipulate compilation thresholds and are agnostic to target platforms. * @run main/othervm -XX:Tier3CompileThreshold=100 -XX:CompileThresholdScaling=0.01 -XX:+TieredCompilation RoundTests Verified that the RoundTests::test* methods get compiled by C2. Test execution time with and without the change is almost the same, ~7.80 sec on a Skylake server.
Regards ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From prr at openjdk.java.net Fri Mar 4 19:10:06 2022 From: prr at openjdk.java.net (Phil Race) Date: Fri, 4 Mar 2022 19:10:06 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo Marked as reviewed by prr (Reviewer). Looks like there's only one client source code file touched Most of the client changes are only in jtreg tests - and this is very trivial, so I'm OK with them being included here. ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From rriggs at openjdk.java.net Fri Mar 4 19:33:06 2022 From: rriggs at openjdk.java.net (Roger Riggs) Date: Fri, 4 Mar 2022 19:33:06 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo Marked as reviewed by rriggs (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From iris at openjdk.java.net Fri Mar 4 19:43:03 2022 From: iris at openjdk.java.net (Iris Clark) Date: Fri, 4 Mar 2022 19:43:03 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo Nice tidy of the code. Is there anything that can be done to prevent re-introduction of this trivial problem? Perhaps a new Skara bot jcheck option similar to what is already in place for trailing whitespace? ------------- Marked as reviewed by iris (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7268 From wetmore at openjdk.java.net Fri Mar 4 19:56:03 2022 From: wetmore at openjdk.java.net (Bradford Wetmore) Date: Fri, 4 Mar 2022 19:56:03 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: <1X39o4ON1uvbSXAp_r66zAmSy6sWZFKaP7-M54vAqX0=.d6abe0d5-9dd2-409b-91df-255d838196cb@github.com> On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo LGTM also. Similar suggestion for updating copyrights. ------------- Marked as reviewed by wetmore (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7268 From iklam at openjdk.java.net Fri Mar 4 20:15:06 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 4 Mar 2022 20:15:06 GMT Subject: RFR: 8281181: Do not use CPU Shares to compute active processor count [v2] In-Reply-To: References: Message-ID: On Thu, 3 Mar 2022 12:50:30 GMT, David Holmes wrote: > Updates look good. > > Please file an issue to obsolete these flags in JDK 20 (they will be expired en-masse in JDK 21). > > Thanks, David I filed https://bugs.openjdk.java.net/browse/JDK-8282684 Thanks Thanks to @dholmes-ora and @jerboaa for the review. ------------- PR: https://git.openjdk.java.net/jdk/pull/7666 From iklam at openjdk.java.net Fri Mar 4 20:18:03 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 4 Mar 2022 20:18:03 GMT Subject: Integrated: 8281181: Do not use CPU Shares to compute active processor count In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 20:01:46 GMT, Ioi Lam wrote: > This is a simple change (Linux-only) that removes the consideration of Cgroups CPU Shares from the active process count calculation. Of note, this fixes CPU underutilization when Java is executed by Kubernetes without CPU resources limits. 
> > Please see the CSR [JDK-8281571](https://bugs.openjdk.java.net/browse/JDK-8281571) for a detailed discussion of the reasons to make this change. > > To err on the side of caution, we added a temporary (and deprecated) VM flag `-XX:+UseContainerCpuShares` to enable the old behavior. We believe the old behavior is wrong and unnecessary. The plan is to remove the old behavior in JDK 20. > > The associated flag, `PreferContainerQuotaForCPUCount` is also deprecated. Both flags will be obsoleted in JDK 20. > > Testing with tiers 1-4, as well as container tests in tier5. This pull request has now been integrated. Changeset: e07fd395 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/e07fd395bdc314867886a621ec76cf74a5f76b89 Stats: 52 lines in 6 files changed: 34 ins; 0 del; 18 mod 8281181: Do not use CPU Shares to compute active processor count Reviewed-by: dholmes, sgehwolf ------------- PR: https://git.openjdk.java.net/jdk/pull/7666 From darcy at openjdk.java.net Fri Mar 4 21:33:07 2022 From: darcy at openjdk.java.net (Joe Darcy) Date: Fri, 4 Mar 2022 21:33:07 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo Marked as reviewed by darcy (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From dholmes at openjdk.java.net Sat Mar 5 05:52:04 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Sat, 5 Mar 2022 05:52:04 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: <2NJw-71OOOvNs9519H__uYdXQnJm23L-Ez4jKoAuKrk=.c277d644-fd63-442e-99a1-6d3d66cb3405@github.com> On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo I eyeballed the diff file and all seems okay. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7268 From jwaters at openjdk.java.net Sat Mar 5 06:52:13 2022 From: jwaters at openjdk.java.net (Julian Waters) Date: Sat, 5 Mar 2022 06:52:13 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo Nice, good work matteo Should I change the JBS issue title to match the PR title, or is it preferred for the PR title to change? ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From duke at openjdk.java.net Sat Mar 5 07:22:03 2022 From: duke at openjdk.java.net (Jan Schlößin) Date: Sat, 5 Mar 2022 07:22:03 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo This PR changes a comment in javax/swing/RepaintManager.java. This commented out code should be removed altogether in another PR?
Because it's a System.out.println and because it's commented-out code. ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From iklam at openjdk.java.net Sat Mar 5 08:12:07 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 5 Mar 2022 08:12:07 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v10] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 09:05:36 GMT, Yi Yang wrote: >> Add VM.classes to print details of all classes, output looks like: >> >> 1. jcmd VM.classes >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 >> ... >> >> 2. jcmd VM.classes verbose >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841f210) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans.
interfaces: Array(0x00000008005af748) >> - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 >> - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$MH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - vtable length 5 (start addr: 0x0000000800c0b5b8) >> - itable length 2 (start addr: 0x0000000800c0b5e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> - non-static oop maps: >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841ea68) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 >> - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$DMH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - vtable length 5 (start addr: 0x0000000800c0b1b8) >> - itable length 2 (start addr: 0x0000000800c0b1e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> ... > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > use %4d LGTM ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7105 From dholmes at openjdk.java.net Sat Mar 5 09:12:00 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Sat, 5 Mar 2022 09:12:00 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v10] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 09:05:36 GMT, Yi Yang wrote: >> Add VM.classes to print details of all classes, output looks like: >> >> 1. jcmd VM.classes >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 >> ... >> >> 2. 
jcmd VM.classes verbose >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841f210) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. interfaces: Array(0x00000008005af748) >> - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 >> - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$MH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - vtable length 5 (start addr: 0x0000000800c0b5b8) >> - itable length 2 (start addr: 0x0000000800c0b5e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> - non-static oop maps: >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 
'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841ea68) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. interfaces: Array(0x00000008005af748) >> - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 >> - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$DMH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - vtable length 5 (start addr: 0x0000000800c0b1b8) >> - itable length 2 (start addr: 0x0000000800c0b1e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> ... > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > use %4d Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From stuefe at openjdk.java.net Sat Mar 5 15:46:59 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 5 Mar 2022 15:46:59 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v10] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 09:05:36 GMT, Yi Yang wrote: >> Add VM.classes to print details of all classes, output looks like: >> >> 1. 
jcmd VM.classes >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 >> ... >> >> 2. jcmd VM.classes verbose >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841f210) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 >> - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$MH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - vtable length 5 (start addr: 0x0000000800c0b5b8) >> - itable length 2 (start addr: 0x0000000800c0b5e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> - non-static oop maps: >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841ea68) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 >> - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$DMH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - vtable length 5 (start addr: 0x0000000800c0b1b8) >> - itable length 2 (start addr: 0x0000000800c0b1e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> ... > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > use %4d Looks good! ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7105 From aph at openjdk.java.net Sun Mar 6 09:35:08 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Sun, 6 Mar 2022 09:35:08 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v11] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 02:44:41 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. >> - Test creation using new IR testing framework. 
>> >> Following are the performance number of a JMH micro included with the patch >> >> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) >> >> >> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio >> -- | -- | -- | -- | -- | -- | -- | -- >> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 >> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 >> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 >> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 >> >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > 8279508: Removing +LogCompilation flag. src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4157: > 4155: ExternalAddress mxcsr_std(StubRoutines::x86::addr_mxcsr_std()); > 4156: ldmxcsr(new_mxcsr); > 4157: movl(scratch, 1056964608); What is 1056964608 ? ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From jbhateja at openjdk.java.net Sun Mar 6 11:47:01 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Sun, 6 Mar 2022 11:47:01 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v11] In-Reply-To: References: Message-ID: <78mhhL5dqkY5LQY2U2i_DF6MvgYBQiDIrieGcBOCGAA=.ec16edf8-24d2-4658-92ff-5d20f03e7620@github.com> On Sun, 6 Mar 2022 09:31:27 GMT, Andrew Haley wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> 8279508: Removing +LogCompilation flag. 
> > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4157: > >> 4155: ExternalAddress mxcsr_std(StubRoutines::x86::addr_mxcsr_std()); >> 4156: ldmxcsr(new_mxcsr); >> 4157: movl(scratch, 1056964608); > > What is 1056964608 ? Raw bits corresponding to floating point value 0.5f. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From aph at openjdk.java.net Sun Mar 6 13:52:00 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Sun, 6 Mar 2022 13:52:00 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v11] In-Reply-To: References: Message-ID: On Wed, 2 Mar 2022 02:44:41 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. >> - Test creation using new IR testing framework. >> >> Following are the performance number of a JMH micro included with the patch >> >> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) >> >> >> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio >> -- | -- | -- | -- | -- | -- | -- | -- >> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 >> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 >> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 >> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 >> >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > 8279508: Removing +LogCompilation flag. 
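Editorial aside on the constant discussed in this thread: 1056964608 is 0x3F000000, which is the IEEE 754 single-precision encoding of 0.5f (sign 0, biased exponent 126, zero mantissa). The following is a minimal standalone sketch for checking that, not HotSpot code; the names `float_bits` and `kHalfBits` are hypothetical, and the sketch assumes a platform where `float` is a 32-bit IEEE 754 type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical constant: the value passed to movl() in the quoted code,
// written in hex so the sign/exponent/mantissa fields are easier to read.
static const uint32_t kHalfBits = 0x3F000000u;
static_assert(kHalfBits == 1056964608u, "hex and decimal spellings agree");

// Hypothetical helper: copy the storage of a float into a 32-bit integer.
// memcpy is used instead of a pointer cast to avoid aliasing problems.
static uint32_t float_bits(float f) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return bits;
}
```

Calling `float_bits(0.5f)` and comparing the result with `kHalfBits` confirms the answer given in the thread; a hex literal or a named constant in the source would make the same point without a comment.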
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4157: > 4155: ExternalAddress mxcsr_std(StubRoutines::x86::addr_mxcsr_std()); > 4156: ldmxcsr(new_mxcsr); > 4157: movl(scratch, 1056964608); Suggestion: movl(scratch, 1056964608); // Raw bits corresponding to floating point value 0.5f. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From dholmes at openjdk.java.net Sun Mar 6 23:23:56 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Sun, 6 Mar 2022 23:23:56 GMT Subject: RFR: 8252577: HotSpot Style Guide should link to One-True-Brace-Style description In-Reply-To: References: Message-ID: <4WBkBHC9QQc6Yvoc2WBBdhma5D7TtOpGX8VmSe78B-A=.4abd2336-f21e-4b8b-ac1b-4de05fc50679@github.com> On Fri, 4 Mar 2022 13:51:03 GMT, Kim Barrett wrote: > Please review this change to provide a link to the Wikipedia description of > One-True-Brace-Style. > > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7692 From dholmes at openjdk.java.net Sun Mar 6 23:32:00 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Sun, 6 Mar 2022 23:32:00 GMT Subject: RFR: 8257589: HotSpot Style Guide should link to rfc7282 In-Reply-To: <_JGRK3zhaaLCIUkkCULNFeHCmm26owJP6zNQXJ1VG_c=.b77d24ea-2a75-4f06-af15-afa5f203e51f@github.com> References: <_JGRK3zhaaLCIUkkCULNFeHCmm26owJP6zNQXJ1VG_c=.b77d24ea-2a75-4f06-af15-afa5f203e51f@github.com> Message-ID: On Fri, 4 Mar 2022 14:04:10 GMT, Kim Barrett wrote: > Please review this change to the link for the definition of "rough consensus". > The current link is to a Wikipedia article that references rfc7282. We should > instead link directly the the RFC. This change was requested during the > review of JDK-8247976, but not made at that time. (I'm not sure whether it was > intentionally deferred or missed/forgotten.) 
> > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. Glad I don't have to figure out when consensus exists! :) ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7693 From dholmes at openjdk.java.net Mon Mar 7 00:33:00 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Mar 2022 00:33:00 GMT Subject: RFR: 8263134: HotSpot Style Guide should disallow inheriting constructors In-Reply-To: <0ydC7yufiVJTFlvJU6SXM5Gq5vTGdo2FPCJ4XOXpF5U=.2f611e00-b3d4-4ccb-9658-6eaa1d6cae5d@github.com> References: <0ydC7yufiVJTFlvJU6SXM5Gq5vTGdo2FPCJ4XOXpF5U=.2f611e00-b3d4-4ccb-9658-6eaa1d6cae5d@github.com> Message-ID: On Fri, 4 Mar 2022 15:04:47 GMT, Kim Barrett wrote: > Please review this change to explicitly disallow the use of inheriting > constructors: > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2540.htm). > > The C++11/14 specification has a lot of problems. These were addressed in > C++17 (and as a DR that affects C++11/14): > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0136r1.html). > > Use of inheriting constructors now runs the risk of encountering those bugs, > inconsistent behavior between different compilers or compiler versions, and > behavior changes for future support of C++17. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. 
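Editorial aside for readers unfamiliar with the feature named in the proposal quoted above: a minimal sketch of an inheriting constructor next to the explicit alternative. The classes `Base`, `InheritingDerived`, and `ExplicitDerived` are hypothetical and exist only for illustration; they are not taken from HotSpot.

```cpp
#include <cassert>

// Hypothetical base class with two constructors.
class Base {
  int _value;
public:
  explicit Base(int value) : _value(value) {}
  Base(int a, int b) : _value(a + b) {}
  int value() const { return _value; }
};

// The style the proposal would disallow: the using-declaration makes
// every Base constructor usable to construct the derived class.
class InheritingDerived : public Base {
public:
  using Base::Base;
};

// The alternative: each constructor is declared in the derived class
// and forwards to the matching base-class constructor.
class ExplicitDerived : public Base {
public:
  explicit ExplicitDerived(int value) : Base(value) {}
  ExplicitDerived(int a, int b) : Base(a, b) {}
};
```

Both derived classes can be constructed the same way in this simple case, for example `ExplicitDerived(1, 2)`. The explicit form costs a few extra lines but shows the full set of constructors at the point of declaration.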
This seems reasonable as it is no real hardship to explicitly declare all constructors. Personally I think it is clearer as well otherwise you have to go and lookup the base class to see what the constructor semantics are. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7698 From dholmes at openjdk.java.net Mon Mar 7 01:22:01 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Mar 2022 01:22:01 GMT Subject: RFR: 8272691: Fix HotSpot style guide terminology for "non-local variables" In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 14:13:24 GMT, Kim Barrett wrote: > Please review this fix to incorrect terminology used in one place. The > correct terminology (per C++14 3.6.2) is "non-local variables". > > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. Looks good. Thanks, ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7695 From yyang at openjdk.java.net Mon Mar 7 02:33:59 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 7 Mar 2022 02:33:59 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v10] In-Reply-To: References: Message-ID: <3cq6YS0jTqk5fJGV-71WOZNGgyMUSdia-_W-WWcF5a8=.e6f75582-2957-4154-972f-dad787fc8041@github.com> On Fri, 4 Mar 2022 09:05:36 GMT, Yi Yang wrote: >> Add VM.classes to print details of all classes, output looks like: >> >> 1. jcmd VM.classes >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 >> ... >> >> 2. 
jcmd VM.classes verbose >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841f210) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. interfaces: Array(0x00000008005af748) >> - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 >> - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$MH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - vtable length 5 (start addr: 0x0000000800c0b5b8) >> - itable length 2 (start addr: 0x0000000800c0b5e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> - non-static oop maps: >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 
'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841ea68) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. interfaces: Array(0x00000008005af748) >> - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 >> - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$DMH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - vtable length 5 (start addr: 0x0000000800c0b1b8) >> - itable length 2 (start addr: 0x0000000800c0b1e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> ... > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > use %4d Thank you for taking time to help review. 3 approval, I want to merge this tomorrow if no objections/comments. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From ecki at zusammenkunft.net Mon Mar 7 02:40:01 2022 From: ecki at zusammenkunft.net (Bernd Eckenfels) Date: Mon, 7 Mar 2022 02:40:01 +0000 Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v10] In-Reply-To: <3cq6YS0jTqk5fJGV-71WOZNGgyMUSdia-_W-WWcF5a8=.e6f75582-2957-4154-972f-dad787fc8041@github.com> References: <3cq6YS0jTqk5fJGV-71WOZNGgyMUSdia-_W-WWcF5a8=.e6f75582-2957-4154-972f-dad787fc8041@github.com> Message-ID: Hello, I would add an additional argument to allow substring filtering on the fully qualified class name (like com/example or UtilClass), since this can greatly reduce processing/printing time. But I guess that can be added as an additional feature later on (maybe only the "verbose" variant can conflict if this is not an option (-verbose?)). Regards, Bernd -- http://bernd.eckenfels.net ________________________________ From: serviceability-dev on behalf of Yi Yang Sent: Monday, March 7, 2022 3:33:59 AM To: hotspot-dev at openjdk.java.net ; serviceability-dev at openjdk.java.net Subject: Re: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v10] On Fri, 4 Mar 2022 09:05:36 GMT, Yi Yang wrote: >> Add VM.classes to print details of all classes, output looks like: >> >> 1. jcmd VM.classes >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 >> ... >> >> 2.
jcmd VM.classes verbose >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841f210) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. interfaces: Array(0x00000008005af748) >> - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 >> - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$MH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - vtable length 5 (start addr: 0x0000000800c0b5b8) >> - itable length 2 (start addr: 0x0000000800c0b5e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> - non-static oop maps: >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 
'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841ea68) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. interfaces: Array(0x00000008005af748) >> - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 >> - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$DMH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - vtable length 5 (start addr: 0x0000000800c0b1b8) >> - itable length 2 (start addr: 0x0000000800c0b1e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> ... > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > use %4d Thank you for taking the time to help review. With 3 approvals, I want to merge this tomorrow if there are no objections or comments. ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Mon Mar 7 02:58:59 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 7 Mar 2022 02:58:59 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v10] In-Reply-To: References: Message-ID: Hi Bernd, A class filter has been discussed before on the PR (please see the previous comments for more details). The conclusion was that a filter won't save much run time. We can leave filtering to external tools (grep, awk, etc.).
------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From dholmes at openjdk.java.net Mon Mar 7 04:01:58 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Mar 2022 04:01:58 GMT Subject: RFR: 8282668: HotSpot Style Guide should permit unrestricted unions In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 18:39:33 GMT, Kim Barrett wrote: > Please review this change to permit the use of "unrestricted unions" > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf) in HotSpot > code. > > This permits any non-reference type to be used as a union data member, as well > as permitting static data members in named unions. There are various classes > in HotSpot that might be able to take advantage of this new feature. > > An example is the aarch64-specific Address class. It presently contains a > collection of data members. For any given instance, only some of these data > members are initialized and used. The `_mode` member indicates which. So it's > effectively a kind of discriminated union with the data unpacked and not > overlapping, with `_mode` being the discriminant. A consequence of the current > implementation is that some compilers may generate warnings under some > circumstances because of uninitialized data members. (I ran into this problem > with gcc when making an otherwise unrelated change to one of the member > types.) This Address class could be made smaller (so cheaper to copy, which > happens often as Address objects are frequently passed by value) and usage > made clearer, by making it an actual union. But that isn't possible with the > C++03 restrictions. > > Another example is the RelocationHolder class, which is effectively a union > over the various concrete Relocation types, but implemented in a way that > has some issues (JDK-8160404). > > Testing: > I've tried some examples without running into any problems. This included > some experiments with RelocationHolder for JDK-8160404.
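As a rough illustration of what the proposal would permit, here is a minimal sketch of a discriminated union that holds a member with a non-trivial constructor and destructor, which C++03 did not allow. `Operand`, its `Mode` values, and the member names are hypothetical and are not the aarch64 `Address` class; the sketch only shows the C++11 language feature, under the assumption that the owning class constructs and destroys the active member itself.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

// Hypothetical discriminated union; not HotSpot's Address class.
class Operand {
 public:
  enum class Mode { Immediate, Symbol };

  explicit Operand(long imm) : _mode(Mode::Immediate), _imm(imm) {}

  explicit Operand(std::string sym) : _mode(Mode::Symbol) {
    // A non-trivial union member must be constructed explicitly.
    new (&_sym) std::string(std::move(sym));
  }

  Operand(const Operand& other) : _mode(other._mode) {
    if (_mode == Mode::Symbol) {
      new (&_sym) std::string(other._sym);
    } else {
      _imm = other._imm;
    }
  }

  Operand& operator=(const Operand&) = delete;  // left out of the sketch

  ~Operand() {
    // ... and destroyed explicitly, as selected by the discriminant.
    if (_mode == Mode::Symbol) {
      _sym.~basic_string();
    }
  }

  Mode mode() const { return _mode; }

  long imm() const {
    assert(_mode == Mode::Immediate);
    return _imm;
  }

  const std::string& sym() const {
    assert(_mode == Mode::Symbol);
    return _sym;
  }

 private:
  Mode _mode;  // the discriminant
  union {      // anonymous union: the members share storage
    long _imm;
    std::string _sym;  // only legal with C++11 unrestricted unions
  };
};
```

Because only one member is live at a time, the object needs storage for the largest member plus the discriminant, rather than storage for every member side by side.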
> > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. This seems quite reasonable to allow. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7704 From dholmes at openjdk.java.net Mon Mar 7 06:57:22 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Mar 2022 06:57:22 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local Message-ID: Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. 
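For readers less familiar with the feature under discussion, the following is a small sketch of C++11 `thread_local` semantics: each thread gets its own instance of the variable, and an instance with a non-trivial destructor is cleaned up when its thread exits. `Scratch`, `bump`, `bump_on_new_thread`, and `live_scratch_objects` are hypothetical names chosen for illustration; none of this is HotSpot code.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Number of Scratch objects currently alive, across all threads.
static std::atomic<int> live_scratch_objects{0};

// Hypothetical per-thread state with a non-trivial constructor and destructor.
struct Scratch {
  int uses = 0;
  Scratch() { ++live_scratch_objects; }
  ~Scratch() { --live_scratch_objects; }  // runs when the owning thread exits
};

// Each thread lazily gets its own instance on first use.
static Scratch& current_scratch() {
  thread_local Scratch scratch;
  return scratch;
}

// Increments the calling thread's counter and returns the new value.
int bump() { return ++current_scratch().uses; }

// Calls bump() n times on a fresh thread and returns the last value seen there.
int bump_on_new_thread(int n) {
  assert(n >= 0);
  int last = 0;
  std::thread t([&] {
    for (int i = 0; i < n; i++) {
      last = bump();
    }
  });
  t.join();  // the new thread's Scratch has been destroyed by the time join() returns
  return last;
}
```

The counter on the new thread starts from zero regardless of how often the calling thread has used its own instance, which is the property that makes `thread_local` attractive for the "essential, and considered" cases the proposal has in mind.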
------------- Commit messages: - 8282721: HotSpot Style Guide should allow considered use of C++ thread_local Changes: https://git.openjdk.java.net/jdk/pull/7720/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7720&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282721 Stats: 18 lines in 2 files changed: 5 ins; 0 del; 13 mod Patch: https://git.openjdk.java.net/jdk/pull/7720.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7720/head:pull/7720 PR: https://git.openjdk.java.net/jdk/pull/7720 From dholmes at openjdk.java.net Mon Mar 7 07:03:21 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Mar 2022 07:03:21 GMT Subject: RFR: 8282469: Allow considered use of C++ thread_local in Hotspot Message-ID: This patch provides a means for using C++ `thread_local` when it is essential - see JBS for more details. There are three parts: 1. Add the new #define for `thread_local` 2. Remove `operator_new.cpp` as use of C++ `thread_local` with non-trivial cleanup actions requires use of global operators new/delete. These are still excluded for hotspot use via a link-time check. 3. Remove the prohibition on using `thread_local` from the hotspot style guide Due to the way hotspot style guide changes must be done, part 3 is being done under a sub-task in PR https://github.com/openjdk/jdk/pull/7720 and the two PRs will integrate at the same time.
Testing: - manual testing of the Panama usecase as referenced in the JBS issue - Tiers 1-3 Thanks, David ------------- Commit messages: - 8282469: Allow considered use of C++ thread_local in Hotspot Changes: https://git.openjdk.java.net/jdk/pull/7719/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7719&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282469 Stats: 104 lines in 2 files changed: 4 ins; 100 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7719.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7719/head:pull/7719 PR: https://git.openjdk.java.net/jdk/pull/7719 From jzhu at openjdk.java.net Mon Mar 7 07:19:14 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Mon, 7 Mar 2022 07:19:14 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding Message-ID: I came across a performance issue when using scatter store VectorAPI for Integer and Long simultaneously in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. For ScatterStore operation of LongVector.SPECIES_512/IntVector.SPECIES_512, VectorShape.S_256_BIT/S_512_BIT is the actual length of indexMap vector respectively. IntSpecies species(VectorShape s) returns the corresponding IntSpecies by Switch on Enum type "VectorShape". [1] With this change introduced, elements in the SwitchMap array (initialized in clinit) can be constant-folded so that determined IntSpecies can be acquired for a constant VectorShape. jtreg test passed without new failure. Please help review this change and let me know if any comments. 
[1] https://github.com/openjdk/jdk/blob/894ffb098c80bfeb4209038c017d01dbf53fac0f/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java#L4043 ------------- Commit messages: - 8282722: Regard mapping array in enum switches as stable for constant folding Changes: https://git.openjdk.java.net/jdk/pull/7721/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7721&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282722 Stats: 15 lines in 2 files changed: 15 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7721.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7721/head:pull/7721 PR: https://git.openjdk.java.net/jdk/pull/7721 From jiefu at openjdk.java.net Mon Mar 7 07:45:59 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 7 Mar 2022 07:45:59 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 07:13:20 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long simultaneously in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > > For ScatterStore operation of LongVector.SPECIES_512/IntVector.SPECIES_512, VectorShape.S_256_BIT/S_512_BIT is the actual length of indexMap vector respectively. > > IntSpecies species(VectorShape s) > > returns the corresponding IntSpecies by Switch on Enum type "VectorShape". [1] > > With this change introduced, elements in the SwitchMap array (initialized in clinit) can be constant-folded so that determined IntSpecies can be acquired for a constant VectorShape. > > jtreg test passed without new failure. > Please help review this change and let me know if any comments. 
> > [1] https://github.com/openjdk/jdk/blob/894ffb098c80bfeb4209038c017d01dbf53fac0f/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java#L4043 Is there a JMH micro-benchmark to show the perf improvement? Or a jtreg test to show the inlining effect after this patch? Copyright year in `fieldInfo.hpp` needs to be updated. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From jzhu at openjdk.java.net Mon Mar 7 09:19:58 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Mon, 7 Mar 2022 09:19:58 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 07:42:23 GMT, Jie Fu wrote: > Is there a JMH micro-benchmark to show the perf improvement? Or a jtreg test to show the inlining effect after this patch? > > Copyright year in `fieldInfo.hpp` needs to be updated. > > Thanks. Thanks for your comments. This change is an optimization that works for all enum switches. Please check the example at http://cr.openjdk.java.net/~jzhu/8282722/ You can check the generated code or IR graph of function "test2" for differences with/without this change. ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From jbhateja at openjdk.java.net Mon Mar 7 09:24:59 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Mon, 7 Mar 2022 09:24:59 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 09:16:45 GMT, Joshua Zhu wrote: >> Is there a JMH micro-benchmark to show the perf improvement? >> Or a jtreg test to show the inlining effect after this patch? >> >> Copyright year in `fieldInfo.hpp` needs to be updated. >> >> Thanks. > >> Is there a JMH micro-benchmark to show the perf improvement? Or a jtreg test to show the inlining effect after this patch? >> >> Copyright year in `fieldInfo.hpp` needs to be updated. >> >> Thanks. > > Thanks for your comments.
> This change is an optimization that works for all enum switches. > Please check the example at http://cr.openjdk.java.net/~jzhu/8282722/ > You can check the generated code or IR graph of function "test2" for differences with/without this change. Hi @JoshuaZhuwj, can we also augment the mentioned method [1] with a ForceInline attribute? Since SPECIES itself is static final, its constantness will propagate down to its fields and enable expression folding. But your fix looks like a generic one. ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From jzhu at openjdk.java.net Mon Mar 7 09:43:01 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Mon, 7 Mar 2022 09:43:01 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 09:16:45 GMT, Joshua Zhu wrote: >> Is there a JMH micro-benchmark to show the perf improvement? >> Or a jtreg test to show the inlining effect after this patch? >> >> Copyright year in `fieldInfo.hpp` needs to be updated. >> >> Thanks. > >> Is there a JMH micro-benchmark to show the perf improvement? Or a jtreg test to show the inlining effect after this patch? >> >> Copyright year in `fieldInfo.hpp` needs to be updated. >> >> Thanks. > > Thanks for your comments. > This change is an optimization that works for all enum switches. > Please check the example at http://cr.openjdk.java.net/~jzhu/8282722/ > You can check the generated code or IR graph of function "test2" for differences with/without this change. > Hi @JoshuaZhuwj, can we also augment the mentioned method [1] with a @ForceInline attribute? Since SPECIES itself is static final, its constantness will propagate down to its fields and enable expression folding. But your fix looks like a generic one. Jatin, this change is just like adding a @Stable annotation to the SwitchMap array (a translation table generated for enum switches in javac) to let C2's constant folding take effect.
I chose to implement this generic optimization so that class files in Java 8 or Java 11 could also benefit from this change. ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From jbhateja at openjdk.java.net Mon Mar 7 10:43:56 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Mon, 7 Mar 2022 10:43:56 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 09:39:52 GMT, Joshua Zhu wrote: >>> Is there a JMH micro-benchmark to show the perf improvement? Or a jtreg test to show the inlining effect after this patch? >>> >>> Copyright year in `fieldInfo.hpp` needs to be updated. >>> >>> Thanks. >> >> Thanks for your comments. >> This change is an optimization that works for all enum switches. >> Please check the example at http://cr.openjdk.java.net/~jzhu/8282722/ >> You can check the generated code or IR graph of function "test2" for differences with/without this change. > >> Hi @JoshuaZhuwj, can we also augment the mentioned method [1] with a @ForceInline attribute? Since SPECIES itself is static final, its constantness will propagate down to its fields and enable expression folding. But your fix looks like a generic one. > > Jatin, this change is just like adding a @Stable annotation to the SwitchMap array (a translation table generated for enum switches in javac) to let C2's constant folding take effect. I chose to implement this generic optimization so that class files in Java 8 or Java 11 could also benefit from this change. > > Hi @JoshuaZhuwj, can we also augment the mentioned method [1] with a @ForceInline attribute? Since SPECIES itself is static final, its constantness will propagate down to its fields and enable expression folding. But your fix looks like a generic one. > > Jatin, this change is just like adding a @Stable annotation to the SwitchMap array (a translation table generated for enum switches in javac) to let C2's constant folding take effect.
I chose to implement this generic optimization so that class files in Java 8 or Java 11 could also benefit from this change. Thanks @JoshuaZhuwj, a generic solution will address other cases too. ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From shade at openjdk.java.net Mon Mar 7 10:59:04 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 7 Mar 2022 10:59:04 GMT Subject: RFR: 8282224: Correct TIG::bang_stack_shadow_pages comments In-Reply-To: References: Message-ID: On Tue, 22 Feb 2022 06:54:59 GMT, Aleksey Shipilev wrote: > When reviewing the RISC-V port of the change, I noticed the comment in the x86 code is worded incorrectly:
>
>     // Record a new watermark, unless the update is above the safe limit.
>     __ cmpptr(rsp, Address(thread, JavaThread::shadow_zone_safe_limit()));
>     __ jccb(Assembler::belowEqual, L_done);
>
> Stacks grow downwards, so we are recording a new watermark *when* the update is above the safe limit.
Friendly reminder, anyone has any... ahem... comments? ------------- PR: https://git.openjdk.java.net/jdk/pull/7569 From jbhateja at openjdk.java.net Mon Mar 7 11:24:58 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Mon, 7 Mar 2022 11:24:58 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 07:13:20 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long simultaneously in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > > For ScatterStore operation of LongVector.SPECIES_512/IntVector.SPECIES_512, VectorShape.S_256_BIT/S_512_BIT is the actual length of indexMap vector respectively. > > IntSpecies species(VectorShape s) > > returns the corresponding IntSpecies by Switch on Enum type "VectorShape".
[1] > > With this change introduced, elements in the SwitchMap array (initialized in clinit) can be constant-folded so that determined IntSpecies can be acquired for a constant VectorShape. > > jtreg test passed without new failure. > Please help review this change and let me know if any comments. > > [1] https://github.com/openjdk/jdk/blob/894ffb098c80bfeb4209038c017d01dbf53fac0f/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java#L4043 The fix checks for an integer array field whose name starts with "$SwitchMap" and marks it as stable if it has the ACC_STATIC | ACC_FINAL flags. Wondering why that is not being honored during constant folding currently. https://github.com/openjdk/jdk/blob/master/src/hotspot/share/ci/ciField.hpp#L129
  // Is this field a constant?
  //
  // Clarification: A field is considered constant if:
  //   1. The field is both static and final
  //   2. The field is not one of the special static/final
  //      non-constant fields.  These are java.lang.System.in
  //      and java.lang.System.out.  Abomination.
  //
  // A field is also considered constant if
  // - it is marked @Stable and is non-null (or non-zero, if a primitive) or
  // - it is trusted or
  // - it is the target field of a CallSite object.
  //
  // See ciField::initialize_from() for more details.
  //
  // A user should also check the field value (constant_value().is_valid()), since
  // constant fields of non-initialized classes don't have values yet.
  bool is_constant() const { return _is_constant; }
------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From duke at openjdk.java.net Mon Mar 7 11:36:21 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 7 Mar 2022 11:36:21 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current Message-ID: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. ------------- Commit messages: - Fix include for threadWXSetters.inline.hpp - Remove thread parameter from ThreadWXEnable - Remove thread parameter from os methods - Remove wx_init and current thread assert in safefetch - Use os::current_thread_change_wx instead of thread methods Changes: https://git.openjdk.java.net/jdk/pull/7727/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7727&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282475 Stats: 150 lines in 31 files changed: 26 ins; 47 del; 77 mod Patch: https://git.openjdk.java.net/jdk/pull/7727.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7727/head:pull/7727 PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Mon Mar 7 11:49:05 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 7 Mar 2022 11:49:05 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v12] In-Reply-To: References:
<_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> Message-ID: <_rrSS4CsTkKoDWz5W_4jLioHjkQ_SQAID7NdqY3l6CI=.b2e2e4a8-2d11-490b-b0c4-86b72afe1b80@github.com> On Mon, 28 Feb 2022 16:28:27 GMT, Johannes Bechberger wrote: >> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method >> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix trailing whitespace The failing tests are related to https://bugs.openjdk.java.net/browse/JDK-8282475, fixed in https://github.com/openjdk/jdk/pull/7727 ------------- PR: https://git.openjdk.java.net/jdk/pull/7591 From jzhu at openjdk.java.net Mon Mar 7 12:02:06 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Mon, 7 Mar 2022 12:02:06 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: <7Wi-UdV2-wNKunrGXVkU0lCTCwszUdF_HqWOElr8cdk=.20418ea0-6825-44a8-b9f3-4820151c52a6@github.com> On Mon, 7 Mar 2022 11:20:52 GMT, Jatin Bhateja wrote: > The fix checks for an integer array field whose name starts with "$SwitchMap" and marks it as stable if it has the ACC_STATIC | ACC_FINAL flags. Wondering why that is not being honored during constant folding currently. "static final int[] $SwitchMap$xxx" does not mean the elements in the array are immutable. Hence it is necessary to treat the mapping arrays generated for enum switches as stable so that their elements can be constant-folded. Please do not miss the JVM_ACC_SYNTHETIC flag in this change.
"A field marked with the ACC_SYNTHETIC flag indicates that it was generated by a compiler and does not appear in source code." ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From dholmes at openjdk.java.net Mon Mar 7 12:19:06 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Mar 2022 12:19:06 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current In-Reply-To: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Mon, 7 Mar 2022 11:29:08 GMT, Johannes Bechberger wrote: > The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. > This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. Hi Johannes, The general idea seems good (pity it touches so many files, but then I've never liked any of this WX support precisely because it is so invasive of shared code). I agree that safeFetch should not have become dependent on Thread::current existing, but I have to wonder whether we can just skip the WX code if there is no current thread? If the thread is not attached to the VM then what does it even mean to manipulate the WX state of an unknown thread? That aside, with this change I think we can move the conditional WX code out of the shared os.hpp and bury it down in os_bsd_aarch64.hpp where it actually belongs. 
I'd even like to see threadWXSetters.inline.hpp moved to being in src/os_cpu/bsd_aarch64/ if feasible - I'm not sure what include would be needed for the callsites to function - os.hpp I presume? Thanks, David src/hotspot/share/runtime/threadWXSetters.inline.hpp line 33: > 31: #if defined(__APPLE__) && defined(AARCH64) > 32: > 33: #include "runtime/thread.inline.hpp" // dependencies require this include I can't see how this include is needed now. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Mon Mar 7 12:29:03 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 7 Mar 2022 12:29:03 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Mon, 7 Mar 2022 12:07:13 GMT, David Holmes wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > src/hotspot/share/runtime/threadWXSetters.inline.hpp line 33: > >> 31: #if defined(__APPLE__) && defined(AARCH64) >> 32: >> 33: #include "runtime/thread.inline.hpp" // dependencies require this include > > I can't see how this include is needed now. I tried to replace it with os.hpp (and os.inline.hpp) but it caused a linker error. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Mon Mar 7 12:34:04 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 7 Mar 2022 12:34:04 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Mon, 7 Mar 2022 12:16:02 GMT, David Holmes wrote: > I agree that safeFetch should not have become dependent on Thread::current existing, but I have to wonder whether we can just skip the WX code if there is no current thread? If the thread is not attached to the VM then what does it even mean to manipulate the WX state of an unknown thread? The OS thread is always known. The WXMode is unrelated to Thread object. The WXMode is set for an OS thread to allow pages to be either writable or executable (needed for code generation). > That aside, with this change I think we can move the conditional WX code out of the shared os.hpp and bury it down in os_bsd_aarch64.hpp where it actually belongs. May I ask how that would affect the code that uses the methods (includes, ...)? > I'd even like to see threadWXSetters.inline.hpp moved to being in src/os_cpu/bsd_aarch64/ if feasible - I'm not sure what include would be needed for the callsites to function - os.hpp I presume? I don't know whether this is enough. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From eliu at openjdk.java.net Mon Mar 7 13:17:07 2022 From: eliu at openjdk.java.net (Eric Liu) Date: Mon, 7 Mar 2022 13:17:07 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 07:13:20 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long simultaneously in the same application. 
The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > > For ScatterStore operation of LongVector.SPECIES_512/IntVector.SPECIES_512, VectorShape.S_256_BIT/S_512_BIT is the actual length of indexMap vector respectively. > > IntSpecies species(VectorShape s) > > returns the corresponding IntSpecies by Switch on Enum type "VectorShape". [1] > > With this change introduced, elements in the SwitchMap array (initialized in clinit) can be constant-folded so that determined IntSpecies can be acquired for a constant VectorShape. > > jtreg test passed without new failure. > Please help review this change and let me know if any comments. > > [1] https://github.com/openjdk/jdk/blob/894ffb098c80bfeb4209038c017d01dbf53fac0f/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java#L4043 src/hotspot/share/oops/fieldInfo.hpp line 172: > 170: static const char *enum_switch_map_prefix = "$SwitchMap$"; > 171: static const char *enum_switch_map_sig = "[I"; > 172: static const jint required = JVM_ACC_SYNTHETIC | JVM_ACC_FINAL | JVM_ACC_STATIC; May I ask if this approach was only available for OpenJDK's java compiler? ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From duke at openjdk.java.net Mon Mar 7 13:22:01 2022 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Mon, 7 Mar 2022 13:22:01 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v2] In-Reply-To: References: <6yR77yO0CGw6ciJPa97cS0O3PCsWznBy9x0x6ILWLZc=.43ad49ab-4ad0-49d9-9098-da4fef38dabf@github.com> Message-ID: On Thu, 3 Mar 2022 12:07:54 GMT, Boris Ulasevich wrote: >> src/hotspot/cpu/aarch64/icBuffer_aarch64.cpp line 55: >> >>> 53: Label l; >>> 54: __ ldr(rscratch2, l); >>> 55: __ far_jump(ExternalAddress(entry_point), NULL, rscratch1, true); >> >> This complicates `assemble_ic_buffer_code`. 
You need to know `far_jump` implementation, especially the generation of NOPs. I understand why we need those NOPs. >> Do we have calls of non-nmethod code here? > > Yes, there are entry points from both non_method and method segments. I suggest the following code:

  int inst_count = __ far_jump(ExternalAddress(entry_point));
  // IC stub code size is not expected to vary depending on target address.
  // We use NOPs to make the size equal to ic_stub_code_size.
  for (int i = 3 + inst_count; i < ic_stub_code_size(); ++i) {
    __ nop();
  }

------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From duke at openjdk.java.net Mon Mar 7 13:22:02 2022 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Mon, 7 Mar 2022 13:22:02 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v3] In-Reply-To: References: Message-ID: On Thu, 3 Mar 2022 12:31:39 GMT, Boris Ulasevich wrote: >> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH. >> >> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. >> >> As a side effect, the performance of some tests is slightly improved: >> ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` >> >> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH > > Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: > > review comments. remove far_call limit. undo trampoline-to-farcall.
add trampoline_needs_far_jump func src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 439: > 437: nop(); > 438: nop(); > 439: } I suggest to move the logic to `InlineCacheBuffer::assemble_ic_buffer_code` and change `far_jump` to return a number of instructions used for the jump. ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From erikj at openjdk.java.net Mon Mar 7 13:44:05 2022 From: erikj at openjdk.java.net (Erik Joelsson) Date: Mon, 7 Mar 2022 13:44:05 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Sat, 5 Mar 2022 06:49:16 GMT, Julian Waters wrote: > Should I change the JBS issue title to match the PR title, or is it preferred for the PR title to change? They need to match. You can either do it manually, or change the title to just the bug number and the bot will change it for you. ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From coleenp at openjdk.java.net Mon Mar 7 14:06:01 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 7 Mar 2022 14:06:01 GMT Subject: RFR: 8282224: Correct TIG::bang_stack_shadow_pages comments In-Reply-To: References: Message-ID: On Tue, 22 Feb 2022 06:54:59 GMT, Aleksey Shipilev wrote: > When reviewing the RISC-V port of the change, I noticed the comment in the x86 code is worded incorrectly: > > > // Record a new watermark, unless the update is above the safe limit. > __ cmpptr(rsp, Address(thread, JavaThread::shadow_zone_safe_limit())); > __ jccb(Assembler::belowEqual, L_done); > > > Stacks grow downwards, so we are recording a new watermark *when* update is above the safe limit. Yes, now comment matches the code. This looks good, can be checked in as trivial. ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7569 From stuefe at openjdk.java.net Mon Mar 7 14:27:06 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 7 Mar 2022 14:27:06 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current In-Reply-To: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Mon, 7 Mar 2022 11:29:08 GMT, Johannes Bechberger wrote: > The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. > This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. Hi David, > The general idea seems good (pity it touches so many files, but then I've never liked any of this WX support precisely because it is so invasive of shared code). I agree that safeFetch should not have become dependent on Thread::current existing, but I have to wonder whether we can just skip the WX code if there is no current thread? If the thread is not attached to the VM then what does it even mean to manipulate the WX state of an unknown thread? We need to change the wx state of the current pthread in order to be able to execute stub routines. Otherwise, we would crash right away when trying to execute the SafeFetch stub. And that is a valid requirement. Let's say we crash in a native thread, unrelated to and completely oblivious of the JVM it shares the process with. We'd still want to see e.g. 
native crash information, stack frames, maybe register region information etc - all that stuff that may require SafeFetch. In fact, this patch is related to Johannes' other PR where he modified stack frame walking to check that the registers point into valid memory. > > That aside, with this change I think we can move the conditional WX code out of the shared os.hpp and bury it down in os_bsd_aarch64.hpp where it actually belongs. Oh yes! > > I'd even like to see threadWXSetters.inline.hpp moved to being in src/os_cpu/bsd_aarch64/ if feasible - I'm not sure what include would be needed for the callsites to function - os.hpp I presume? I agree, all that wx stuff should be limited to os/bsd or os/bsd_aarch. We could have generic wrappers like:

  class os {
    ...
    // Platform does whatever needed to prepare for execution of generated code inside the current thread
    os::pre_current_thread_jit_call() NOT_MACOS_AARCH64({})
    // Platform does whatever needed to clean up after executing generated code inside the current thread
    os::post_current_thread_jit_call() NOT_MACOS_AARCH64({})

(Macro does not yet exist, but MACOS_AARCH64_ONLY does) -- Side note, I think we have reached a point where it would be cleaner to split xxxBSD and MacOS sources. E.g. this wx stuff should be limited to MacOS too, and we have more and more `__APPLE__` only sections. Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From forax at openjdk.java.net Mon Mar 7 14:41:57 2022 From: forax at openjdk.java.net (Rémi Forax) Date: Mon, 7 Mar 2022 14:41:57 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: <5HaQa1v4OPcosNXyq8FCyG7rbuViEC94hoGQnQjqGnE=.ab7b3af5-bd84-4ba8-883f-8d2d37932ee0@github.com> On Mon, 7 Mar 2022 07:13:20 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long simultaneously in the same application.
The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > > For ScatterStore operation of LongVector.SPECIES_512/IntVector.SPECIES_512, VectorShape.S_256_BIT/S_512_BIT is the actual length of indexMap vector respectively. > > IntSpecies species(VectorShape s) > > returns the corresponding IntSpecies by Switch on Enum type "VectorShape". [1] > > With this change introduced, elements in the SwitchMap array (initialized in clinit) can be constant-folded so that determined IntSpecies can be acquired for a constant VectorShape. > > jtreg test passed without new failure. > Please help review this change and let me know if any comments. > > [1] https://github.com/openjdk/jdk/blob/894ffb098c80bfeb4209038c017d01dbf53fac0f/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java#L4043 Sadly, yes, ECJ the eclipse compiler compiles the enums differently. ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From stuefe at openjdk.java.net Mon Mar 7 14:52:26 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 7 Mar 2022 14:52:26 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current In-Reply-To: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Mon, 7 Mar 2022 11:29:08 GMT, Johannes Bechberger wrote: > The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. > This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. 
This pull request therefore removes the dependency, making SafeFetch usable in more contexts. Hi Johannes, just some drive-by comments, not a full review. Also please see my comment toward David, proposing a more generic interface in os instead. Cheers, Thomas src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 537: > 535: #endif > 536: > 537: static THREAD_LOCAL WXMode _wx_state = WXUnknown; All this wx coding inside bsd sources should be guarded with `__APPLE__` out of politeness toward the BSDs. src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 552: > 550: _wx_state = new_state; > 551: pthread_jit_write_protect_np(_wx_state == WXExec); > 552: } I would simplify this:

  if (_wx_state == WXUnknown) {
    // No way to know but we assume the original state is "writable, not executable"
    _wx_state = WXWrite;
  }
  WXMode old = _wx_state;
  _wx_state = new_state;
  pthread_jit_write_protect_np(_wx_state == WXExec);
}

That is simpler and avoids calling pthread_jit_write_protect_np twice for the "unknown->exec" transition. src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 558: > 556: void os::current_thread_reset_wx() { > 557: current_thread_change_wx(WXWrite); > 558: } I find the naming a bit misleading. You use this as initialization, so I would call it "init" something. Then, I'm not sure it is even needed. I know you just transformed it from the original `init_wx()`, so the question is directed more at the original authors (@AntonKozlov?). AFAIU we use this to initialize wxstate for newly attached threads to "don't execute". But should this not already be the case? And if it's not - e.g. because that thread had been calling into another library that also does call generated code - is it not impolite to switch the state to "executable false"? I know this is highly unlikely, I am just trying to understand.
src/hotspot/share/runtime/os.hpp line 943: > 941: static WXMode current_thread_change_wx(WXMode new_state); > 942: > 943: static void current_thread_reset_wx(); Please add comments what this is supposed to do ------------- Changes requested by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Mon Mar 7 15:11:09 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 7 Mar 2022 15:11:09 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Mon, 7 Mar 2022 14:42:56 GMT, Thomas Stuefe wrote: > I find the naming a bit misleading. You use this as initialization, so I would call it "init" something. You're correct. > AFAIU we use this to initialize wxstate for newly attached threads to "dont execute". But should this not already be the case? No, it does not seem to be the case: https://github.com/dotnet/runtime/issues/41991 (and the man page does not give a default value either) > And if its not - e.g. because that thread had been calling into another library that also does call generated code - is it not impolite to switch the state to "executable false"? I know this is highly unlikely, I just try to understand. This might be an issue when calling another library with a JIT. But this seems to be highly unlikely. Especially as this would have already been an issue. 
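For reference, the per-thread state handling discussed in this exchange can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the PR: `jit_write_protect_stub` is a hypothetical stand-in for `pthread_jit_write_protect_np` (which exists only on macOS), the names `t_wx_state`, `g_protect_calls` and `g_exec_enabled` are invented for the example, and the `return old;` is assumed from the `WXMode` return type shown in os.hpp. It treats an unknown initial state as "writable, not executable" and calls the platform hook once per transition.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; enum names mirror the thread,
// but this is not HotSpot code.
enum WXMode { WXUnknown, WXWrite, WXExec };

// Bookkeeping so the behaviour of the sketch can be observed.
static int  g_protect_calls = 0;
static bool g_exec_enabled  = false;

// Stub standing in for pthread_jit_write_protect_np(), which is macOS-only.
static void jit_write_protect_stub(bool exec) {
  g_exec_enabled = exec;
  ++g_protect_calls;
}

// State is bound to the OS thread, not to any Thread object.
static thread_local WXMode t_wx_state = WXUnknown;

// Switches the calling OS thread to new_state and returns the previous state.
static WXMode current_thread_change_wx(WXMode new_state) {
  if (t_wx_state == WXUnknown) {
    // Assumption: no way to query the real state, so treat it as
    // "writable, not executable".
    t_wx_state = WXWrite;
  }
  WXMode old = t_wx_state;
  t_wx_state = new_state;
  jit_write_protect_stub(t_wx_state == WXExec);
  return old;
}
```

Because the state lives in a thread-local variable, the function works for threads that were never attached to the VM, which is the property SafeFetch needs.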
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Mon Mar 7 15:17:07 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 7 Mar 2022 15:17:07 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current In-Reply-To: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Mon, 7 Mar 2022 11:29:08 GMT, Johannes Bechberger wrote: > The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. > This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. Regarding the names of the new methods: Most of the usages for ThreadWXEnable use it to set the WXMode to WXWrite. The suggested names are therefore a bit misleading (when used in this context). One could add another two methods:

  class os {
    ...
    // Platform does whatever needed to prepare for generating code inside the current thread
    os::pre_current_thread_jit_code_gen() NOT_MACOS_AARCH64({})
    // Platform does whatever needed to clean up after generating code inside the current thread
    os::post_current_thread_jit_code_gen() NOT_MACOS_AARCH64({})

But one would still have the problem of nesting (e.g. when code generating code calls code generating code).
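One common way to cope with the nesting concern is a scoped guard that remembers the previous mode and restores it on exit, rather than a pre/post pair that assumes a fixed mode afterwards. The following is a minimal sketch under assumptions, not the `ThreadWXEnable` implementation: `ScopedWX`, `change_wx`, `t_mode` and `nested_demo` are hypothetical names, and the platform call is left out so the example stays portable.

```cpp
#include <cassert>

// Hypothetical names for illustration only.
enum WXMode { WXWrite, WXExec };

// Mode of the calling OS thread; a real version would also call the
// platform hook when the mode changes.
static thread_local WXMode t_mode = WXWrite;

// Sets the mode for the current thread and returns the previous mode.
static WXMode change_wx(WXMode m) {
  WXMode old = t_mode;
  t_mode = m;
  return old;
}

// Scoped guard: switches the mode on entry, restores the saved mode on exit.
class ScopedWX {
 public:
  explicit ScopedWX(WXMode m) : _old(change_wx(m)) {}
  ~ScopedWX() { change_wx(_old); }
  ScopedWX(const ScopedWX&) = delete;
  ScopedWX& operator=(const ScopedWX&) = delete;
 private:
  WXMode _old;
};

// Nested use: the inner scope switches to WXWrite, and leaving it
// brings the thread back to the outer scope's WXExec.
static WXMode nested_demo() {
  ScopedWX outer(WXExec);
  {
    ScopedWX inner(WXWrite);
  }
  return t_mode;
}
```

Since each guard restores whatever it found, inner scopes cannot leave the outer scope in the wrong mode, at the cost of one saved value per scope.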
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From shade at openjdk.java.net Mon Mar 7 15:27:09 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 7 Mar 2022 15:27:09 GMT Subject: Integrated: 8282224: Correct TIG::bang_stack_shadow_pages comments In-Reply-To: References: Message-ID: On Tue, 22 Feb 2022 06:54:59 GMT, Aleksey Shipilev wrote: > When reviewing the RISC-V port of the change, I noticed the comment in the x86 code is worded incorrectly: > > > // Record a new watermark, unless the update is above the safe limit. > __ cmpptr(rsp, Address(thread, JavaThread::shadow_zone_safe_limit())); > __ jccb(Assembler::belowEqual, L_done); > > > Stacks grow downwards, so we are recording a new watermark *when* update is above the safe limit. This pull request has now been integrated. Changeset: 8e70f4c3 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/8e70f4c3dca4cefe813c5b0fd39c386230ca2fd7 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8282224: Correct TIG::bang_stack_shadow_pages comments Reviewed-by: coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/7569 From shade at openjdk.java.net Mon Mar 7 15:27:08 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 7 Mar 2022 15:27:08 GMT Subject: RFR: 8282224: Correct TIG::bang_stack_shadow_pages comments In-Reply-To: References: Message-ID: <3ggcNBr1kYkax8LKHphcBH5HWzs2DEOtubJf9itaHFE=.48be2b54-5b65-4c9f-bf91-1abe909f1e93@github.com> On Tue, 22 Feb 2022 06:54:59 GMT, Aleksey Shipilev wrote: > When reviewing the RISC-V port of the change, I noticed the comment in the x86 code is worded incorrectly: > > > // Record a new watermark, unless the update is above the safe limit. > __ cmpptr(rsp, Address(thread, JavaThread::shadow_zone_safe_limit())); > __ jccb(Assembler::belowEqual, L_done); > > > Stacks grow downwards, so we are recording a new watermark *when* update is above the safe limit. Thank you! 
------------- PR: https://git.openjdk.java.net/jdk/pull/7569 From jwaters at openjdk.java.net Mon Mar 7 16:24:00 2022 From: jwaters at openjdk.java.net (Julian Waters) Date: Mon, 7 Mar 2022 16:24:00 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 13:40:48 GMT, Erik Joelsson wrote: > > Should I change the JBS issue title to match the PR title, or is it preferred for the PR title to change? > > They need to match. You can either do it manually, or change the title to just the bug number and the bot will change it for you. Alright, I can't change the title of the PR, I guess it'll be easier for me to change the issue instead ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From lancea at openjdk.java.net Mon Mar 7 16:43:08 2022 From: lancea at openjdk.java.net (Lance Andersen) Date: Mon, 7 Mar 2022 16:43:08 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo What problem are you having editing the PR header? You should be able to do so as the author of the PR ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From kcr at openjdk.java.net Mon Mar 7 16:52:07 2022 From: kcr at openjdk.java.net (Kevin Rushforth) Date: Mon, 7 Mar 2022 16:52:07 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 16:40:15 GMT, Lance Andersen wrote: > What problem are you having editing the PR header? You should be able to do so as the author of the PR Exactly. You should see an "Edit" button near the right edge of the PR title. 
See the attached image: ![PR-title](https://user-images.githubusercontent.com/34689748/157079404-eadbe8be-ae94-41e0-b17b-0d1a8026b9da.png) ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From kcr at openjdk.java.net Mon Mar 7 16:56:06 2022 From: kcr at openjdk.java.net (Kevin Rushforth) Date: Mon, 7 Mar 2022 16:56:06 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo But as the JBS title and PR title now match, this is a moot point. ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From kcr at openjdk.java.net Mon Mar 7 17:18:03 2022 From: kcr at openjdk.java.net (Kevin Rushforth) Date: Mon, 7 Mar 2022 17:18:03 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 17:12:25 GMT, Magnus Ihse Bursie wrote: > TheShermanTanker is not the author of this PR, he's just assisting the author by creating the JBS issue. Ah, that explains it then. ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From ihse at openjdk.java.net Mon Mar 7 17:18:03 2022 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Mon, 7 Mar 2022 17:18:03 GMT Subject: RFR: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 16:40:15 GMT, Lance Andersen wrote: >> Hi >> >> I have reviewed the code for removing double semicolons at the end of lines >> >> all the best >> matteo > > What problem are you having editing the PR header? You should be able to do so as the author of the PR @LanceAndersen @kevinrushforth TheShermanTanker is not the author of this PR, he's just assisting the author by creating the JBS issue. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From kbarrett at openjdk.java.net Mon Mar 7 17:26:09 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Mar 2022 17:26:09 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 06:34:20 GMT, David Holmes wrote: > Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. > > This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. Looks good. Just a trivial formatting comment. doc/hotspot-style.md line 669: > 667: problems as for ordinary namespace-scoped variables. So we avoid use of > 668: `thread_local` in general, limiting its use to only those cases where dynamic > 669: initialization and destruction are essential. See [JDK-8282469](https://bugs.openjdk.java.net/browse/JDK-8282469) I'd like a line break between "See" and the bug link, to reduce the line length. ------------- Marked as reviewed by kbarrett (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7720 From kbarrett at openjdk.java.net Mon Mar 7 17:26:07 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Mar 2022 17:26:07 GMT Subject: RFR: 8282469: Allow considered use of C++ thread_local in Hotspot In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 06:12:03 GMT, David Holmes wrote: > This patch provides a means for using C++ `thread_local` when it is essential - see JBS for more details. > > There are three parts: > > 1. Add the new #define for `thread_local` > 2. Remove `operator_new.cpp` as use of C++ `thread_local` with non-trivial cleanup actions requires use of global operators new/delete. These are still excluded for hotspot use via a link-time check. > 3. Remove the prohibition on using `thread_local` from the hotspot style guide > > Due to the way hotspot style guide changes must be done, part 3 is being done under a sub-task in PR https://github.com/openjdk/jdk/pull/7720 and the two PRs will integrate at the same time. > > Testing: > - manual testing of the Panama usecase as referenced in the JBS issue > - Tiers 1-3 > > Thanks, > David Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7719 From kbarrett at openjdk.java.net Mon Mar 7 17:33:07 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Mar 2022 17:33:07 GMT Subject: RFR: 8252577: HotSpot Style Guide should link to One-True-Brace-Style description In-Reply-To: References: Message-ID: <3AsR6sK6B-EAIpRD-J1Z7AqDSSf2btcDqosbqGWZH-4=.16765c35-14af-4dc2-a150-b711d574a2f1@github.com> On Fri, 4 Mar 2022 16:50:31 GMT, Thomas Stuefe wrote: >> Please review this change to provide a link to the Wikipedia description of >> One-True-Brace-Style. >> >> As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot >> Group Members, though comments from others are welcome. > > LGTM.
I did not know that this style had a name :) Thanks @tstuefe , @dcubed-ojdk , and @dholmes-ora for reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/7692 From kbarrett at openjdk.java.net Mon Mar 7 17:38:39 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Mar 2022 17:38:39 GMT Subject: RFR: 8252577: HotSpot Style Guide should link to One-True-Brace-Style description [v2] In-Reply-To: References: Message-ID: > Please review this change to provide a link to the Wikipedia description of > One-True-Brace-Style. > > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge branch 'master' into style_guide_otbs - update html - add OTBS link ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7692/files - new: https://git.openjdk.java.net/jdk/pull/7692/files/445e8973..17bcee26 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7692&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7692&range=00-01 Stats: 2387 lines in 54 files changed: 1930 ins; 168 del; 289 mod Patch: https://git.openjdk.java.net/jdk/pull/7692.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7692/head:pull/7692 PR: https://git.openjdk.java.net/jdk/pull/7692 From kbarrett at openjdk.java.net Mon Mar 7 17:38:41 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Mar 2022 17:38:41 GMT Subject: Integrated: 8252577: HotSpot Style Guide should link to One-True-Brace-Style description In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 13:51:03 GMT, Kim Barrett wrote: > Please review this change to provide a link to the Wikipedia description of > One-True-Brace-Style. 
> > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. This pull request has now been integrated. Changeset: 7194097b Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/7194097bcae7e0fd32488834277bb18cb97cea8b Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod 8252577: HotSpot Style Guide should link to One-True-Brace-Style description Reviewed-by: stuefe, dcubed, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7692 From kbarrett at openjdk.java.net Mon Mar 7 18:16:37 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Mar 2022 18:16:37 GMT Subject: RFR: 8257589: HotSpot Style Guide should link to rfc7282 [v2] In-Reply-To: <_JGRK3zhaaLCIUkkCULNFeHCmm26owJP6zNQXJ1VG_c=.b77d24ea-2a75-4f06-af15-afa5f203e51f@github.com> References: <_JGRK3zhaaLCIUkkCULNFeHCmm26owJP6zNQXJ1VG_c=.b77d24ea-2a75-4f06-af15-afa5f203e51f@github.com> Message-ID: > Please review this change to the link for the definition of "rough consensus". > The current link is to a Wikipedia article that references rfc7282. We should > instead link directly to the RFC. This change was requested during the > review of JDK-8247976, but not made at that time. (I'm not sure whether it was > intentionally deferred or missed/forgotten.) > > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains three additional commits since the last revision: - Merge branch 'master' into rough_consensus_defn - update html - update rough consensus link ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7693/files - new: https://git.openjdk.java.net/jdk/pull/7693/files/8ba75e76..828a0dcc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7693&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7693&range=00-01 Stats: 2493 lines in 63 files changed: 2009 ins; 180 del; 304 mod Patch: https://git.openjdk.java.net/jdk/pull/7693.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7693/head:pull/7693 PR: https://git.openjdk.java.net/jdk/pull/7693 From kbarrett at openjdk.java.net Mon Mar 7 18:16:38 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Mar 2022 18:16:38 GMT Subject: RFR: 8257589: HotSpot Style Guide should link to rfc7282 [v2] In-Reply-To: References: <_JGRK3zhaaLCIUkkCULNFeHCmm26owJP6zNQXJ1VG_c=.b77d24ea-2a75-4f06-af15-afa5f203e51f@github.com> Message-ID: On Fri, 4 Mar 2022 17:29:50 GMT, Daniel D. Daugherty wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into rough_consensus_defn >> - update html >> - update rough consensus link > > Thumbs up. Thanks @dcubed-ojdk and @dholmes-ora for reviews. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7693 From kbarrett at openjdk.java.net Mon Mar 7 18:16:39 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Mar 2022 18:16:39 GMT Subject: Integrated: 8257589: HotSpot Style Guide should link to rfc7282 In-Reply-To: <_JGRK3zhaaLCIUkkCULNFeHCmm26owJP6zNQXJ1VG_c=.b77d24ea-2a75-4f06-af15-afa5f203e51f@github.com> References: <_JGRK3zhaaLCIUkkCULNFeHCmm26owJP6zNQXJ1VG_c=.b77d24ea-2a75-4f06-af15-afa5f203e51f@github.com> Message-ID: On Fri, 4 Mar 2022 14:04:10 GMT, Kim Barrett wrote: > Please review this change to the link for the definition of "rough consensus". > The current link is to a Wikipedia article that references rfc7282. We should > instead link directly to the RFC. This change was requested during the > review of JDK-8247976, but not made at that time. (I'm not sure whether it was > intentionally deferred or missed/forgotten.) > > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. This pull request has now been integrated. Changeset: 5953b229 Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/5953b229bf6d7834d575862e7577522ded0b6791 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod 8257589: HotSpot Style Guide should link to rfc7282 Reviewed-by: dcubed, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7693 From kbarrett at openjdk.java.net Mon Mar 7 18:23:40 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Mar 2022 18:23:40 GMT Subject: RFR: 8272691: Fix HotSpot style guide terminology for "non-local variables" [v2] In-Reply-To: References: Message-ID: <_7P5_OpgorGz2dFugdcJ86Tc5E12jijZg73yGKwYRI0=.6f620825-dd03-4c2a-8c05-4f2f85184785@github.com> > Please review this fix to incorrect terminology used in one place. The > correct terminology (per C++14 3.6.2) is "non-local variables".
> > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge branch 'master' into non-local-variables - update html - fix non-local variable term ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7695/files - new: https://git.openjdk.java.net/jdk/pull/7695/files/8d37a1b3..ed7ad991 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7695&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7695&range=00-01 Stats: 2495 lines in 63 files changed: 2009 ins; 180 del; 306 mod Patch: https://git.openjdk.java.net/jdk/pull/7695.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7695/head:pull/7695 PR: https://git.openjdk.java.net/jdk/pull/7695 From kbarrett at openjdk.java.net Mon Mar 7 18:23:42 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Mar 2022 18:23:42 GMT Subject: Integrated: 8272691: Fix HotSpot style guide terminology for "non-local variables" In-Reply-To: References: Message-ID: <5kjDr33ZIOXOkfkbAIeLGx30bOEJgRfpVQJnWslEN8g=.c576ab9e-2d75-48a7-910a-5714a2325997@github.com> On Fri, 4 Mar 2022 14:13:24 GMT, Kim Barrett wrote: > Please review this fix to incorrect terminology used in one place. The > correct terminology (per C++14 3.6.2) is "non-local variables". > > As this is a HotSpot Style Guide change, it requires reviewers who are HotSpot > Group Members, though comments from others are welcome. This pull request has now been integrated. 
Changeset: 2e298b8b Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/2e298b8bf45edc37269b8b70f7784082a8f87306 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod 8272691: Fix HotSpot style guide terminology for "non-local variables" Reviewed-by: dcubed, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7695 From kbarrett at openjdk.java.net Mon Mar 7 18:23:41 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Mar 2022 18:23:41 GMT Subject: RFR: 8272691: Fix HotSpot style guide terminology for "non-local variables" [v2] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 17:31:19 GMT, Daniel D. Daugherty wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into non-local-variables >> - update html >> - fix non-local variable term > > Thumbs up. Thanks @dcubed-ojdk and @dholmes-ora for reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/7695 From duke at openjdk.java.net Mon Mar 7 18:31:02 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 7 Mar 2022 18:31:02 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Mon, 7 Mar 2022 12:26:00 GMT, Johannes Bechberger wrote: >> src/hotspot/share/runtime/threadWXSetters.inline.hpp line 33: >> >>> 31: #if defined(__APPLE__) && defined(AARCH64) >>> 32: >>> 33: #include "runtime/thread.inline.hpp" // dependencies require this include >> >> I can't see how this include is needed now. > > I tried to replace it with os.hpp (and os.inline.hpp) but it caused a linker error. I was wrong, I removed it. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Mon Mar 7 18:55:33 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 7 Mar 2022 18:55:33 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v2] In-Reply-To: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: > The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. > This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. 
Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Minor fixes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7727/files - new: https://git.openjdk.java.net/jdk/pull/7727/files/3890c1e0..886a9354 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7727&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7727&range=00-01 Stats: 15 lines in 4 files changed: 3 ins; 3 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/7727.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7727/head:pull/7727 PR: https://git.openjdk.java.net/jdk/pull/7727 From jrose at openjdk.java.net Mon Mar 7 19:18:10 2022 From: jrose at openjdk.java.net (John R Rose) Date: Mon, 7 Mar 2022 19:18:10 GMT Subject: RFR: 8282469: Allow considered use of C++ thread_local in Hotspot In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 06:12:03 GMT, David Holmes wrote: > This patch provides a means for using C++ `thread_local` when it is essential - see JBS for more details. > > There are three parts: > > 1. Add the new #define for `thread_local` > 2. Remove `operator_new.cpp` as use of C++ `thread_local` with non-trivial cleanup actions requires use of global operators new/delete. These are still excluded for hotspot use via a link-time check. > 3. Remove the prohibition on using `thread_local` from the hotspot style guide > > Due to the way hotspot style guide changes must be done, part 3 is being done under a sub-task in PR https://github.com/openjdk/jdk/pull/7720 and the two PRs will integrate at the same time. > > Testing: > - manual testing of the Panama usecase as referenced in the JBS issue > - Tiers 1-3 > > Thanks, > David src/hotspot/share/memory/operator_new.cpp line 37: > 35: // a memory leak. Use CHeapObj as the base class of such objects to make it explicit > 36: // that they're allocated on the C heap.
> 37: // Commented out in product version to avoid conflicts with third-party C++ native code. There's a little bit of policy information here that is being deleted. It overlaps with the section `### Memory Allocation` in `hotspot-style.md`, but includes this information which might not be stated elsewhere: > Typically, uses of the C++ global operator new are inadvertent and therefore often associated with memory leaks. (This is my rephrasing, perhaps appropriate to the style guide.) Or is a point like this made in the config file which prevents direct linkage? (I don't know where that file is.) ------------- PR: https://git.openjdk.java.net/jdk/pull/7719 From jrose at openjdk.java.net Mon Mar 7 19:21:02 2022 From: jrose at openjdk.java.net (John R Rose) Date: Mon, 7 Mar 2022 19:21:02 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 06:34:20 GMT, David Holmes wrote: > Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. > > This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. This change goes with a deletion of the old backstop we have against using global operator new; it was some definitions of `operator new` which raise a call to `fatal`. That old file had a useful bit of lore, which perhaps should be transplanted here.
Suggested addition to rationale for not using global operator new: Typically, uses of the C++ global operator new are inadvertent and therefore often associated with memory leaks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From psandoz at openjdk.java.net Mon Mar 7 20:48:58 2022 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Mon, 7 Mar 2022 20:48:58 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 07:13:20 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long simultaneously in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > > For ScatterStore operation of LongVector.SPECIES_512/IntVector.SPECIES_512, VectorShape.S_256_BIT/S_512_BIT is the actual length of indexMap vector respectively. > > IntSpecies species(VectorShape s) > > returns the corresponding IntSpecies by Switch on Enum type "VectorShape". [1] > > With this change introduced, elements in the SwitchMap array (initialized in clinit) can be constant-folded so that determined IntSpecies can be acquired for a constant VectorShape. > > jtreg test passed without new failure. > Please help review this change and let me know if any comments. > > [1] https://github.com/openjdk/jdk/blob/894ffb098c80bfeb4209038c017d01dbf53fac0f/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java#L4043 This solution is potentially very fragile as it embeds knowledge of names into HotSpot and it lacks context. Although the chance is very small it could result in false positives, that would be quite hard to track down.
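For readers unfamiliar with the mapping array under discussion, here is a minimal hand-written sketch of how javac typically lowers a `switch` on an enum. It is an illustration only: `SwitchMapSketch`, `Shape`, `SwitchMapHolder` and `SWITCH_MAP` are hypothetical names, and the real synthetic field and its initializer are generated by javac rather than written like this. The point it shows is that the array reference is `static final` while the elements are ordinary mutable `int`s, which is why a JIT cannot fold them without extra knowledge.

```java
// Hypothetical stand-in for javac's enum-switch lowering; all names are assumptions.
final class SwitchMapSketch {
    enum Shape { S_64_BIT, S_128_BIT, S_256_BIT, S_512_BIT }

    // Plays the role of the synthetic holder class that javac emits.
    static final class SwitchMapHolder {
        // The reference is final, but each element is still writable.
        static final int[] SWITCH_MAP = new int[Shape.values().length];
        static {
            // Filled once in the class initializer; 0 means "no matching case".
            SWITCH_MAP[Shape.S_256_BIT.ordinal()] = 1;
            SWITCH_MAP[Shape.S_512_BIT.ordinal()] = 2;
        }
    }

    // What "switch (s) { case S_256_BIT: ...; case S_512_BIT: ... }" becomes:
    // an array load indexed by ordinal, then a switch on the mapped int.
    static String speciesFor(Shape s) {
        switch (SwitchMapHolder.SWITCH_MAP[s.ordinal()]) {
            case 1: return "SPECIES_256";
            case 2: return "SPECIES_512";
            default: throw new AssertionError(s);
        }
    }
}
```

Even with a constant `Shape` argument, the array load in `speciesFor` is not a compile-time constant from the JIT's point of view unless the array contents are treated as stable.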
Instead we should consider modifying the java compiler to annotate the field with `@Stable`, although I think the problem there will be we don't apply to packages outside of those from java.base. ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From dholmes at openjdk.java.net Mon Mar 7 21:25:06 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Mar 2022 21:25:06 GMT Subject: RFR: 8282469: Allow considered use of C++ thread_local in Hotspot In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 17:23:18 GMT, Kim Barrett wrote: >> This patch provides a means for using C++ `thread_local` when it is essential - see JBS for more details. >> >> There are three parts: >> >> 1. Add the new #define for `thread_local` >> 2. Remove `operator_new.cpp` as use of C++ `thread_local` with non-trivial cleanup actions requires use of global operators new/delete. These are still excluded for hotspot use via a link-time check. >> 3. Remove the prohibition on using `thread_local` from the hotspot style guide >> >> Due to the way hotspot style guide changes must be done, part 3 is being done under a sub-task in PR https://github.com/openjdk/jdk/pull/7720 and the two PRs will integrate at the same time. >> >> Testing: >> - manual testing of the Panama usecase as referenced in the JBS issue >> - Tiers 1-3 >> >> Thanks, >> David > > Looks good. Thanks for the review @kimbarrett ! ------------- PR: https://git.openjdk.java.net/jdk/pull/7719 From dholmes at openjdk.java.net Mon Mar 7 21:25:07 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Mar 2022 21:25:07 GMT Subject: RFR: 8282469: Allow considered use of C++ thread_local in Hotspot In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 19:14:52 GMT, John R Rose wrote: >> This patch provides a means for using C++ `thread_local` when it is essential - see JBS for more details. >> >> There are three parts: >> >> 1. Add the new #define for `thread_local` >> 2.
Remove `operator_new.cpp` as use of C++ `thread_local` with non-trivial cleanup actions requires use of global operators new/delete. These are still excluded for hotspot use via a link-time check. >> 3. Remove the prohibition on using `thread_local` from the hotspot style guide >> >> Due to the way hotspot style guide changes must be done, part 3 is being done under a sub-task in PR https://github.com/openjdk/jdk/pull/7720 and the two PRs will integrate at the same time. >> >> Testing: >> - manual testing of the Panama usecase as referenced in the JBS issue >> - Tiers 1-3 >> >> Thanks, >> David > > src/hotspot/share/memory/operator_new.cpp line 37: > >> 35: // a memory leak. Use CHeapObj as the base class of such objects to make it explicit >> 36: // that they're allocated on the C heap. >> 37: // Commented out in product version to avoid conflicts with third-party C++ native code. > > There's a little bit of policy information here that is being deleted. It overlaps with the section `### Memory Allocation` in `hotspot-style.md`, but includes this information which might not be stated elsewhere: > >> Typically, uses of the C++ global operator new are inadvertent and therefore often associated with memory leaks. > > (This is my rephrasing, perhaps appropriate to the style guide.) > > Or is a point like this made in the config file which prevents direct linkage? (I don't know where that file is.) The link-time check is expressed in open/make/hotspot/lib/CompileJvm.gmk but simply states: # Hotspot disallows the use of global operators 'new' and 'delete'. This build # time check helps enforce this requirement. ... The prohibition on using global operator new should definitely be explicitly documented somewhere, so the style guide seems a suitable place.
------------- PR: https://git.openjdk.java.net/jdk/pull/7719 From dholmes at openjdk.java.net Mon Mar 7 21:36:08 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Mar 2022 21:36:08 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Mon, 7 Mar 2022 14:24:01 GMT, Thomas Stuefe wrote: > We need to change the wx state of the current pthread in order to be able to execute stub routines. Otherwise, we would crash right away when trying to execute the SafeFetch stub. Oh I see - that is unfortunate. I don't like messing with other people's threads. > May I ask how that would affect the code that uses the methods (includes, ...)? @parttimenerd there would be no change - they continue to include os.hpp, which will include the os/cpu specific header files. > We could have generic wrappers like: ... @tstuefe I think this is going a little too far in this fix. I'm looking for simplicity. All the WX related code should be buried in the os/cpu file for BSD/Aarch64 and the callsites all using MACOS_AARCH64_ONLY. Splitting BSD from macOS would also be a future RFE. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Mon Mar 7 21:37:05 2022 From: duke at openjdk.java.net (Matteo Baccan) Date: Mon, 7 Mar 2022 21:37:05 GMT Subject: Integrated: 8282657: Code cleanup: removing double semicolons at the end of lines In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:39:31 GMT, Matteo Baccan wrote: > Hi > > I have reviewed the code for removing double semicolons at the end of lines > > all the best > matteo This pull request has now been integrated. 
Changeset: ccad3923 Author: Matteo Baccan Committer: Magnus Ihse Bursie URL: https://git.openjdk.java.net/jdk/commit/ccad39237ab860c5c5579537f740177e3f1adcc9 Stats: 93 lines in 82 files changed: 0 ins; 0 del; 93 mod 8282657: Code cleanup: removing double semicolons at the end of lines Reviewed-by: lancea, rriggs, ihse, prr, iris, wetmore, darcy, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7268 From duke at openjdk.java.net Mon Mar 7 21:41:01 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 7 Mar 2022 21:41:01 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v2] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Mon, 7 Mar 2022 18:55:33 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Minor fixes Thanks. > All the WX related code should be buried in the os/cpu file for BSD/Aarch64 and the callsites all using MACOS_AARCH64_ONLY. 
I'm currently finishing a fix that does exactly that :) ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From jzhu at openjdk.java.net Mon Mar 7 22:10:59 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Mon, 7 Mar 2022 22:10:59 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 20:44:34 GMT, Paul Sandoz wrote: > Although the chance is very small it could result in false positives, that would quite hard to track down. Paul, thanks for your review. Could you elaborate on how false positives could happen? ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From dcubed at openjdk.java.net Mon Mar 7 22:27:02 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 7 Mar 2022 22:27:02 GMT Subject: RFR: 8263134: HotSpot Style Guide should disallow inheriting constructors In-Reply-To: <0ydC7yufiVJTFlvJU6SXM5Gq5vTGdo2FPCJ4XOXpF5U=.2f611e00-b3d4-4ccb-9658-6eaa1d6cae5d@github.com> References: <0ydC7yufiVJTFlvJU6SXM5Gq5vTGdo2FPCJ4XOXpF5U=.2f611e00-b3d4-4ccb-9658-6eaa1d6cae5d@github.com> Message-ID: On Fri, 4 Mar 2022 15:04:47 GMT, Kim Barrett wrote: > Please review this change to explicitly disallow the use of inheriting > constructors: > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2540.htm). > > The C++11/14 specification has a lot of problems. These were addressed in > C++17 (and as a DR that affects C++11/14): > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0136r1.html). > > Use of inheriting constructors now runs the risk of encountering those bugs, > inconsistent behavior between different compilers or compiler versions, and > behavior changes for future support of C++17. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change.
Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Thumbs up. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7698 From psandoz at openjdk.java.net Mon Mar 7 22:30:02 2022 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Mon, 7 Mar 2022 22:30:02 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 07:13:20 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long simultaneously in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > > For ScatterStore operation of LongVector.SPECIES_512/IntVector.SPECIES_512, VectorShape.S_256_BIT/S_512_BIT is the actual length of indexMap vector respectively. > > IntSpecies species(VectorShape s) > > returns the corresponding IntSpecies by Switch on Enum type "VectorShape". [1] > > With this change introduced, elements in the SwitchMap array (initialized in clinit) can be constant-folded so that determined IntSpecies can be acquired for a constant VectorShape. > > jtreg test passed without new failure. > Please help review this change and let me know if any comments. > > [1] https://github.com/openjdk/jdk/blob/894ffb098c80bfeb4209038c017d01dbf53fac0f/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java#L4043 AFAICT the field detection is not specific to what javac generates. 
Although the likelihood is small, there might be other fields that start with the same prefix and that the same time? Further, if javac changes its code generation strategy then this code may become redundant for newly generated code, but we might need to keep it around and maybe add another specific case for new code. (Pattern matching for switch is and will likely further impact javac's generation strategy, although I don't know if it will impact this particular area.) I understand the motivation, but the approach is fragile and is difficult to maintain over time. Unfortunately we don't yet have a general mechanism for read-only arrays. My recommendation is to instead change the code in the Vector API. ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From dcubed at openjdk.java.net Mon Mar 7 22:30:59 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 7 Mar 2022 22:30:59 GMT Subject: RFR: 8282668: HotSpot Style Guide should permit unrestricted unions In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 18:39:33 GMT, Kim Barrett wrote: > Please review this change to permit the use of "unrestricted unions" > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf) in HotSpot > code. > > This permits any non-reference type to be used as a union data member, as well > as permitting static data members in named unions. There are various classes > in HotSpot that might be able to take advantage of this new feature. > > An example is the aarch64-specific Address class. It presently contains a > collection of data members. For any given instance, only some of these data > members are initialized and used. The `_mode` member indicates which. So it's > effectively a kind of discriminated union with the data unpacked and not > overlapping, with `_mode` being the discriminant. A consequence of the current > implementation is that some compilers may generate warnings under some > circumstances because of uninitialized data members.
(I ran into this problem > with gcc when making an otherwise unrelated change to one of the member > types.) This Address class could be made smaller (so cheaper to copy, which > happens often as Address objects are frequently passed by value) and usage > made clearer, by making it an actual union. But that isn't possible with the > C++03 restrictions. > > Another example is the RelocationHolder class, which is effectively a union > over the various concrete Relocation types, but implemented in a way that > has some issues (JDK-8160404). > > Testing: > I've tried some examples without running into any problems. This included > some experiments with RelocationHolder for JDK-8160404. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Thumbs up. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7704 From redestad at openjdk.java.net Mon Mar 7 23:08:49 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Mon, 7 Mar 2022 23:08:49 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v10] In-Reply-To: References: Message-ID: > I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. 
This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. > > Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 > > - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either be contributed to this PR or as dependent follow-ups. > > - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. > > - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).
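The relationship between the two methods described above can be sketched in plain Java. This is a minimal scalar illustration under stated assumptions, not the JDK's implementation: the class name `CountPositivesSketch` and the parameter names are hypothetical, and the sketch returns the exact count, whereas an intrinsic may be allowed to return only a rough count, as the aarch64 commit note in this thread suggests.

```java
// Scalar illustration only; class and parameter names are assumptions.
final class CountPositivesSketch {
    private CountPositivesSketch() {}

    // Number of leading bytes in ba[off, off + len) that are non-negative,
    // i.e. the length of the ASCII prefix of that range.
    static int countPositives(byte[] ba, int off, int len) {
        int limit = off + len;
        for (int i = off; i < limit; i++) {
            if (ba[i] < 0) {
                return i - off; // stop at the first byte with the high bit set
            }
        }
        return len; // the whole range is ASCII
    }

    // The old predicate falls out of the new method: a range contains a
    // negative byte exactly when the positive prefix is shorter than the range.
    static boolean hasNegatives(byte[] ba, int off, int len) {
        return countPositives(ba, off, len) != len;
    }
}
```

A caller decoding bytes could copy the first `countPositives(...)` bytes on a fast ASCII path and fall back to a slower loop only for the remainder, which is where the benefit for inputs with an ASCII prefix would come from.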

Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:

  Better implementation for aarch64 returning roughly the count of positive bytes

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7231/files
  - new: https://git.openjdk.java.net/jdk/pull/7231/files/85be36ae..81ef04ec

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=09
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=08-09

  Stats: 34 lines in 3 files changed: 13 ins; 3 del; 18 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7231.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231

PR: https://git.openjdk.java.net/jdk/pull/7231

From redestad at openjdk.java.net Mon Mar 7 23:13:36 2022
From: redestad at openjdk.java.net (Claes Redestad)
Date: Mon, 7 Mar 2022 23:13:36 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v11]
In-Reply-To:
References:
Message-ID: <8p4ATe7aWCTiQ4umjuDmMO7mNozkF7S9aHvlYW0t7nI=.7f3ba36b-2f11-4a47-947e-f936d5929a0f@github.com>

> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
>
> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
>
> - Only implemented on x86 for now, but I want to verify that `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding this up until it's implemented on all platforms, which can be contributed either to this PR or as dependent follow-ups.
>
> - An alternative to holding up until all platforms are on board is to structure `StringCoding.hasNegatives` and `countPositives` so that the non-intrinsified method calls into the intrinsified one. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.
>
> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even sees a minor improvement, which makes me consider those corner-case regressions with few real-world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).
Claes Redestad has updated the pull request incrementally with two additional commits since the last revision: - use 32-bit mask to calculate correct remainder value - ary1 not required to have USE_KILL effect ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7231/files - new: https://git.openjdk.java.net/jdk/pull/7231/files/81ef04ec..934b5b8a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=09-10 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/7231.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231 PR: https://git.openjdk.java.net/jdk/pull/7231 From duke at openjdk.java.net Tue Mar 8 00:25:51 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Tue, 8 Mar 2022 00:25:51 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v3] In-Reply-To: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: > The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. > This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. 

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Move WX functionality into os specific files

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7727/files
  - new: https://git.openjdk.java.net/jdk/pull/7727/files/886a9354..a7c38e52

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7727&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7727&range=01-02

  Stats: 163 lines in 30 files changed: 30 ins; 72 del; 61 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7727.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7727/head:pull/7727

PR: https://git.openjdk.java.net/jdk/pull/7727

From jbhateja at openjdk.java.net Tue Mar 8 02:31:58 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Tue, 8 Mar 2022 02:31:58 GMT
Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding
In-Reply-To: <7Wi-UdV2-wNKunrGXVkU0lCTCwszUdF_HqWOElr8cdk=.20418ea0-6825-44a8-b9f3-4820151c52a6@github.com>
References: <7Wi-UdV2-wNKunrGXVkU0lCTCwszUdF_HqWOElr8cdk=.20418ea0-6825-44a8-b9f3-4820151c52a6@github.com>
Message-ID:

On Mon, 7 Mar 2022 11:58:42 GMT, Joshua Zhu wrote:

> > The fix checks for an integer array field whose name starts with "$SwitchMap" and marks it as stable if it is decorated with the ACC_STATIC | ACC_FINAL attributes. Wondering why this is not being honored during constant folding currently.
>
> "static final int[] $SwitchMap$xxx" does not mean elements in the array are immutable. Hence it is necessary to treat the mapping array generated in enum switches as stable so that its elements can be constant-folded. Please do not miss the JVM_ACC_SYNTHETIC flag in this change. "A field marked with the ACC_SYNTHETIC flag indicates that it was generated by a compiler and does not appear in source code."

Thanks @JoshuaZhuwj, got it. FTR, the references from JLS17 section 4.12.4 (final) and the Stable documentation give the necessary details on semantics.
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/vm/annotation/Stable.java#L51

-------------

PR: https://git.openjdk.java.net/jdk/pull/7721

From yyang at openjdk.java.net Tue Mar 8 03:18:06 2022
From: yyang at openjdk.java.net (Yi Yang)
Date: Tue, 8 Mar 2022 03:18:06 GMT
Subject: Integrated: 8275775: Add jcmd VM.classes to print details of all classes
In-Reply-To:
References:
Message-ID:

On Mon, 17 Jan 2022 09:31:54 GMT, Yi Yang wrote:

> Add VM.classes to print details of all classes, output looks like:
>
> 1. jcmd VM.classes
>
> KlassAddr           Size  State   Flags  LoaderName  ClassName
> 0x0000000800c0b400  62    inited  W      bootstrap   java.lang.invoke.LambdaForm$MH/0x0000000800c0b400
> 0x0000000800c0b000  62    inited  W      bootstrap   java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000
> 0x0000000800c0ac00  62    inited  W      bootstrap   java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00
> ...
>
> 2. jcmd VM.classes verbose
>
> KlassAddr           Size  State   Flags  LoaderName  ClassName
> 0x0000000800c0b400  62    inited  W      bootstrap   java.lang.invoke.LambdaForm$MH/0x0000000800c0b400
> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400}
>  - instance size: 2
>  - klass size: 62
>  - access: final synchronized
>  - state: inited
>  - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400'
>  - super: 'java/lang/Object'
>  - sub:
>  - arrays: NULL
>  - methods: Array(0x00007f620841f210)
>  - method ordering: Array(0x0000000800a7e5a8)
>  - default_methods: Array(0x0000000000000000)
>  - local interfaces: Array(0x00000008005af748)
>  - trans. interfaces: Array(0x00000008005af748)
>  - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380
>  - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder
>  - source file: 'LambdaForm$MH'
>  - class annotations: Array(0x0000000000000000)
>  - class type annotations: Array(0x0000000000000000)
>  - field annotations: Array(0x0000000000000000)
>  - field type annotations: Array(0x0000000000000000)
>  - inner classes: Array(0x00000008005af6d8)
>  - nest members: Array(0x00000008005af6d8)
>  - permitted subclasses: Array(0x00000008005af6d8)
>  - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400'
>  - vtable length 5 (start addr: 0x0000000800c0b5b8)
>  - itable length 2 (start addr: 0x0000000800c0b5e0)
>  - ---- static fields (1 words):
>  - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112
>  - ---- non-static fields (0 words):
>  - non-static oop maps:
> 0x0000000800c0b000  62    inited  W      bootstrap   java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000
> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000}
>  - instance size: 2
>  - klass size: 62
>  - access: final synchronized
>  - state: inited
>  - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000'
>  - super: 'java/lang/Object'
>  - sub:
>  - arrays: NULL
>  - methods: Array(0x00007f620841ea68)
>  - method ordering: Array(0x0000000800a7e5a8)
>  - default_methods: Array(0x0000000000000000)
>  - local interfaces: Array(0x00000008005af748)
>  - trans. interfaces: Array(0x00000008005af748)
>  - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0
>  - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder
>  - source file: 'LambdaForm$DMH'
>  - class annotations: Array(0x0000000000000000)
>  - class type annotations: Array(0x0000000000000000)
>  - field annotations: Array(0x0000000000000000)
>  - field type annotations: Array(0x0000000000000000)
>  - inner classes: Array(0x00000008005af6d8)
>  - nest members: Array(0x00000008005af6d8)
>  - permitted subclasses: Array(0x00000008005af6d8)
>  - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000'
>  - vtable length 5 (start addr: 0x0000000800c0b1b8)
>  - itable length 2 (start addr: 0x0000000800c0b1e0)
>  - ---- static fields (1 words):
>  - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112
>  - ---- non-static fields (0 words):
> ...

This pull request has now been integrated.

Changeset: 3f0684d0 Author: Yi Yang URL: https://git.openjdk.java.net/jdk/commit/3f0684d0b85662724af845a4ee6b97d9c5ceacbd Stats: 172 lines in 6 files changed: 171 ins; 0 del; 1 mod 8275775: Add jcmd VM.classes to print details of all classes Reviewed-by: dholmes, iklam, stuefe ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From dholmes at openjdk.java.net Tue Mar 8 04:08:08 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 8 Mar 2022 04:08:08 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v3] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: <3-7lQJObHUh4005vGNqLqMg8SNxYcqXJPllqD3bGXjg=.6372dde6-2867-4045-ae02-cc952ebada50@github.com> On Tue, 8 Mar 2022 00:25:51 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Move WX functionality into os specific files This is looking good. A few additional comments below. Thanks, David src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 538: > 536: > 537: #ifdef __APPLE__ > 538: static THREAD_LOCAL os::WXMode _wx_state = os::WXUnknown; Please add a blank line between the THREAD_LOCAL and the next method. Or even move this THREAD_LOCAL to just before `os::current_thread_change_wx`. 
src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 547: > 545: if (_wx_state == WXUnknown) { > 546: _wx_state = os::WXWrite; // No way to know but we assume the original state is "writable, not executable" > 547: } Given this can't you just initialize to WXWrite and do away with WXUnknown? src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.hpp line 57: > 55: static void current_thread_init_wx(); > 56: > 57: static void current_thread_assert_wx_state(WXMode expected); Can we move all these into the ThreadWXEnable class so they are not in the os namespace? Even the enum could move - though it will make the use-sites a bit more verbose. I won't insist on pushing this WX stuff even deeper, but if anyone else thinks it is a good idea ... :) src/hotspot/share/prims/jni.cpp line 97: > 95: #include "utilities/macros.hpp" > 96: #include "utilities/vmError.hpp" > 97: #include "runtime/thread.inline.hpp" Why do we need this? Why do we not include os.hpp? src/hotspot/share/runtime/safefetch.inline.hpp line 31: > 29: > 30: #include "runtime/stubRoutines.hpp" > 31: #include "runtime/os.hpp" Please list the includes alphabetically. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From dholmes at openjdk.java.net Tue Mar 8 05:19:40 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 8 Mar 2022 05:19:40 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v2] In-Reply-To: References: Message-ID: > Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. > > This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. 
A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. David Holmes has updated the pull request incrementally with one additional commit since the last revision: Feedback from kbarrett and jrose. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7720/files - new: https://git.openjdk.java.net/jdk/pull/7720/files/a26c48fc..d0f8343b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7720&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7720&range=00-01 Stats: 6 lines in 2 files changed: 2 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7720.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7720/head:pull/7720 PR: https://git.openjdk.java.net/jdk/pull/7720 From dholmes at openjdk.java.net Tue Mar 8 05:19:41 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 8 Mar 2022 05:19:41 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 19:17:54 GMT, John R Rose wrote: > Suggested addition to rationale for not using global operator new: > > ``` > Typically, uses of the C++ global operator new are inadvertent and > therefore often associated with memory leaks. > ``` Added to the section on "Memory Allocation" as suggested with minor edit (dropped "C++"). 
------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From dholmes at openjdk.java.net Tue Mar 8 05:19:42 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 8 Mar 2022 05:19:42 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v2] In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 17:21:17 GMT, Kim Barrett wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Feedback from kbarrett and jrose. > > doc/hotspot-style.md line 669: > >> 667: problems as for ordinary namespace-scoped variables. So we avoid use of >> 668: `thread_local` in general, limiting its use to only those cases where dynamic >> 669: initialization and destruction are essential. See [JDK-8282469](https://bugs.openjdk.java.net/browse/JDK-8282469) > > I'd like a line break between "See" and the bug link, to reduce the line length. Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From jrose at openjdk.java.net Tue Mar 8 05:33:05 2022 From: jrose at openjdk.java.net (John R Rose) Date: Tue, 8 Mar 2022 05:33:05 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v2] In-Reply-To: References: Message-ID: On Tue, 8 Mar 2022 05:19:40 GMT, David Holmes wrote: >> Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. >> >> This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. 
>>
>> Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV.
>
> David Holmes has updated the pull request incrementally with one additional commit since the last revision:
>
>   Feedback from kbarrett and jrose.

Marked as reviewed by jrose (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7720

From jzhu at openjdk.java.net Tue Mar 8 08:37:05 2022
From: jzhu at openjdk.java.net (Joshua Zhu)
Date: Tue, 8 Mar 2022 08:37:05 GMT
Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding
In-Reply-To:
References:
Message-ID: <_fgxlEXvxncFvkqUreJ5Yo0WNyuaqhQoUGGmJmG8fPQ=.9b528744-d63c-49a0-9a45-4a2b36529061@github.com>

On Mon, 7 Mar 2022 22:27:15 GMT, Paul Sandoz wrote:

> AFAICT the field detection is not specific to what javac generates. Although the likelihood is small, there might be other fields that start with the same prefix and have the same type.
>
> Further, if javac changes its code generation strategy then this code may become redundant for newly generated code, but we might need to keep it around and maybe add another specific case for new code. (Pattern matching for switch is impacting, and will likely further impact, javac's generation strategy, although I don't know if it will impact this particular area.)
>
> I understand the motivation, but the approach is fragile and is difficult to maintain over time. Unfortunately we don't yet have a general mechanism for read-only arrays.
>
> My recommendation is to instead change the code in the Vector API.

The following minor adjustment for IntVector.species() could be made to resolve the performance issue I mentioned.
https://github.com/JoshuaZhuwj/jdk/commit/f817809d6846a5b64d014476d98da98a85933950
But it does not seem elegant.

How about adding the @Stable annotation to the SwitchMap generated for enum switches in the javac compiler instead?
Although class files compiled by old version javac cannot be constant-folded, it overcomes the drawback that may be introduced by my initial change you described. ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From duke at openjdk.java.net Tue Mar 8 08:41:07 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Tue, 8 Mar 2022 08:41:07 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v3] In-Reply-To: <3-7lQJObHUh4005vGNqLqMg8SNxYcqXJPllqD3bGXjg=.6372dde6-2867-4045-ae02-cc952ebada50@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <3-7lQJObHUh4005vGNqLqMg8SNxYcqXJPllqD3bGXjg=.6372dde6-2867-4045-ae02-cc952ebada50@github.com> Message-ID: On Tue, 8 Mar 2022 03:51:03 GMT, David Holmes wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Move WX functionality into os specific files > > src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.hpp line 57: > >> 55: static void current_thread_init_wx(); >> 56: >> 57: static void current_thread_assert_wx_state(WXMode expected); > > Can we move all these into the ThreadWXEnable class so they are not in the os namespace? Even the enum could move - though it will make the use-sites a bit more verbose. I won't insist on pushing this WX stuff even deeper, but if anyone else thinks it is a good idea ... :) I'm open for suggestions, but putting it there was the simplest way. The problem is that os is not a namespace, but a class. But this could and should probably be changed. > src/hotspot/share/prims/jni.cpp line 97: > >> 95: #include "utilities/macros.hpp" >> 96: #include "utilities/vmError.hpp" >> 97: #include "runtime/thread.inline.hpp" > > Why do we need this? Why do we not include os.hpp? We need this, because it does not compile (linker error) otherwise. But I forgot to include os.hpp (but included by thread.inline.hpp). 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From dholmes at openjdk.java.net Tue Mar 8 09:06:06 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 8 Mar 2022 09:06:06 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v3] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <3-7lQJObHUh4005vGNqLqMg8SNxYcqXJPllqD3bGXjg=.6372dde6-2867-4045-ae02-cc952ebada50@github.com> Message-ID: On Tue, 8 Mar 2022 08:36:41 GMT, Johannes Bechberger wrote: >> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.hpp line 57: >> >>> 55: static void current_thread_init_wx(); >>> 56: >>> 57: static void current_thread_assert_wx_state(WXMode expected); >> >> Can we move all these into the ThreadWXEnable class so they are not in the os namespace? Even the enum could move - though it will make the use-sites a bit more verbose. I won't insist on pushing this WX stuff even deeper, but if anyone else thinks it is a good idea ... :) > > I'm open for suggestions, but putting it there was the simplest way. The problem is that os is not a namespace, but a class. But this could and should probably be changed. I was suggesting pushing everything in to os::ThreadWXEnable class. >> src/hotspot/share/prims/jni.cpp line 97: >> >>> 95: #include "utilities/macros.hpp" >>> 96: #include "utilities/vmError.hpp" >>> 97: #include "runtime/thread.inline.hpp" >> >> Why do we need this? Why do we not include os.hpp? > > We need this, because it does not compile (linker error) otherwise. But I forgot to include os.hpp (but included by thread.inline.hpp). But you didn't add anything that needs it - in fact you deleted `thread->enable_wx` - so perhaps the linker error was from a different variant of the fix? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Tue Mar 8 09:10:02 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Tue, 8 Mar 2022 09:10:02 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v3] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <3-7lQJObHUh4005vGNqLqMg8SNxYcqXJPllqD3bGXjg=.6372dde6-2867-4045-ae02-cc952ebada50@github.com> Message-ID: On Tue, 8 Mar 2022 08:59:50 GMT, David Holmes wrote: >> I'm open for suggestions, but putting it there was the simplest way. The problem is that os is not a namespace, but a class. But this could and should probably be changed. > > I was suggesting pushing everything in to os::ThreadWXEnable class. I don't think that this would be good as this would lead to fairly long calls. I would rather create a class wx and place everything in this (renaming ThreadWXEnable to Enable). >> We need this, because it does not compile (linker error) otherwise. But I forgot to include os.hpp (but included by thread.inline.hpp). > > But you didn't add anything that needs it - in fact you deleted `thread->enable_wx` - so perhaps the linker error was from a different variant of the fix? You're right. That's weird. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Tue Mar 8 10:32:46 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Tue, 8 Mar 2022 10:32:46 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> > The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. > This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. 
Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: - Move code to os::current_thread_wx - Small fixes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7727/files - new: https://git.openjdk.java.net/jdk/pull/7727/files/a7c38e52..02048306 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7727&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7727&range=02-03 Stats: 100 lines in 28 files changed: 15 ins; 11 del; 74 mod Patch: https://git.openjdk.java.net/jdk/pull/7727.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7727/head:pull/7727 PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Tue Mar 8 12:10:00 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Tue, 8 Mar 2022 12:10:00 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> Message-ID: On Tue, 8 Mar 2022 10:32:46 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. 
> > Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: > > - Move code to os::current_thread_wx > - Small fixes I don't know why the Linux x86 build fails. I tested the current version with code related to #7591 and it seems to fix the remaining problems (I tested it also with NMT enabled). ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From dholmes at openjdk.java.net Tue Mar 8 12:39:12 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 8 Mar 2022 12:39:12 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> Message-ID: On Tue, 8 Mar 2022 10:32:46 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: > > - Move code to os::current_thread_wx > - Small fixes The Linux x86 build failure is not related to this and has already been fixed, so you should re-sync with master branch. 
src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.hpp line 45: > 43: #ifdef __APPLE__ > 44: > 45: class current_thread_wx { This violates the style guide for class names. It would be CurrentThreadWX - but ThreadWX seems sufficient to me. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Tue Mar 8 13:11:06 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Tue, 8 Mar 2022 13:11:06 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> Message-ID: On Tue, 8 Mar 2022 12:33:10 GMT, David Holmes wrote: >> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: >> >> - Move code to os::current_thread_wx >> - Small fixes > > src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.hpp line 45: > >> 43: #ifdef __APPLE__ >> 44: >> 45: class current_thread_wx { > > This violates the style guide for class names. It would be CurrentThreadWX - but ThreadWX seems sufficient to me. But os is okay? I just use this name for grouping. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Tue Mar 8 13:17:29 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Tue, 8 Mar 2022 13:17:29 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v5] In-Reply-To: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: > The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. > This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. Johannes Bechberger has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains ten commits: - Move code to os::current_thread_wx - Small fixes - Move WX functionality into os specific files - Minor fixes - Fix include for threadWXSetters.inline.hpp - Remove thread parameter from ThreadWXEnable - Remove thread parameter from os methods - Remove wx_init and current thread assert in safefetch - Use os::current_thread_change_wx instead of thread methods ------------- Changes: https://git.openjdk.java.net/jdk/pull/7727/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7727&range=04 Stats: 241 lines in 32 files changed: 52 ins; 111 del; 78 mod Patch: https://git.openjdk.java.net/jdk/pull/7727.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7727/head:pull/7727 PR: https://git.openjdk.java.net/jdk/pull/7727 From redestad at openjdk.java.net Tue Mar 8 13:54:02 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Tue, 8 Mar 2022 13:54:02 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v11] In-Reply-To: <8p4ATe7aWCTiQ4umjuDmMO7mNozkF7S9aHvlYW0t7nI=.7f3ba36b-2f11-4a47-947e-f936d5929a0f@github.com> References: <8p4ATe7aWCTiQ4umjuDmMO7mNozkF7S9aHvlYW0t7nI=.7f3ba36b-2f11-4a47-947e-f936d5929a0f@github.com> Message-ID: On Mon, 7 Mar 2022 23:13:36 GMT, Claes Redestad wrote: >> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. 
>> >> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 >> >> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. >> >> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. >> >> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). > > Claes Redestad has updated the pull request incrementally with two additional commits since the last revision: > > - use 32-bit mask to calculate correct remainder value > - ary1 not required to have USE_KILL effect aarch64: https://jmh.morethan.io/?gists=281ac3c29ef85f9f64c0440cd7f8c247,0a2c7d3b803f9cd5799f6af95eb6a90a Brings decent gains on the "sunshine case" and the mixed microbenchmarks, but there are a few glaring exceptions which I'm still investigating. 
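For readers following the thread, the relationship between the two methods can be sketched in scalar form. This is only a minimal C++ sketch, not the intrinsic and not the JDK source: the names `count_positives` and `has_negatives` and their signatures are assumptions made for illustration, and "positive" is read here as "sign bit clear".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference: the number of leading bytes in
// [off, off + len) whose sign bit is clear.
inline std::size_t count_positives(const std::int8_t* ba, std::size_t off, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        if (ba[off + i] < 0) {
            return i;  // index of the first negative byte, relative to off
        }
    }
    return len;  // no negative byte in the range
}

// The old query falls out of the new one: a range has a negative byte
// exactly when the positive prefix is shorter than the range.
inline bool has_negatives(const std::int8_t* ba, std::size_t off, std::size_t len) {
    return count_positives(ba, off, len) != len;
}
```

A caller that gets back a prefix length can handle that prefix on the fast path and reserve the slower path for the remainder, which is presumably where the gains on mixed input come from.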
------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From kbarrett at openjdk.java.net Tue Mar 8 16:13:10 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 8 Mar 2022 16:13:10 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v2] In-Reply-To: References: Message-ID: <3-Pvy2aLpKxsfbx5NNDrwExhNdtC8-fb2LXqlgLRTzs=.e9790fcd-d5f1-45d5-8d29-b49036b45d3b@github.com> On Tue, 8 Mar 2022 05:19:40 GMT, David Holmes wrote: >> Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. >> >> This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Feedback from kbarrett and jrose. Changes requested by kbarrett (Reviewer). doc/hotspot-style.md line 468: > 466: (operator new and related functions). Typically, uses of the global > 467: operator new are inadvertent and therefore often associated with memory > 468: leaks. Use of these functions by HotSpot code is disabled for some platforms. I don't agree with the new sentence about uses of global operator new. "Normal" C++ use of global operator new is no more associated with memory leaks than are the other allocations we do in HotSpot. 
The rationale for disallowing use of global operator new in HotSpot code (as I understand it) is that we want all of our heap allocations to be trackable via NMT. Any uses of global operator new would bypass that. ------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From psandoz at openjdk.java.net Tue Mar 8 16:33:07 2022 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Tue, 8 Mar 2022 16:33:07 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 07:13:20 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long simultaneously in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > > For ScatterStore operation of LongVector.SPECIES_512/IntVector.SPECIES_512, VectorShape.S_256_BIT/S_512_BIT is the actual length of indexMap vector respectively. > > IntSpecies species(VectorShape s) > > returns the corresponding IntSpecies by Switch on Enum type "VectorShape". [1] > > With this change introduced, elements in the SwitchMap array (initialized in clinit) can be constant-folded so that determined IntSpecies can be acquired for a constant VectorShape. > > jtreg test passed without new failure. > Please help review this change and let me know if any comments. > > [1] https://github.com/openjdk/jdk/blob/894ffb098c80bfeb4209038c017d01dbf53fac0f/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java#L4043 Unfortunately the `@Stable` annotation can only be used by classes in the `java.base` module and other select modules. It cannot be used generally by application code given its unsafe nature. We need read-only/frozen/constant arrays to do this properly in code outside of the JDK. Perhaps there is a clever alternative strategy javac could use, e.g. 
using a bootstrap method although nothing specific comes to mind at this moment. I would hold off on that until the code generation strategy for patterns in switch settles down. ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From jrose at openjdk.java.net Tue Mar 8 17:13:12 2022 From: jrose at openjdk.java.net (John R Rose) Date: Tue, 8 Mar 2022 17:13:12 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v2] In-Reply-To: <3-Pvy2aLpKxsfbx5NNDrwExhNdtC8-fb2LXqlgLRTzs=.e9790fcd-d5f1-45d5-8d29-b49036b45d3b@github.com> References: <3-Pvy2aLpKxsfbx5NNDrwExhNdtC8-fb2LXqlgLRTzs=.e9790fcd-d5f1-45d5-8d29-b49036b45d3b@github.com> Message-ID: On Tue, 8 Mar 2022 16:09:52 GMT, Kim Barrett wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Feedback from kbarrett and jrose. > > doc/hotspot-style.md line 468: > >> 466: (operator new and related functions). Typically, uses of the global >> 467: operator new are inadvertent and therefore often associated with memory >> 468: leaks. Use of these functions by HotSpot code is disabled for some platforms. > > I don't agree with the new sentence about uses of global operator new. "Normal" C++ use of global operator new is no more associated with memory leaks than are the other allocations we do in HotSpot. The rationale for disallowing use of global operator new in HotSpot code (as I understand it) is that we want all of our heap allocations to be trackable via NMT. Any uses of global operator new would bypass that. First, it's not exactly a new sentence, just one moved from elsewhere in our code base (from a file that was deleted in the companion PR to this one). Second, it is true; we have seen problems in the (distant) past of exactly the form claimed. The problem is that HotSpot is an irregular user of C++, including via assembly code and tortuous stack frame manipulation (deopt handlers etc.). 
It's easy to accidentally emit a use of global `op new` through ten layers of C++ header file, and in HotSpot it's also easy to break the careful matching of constructors to destructors that C++ relies on. The result is a storage leak. Kim, I could see you thinking, also, that this sort of observation doesn't belong in a style guide, and a lot of these nuggets might tend to bloat it, which obscures the useful parts of the style guide. (An over-long guide is not a useful guide after all.) You might suggest where this rationale information goes, if not here. But I think it fits well enough here. And if it isn't inserted here, or some other new place, it will be lost because of David's file deletion in the other PR related to this one. I don't want it to get lost. ------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From mdoerr at openjdk.java.net Tue Mar 8 17:29:07 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Tue, 8 Mar 2022 17:29:07 GMT Subject: RFR: 8261492: Shenandoah: reconsider forwardee accesses memory ordering [v11] In-Reply-To: References: Message-ID: On Thu, 2 Sep 2021 16:06:45 GMT, Aleksey Shipilev wrote: >> Shenandoah carries forwardee information in the object's mark word. Installing the new mark word is effectively "releasing" the object copy, and reading from the new mark word is "acquiring" that object copy. >> >> For the forwardee update side, Hotspot's default for atomic operations is memory_order_conservative, which emits two-way memory fences around the CASes at least on AArch64 and PPC64. This seems to be excessive for Shenandoah forwardee updates, and "release" is enough. >> >> The reader side is much more interesting, because we generally want "consume", but it is not available. We can do "acquire", but it regresses performance all too much. Close inspection of the code reveals we need "acquire" on many paths, but not on the most critical one: heap updates. 
This must explain why current weaker reader side was never seen to fail, and this also opens a way to get `acquire`-in-lieu-of-`consume` without the observable performance penalty. >> >> The relaxation in forwardee installation improves concurrent evacuation quite visibly. See for example GC cycle times with SPECjvm2008, Compiler.sunflow on AArch64: >> >> Before: >> >> >> [info][gc,stats] Concurrent Evacuation = 3.421 s (a = 21247 us) (n = 161) >> [info][gc,stats] Concurrent Evacuation = 3.584 s (a = 21080 us) (n = 170) >> [info][gc,stats] Concurrent Evacuation = 3.226 s (a = 21088 us) (n = 153) >> [info][gc,stats] Concurrent Evacuation = 3.270 s (a = 20827 us) (n = 157) >> [info][gc,stats] Concurrent Evacuation = 3.339 s (a = 20742 us) (n = 161) >> >> >> After: >> >> [info][gc,stats] Concurrent Evacuation = 3.109 s (a = 18617 us) (n = 167) >> [info][gc,stats] Concurrent Evacuation = 3.027 s (a = 18918 us) (n = 160) >> [info][gc,stats] Concurrent Evacuation = 2.862 s (a = 17669 us) (n = 162) >> [info][gc,stats] Concurrent Evacuation = 2.858 s (a = 17425 us) (n = 164) >> [info][gc,stats] Concurrent Evacuation = 2.883 s (a = 17685 us) (n = 163) >> >> >> Additional testing: >> - [x] Linux x86_64 `hotspot_gc_shenandoah` >> - [x] Linux AArch64 `hotspot_gc_shenandoah` >> - [x] Linux x86_64 `tier1` with Shenandoah >> - [x] Linux AArch64 `tier1` with Shenandoah > > Aleksey Shipilev has updated the pull request incrementally with three additional commits since the last revision: > > - More natural order of arguments > - Move the fwdptr-related updaters to ShenandoahForwarding > - Avoid acq_rel that is promoted to seq_cst on ARM <8.3 Is this PR still planned? Should we test it on PPC64? 
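The release/acquire pairing described in the quoted text can be sketched with standard C++ atomics. This is a minimal sketch under stated assumptions, not Shenandoah code: it uses `std::atomic` rather than HotSpot's `Atomic` API, the `Obj` type and both function names are hypothetical, and a mark value of 0 is assumed to mean "not forwarded". It needs C++17, where the failure ordering of a compare-exchange may be chosen independently of the success ordering.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical object with a mark word; 0 means "not forwarded" in this sketch.
struct Obj {
    std::atomic<std::uintptr_t> mark{0};
    int payload = 0;
};

// Writer side: publish `copy` as the forwardee of `from`.
// On success the CAS is a release, so earlier writes to *copy become
// visible to any thread that later acquires the mark word.
// On failure the load is an acquire, so the loser may safely read
// the winner's copy. Returns whichever forwardee ended up installed.
inline Obj* install_forwardee(Obj& from, Obj* copy) {
    std::uintptr_t expected = 0;
    const std::uintptr_t desired = reinterpret_cast<std::uintptr_t>(copy);
    if (from.mark.compare_exchange_strong(expected, desired,
                                          std::memory_order_release,
                                          std::memory_order_acquire)) {
        return copy;
    }
    return reinterpret_cast<Obj*>(expected);
}

// Reader side: acquire the forwardee, or nullptr if not forwarded.
inline Obj* forwardee_acquire(const Obj& from) {
    const std::uintptr_t v = from.mark.load(std::memory_order_acquire);
    return v == 0 ? nullptr : reinterpret_cast<Obj*>(v);
}
```

The sketch uses acquire on every read; the point made in the quoted description is that the hottest path may not need it, which this example does not attempt to model.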
------------- PR: https://git.openjdk.java.net/jdk/pull/2496 From forax at openjdk.java.net Tue Mar 8 17:39:08 2022 From: forax at openjdk.java.net (=?UTF-8?B?UsOpbWk=?= Forax) Date: Tue, 8 Mar 2022 17:39:08 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Tue, 8 Mar 2022 16:29:56 GMT, Paul Sandoz wrote: >> I came across a performance issue when using scatter store VectorAPI for Integer and Long simultaneously in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. >> >> For ScatterStore operation of LongVector.SPECIES_512/IntVector.SPECIES_512, VectorShape.S_256_BIT/S_512_BIT is the actual length of indexMap vector respectively. >> >> IntSpecies species(VectorShape s) >> >> returns the corresponding IntSpecies by Switch on Enum type "VectorShape". [1] >> >> With this change introduced, elements in the SwitchMap array (initialized in clinit) can be constant-folded so that determined IntSpecies can be acquired for a constant VectorShape. >> >> jtreg test passed without new failure. >> Please help review this change and let me know if any comments. >> >> [1] https://github.com/openjdk/jdk/blob/894ffb098c80bfeb4209038c017d01dbf53fac0f/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java#L4043 > > Unfortunately the `@Stable` annotation can only be used by classes in the `java.base` module and other select modules. It cannot be used generally by application code given its unsafe nature. > > We need read-only/frozen/constant arrays to do this properly in code outside of the JDK. > > Perhaps there is a clever alternative strategy javac could use, e.g. using a bootstrap method although nothing specific comes to mind at this moment. I would hold off on that until the code generation strategy for patterns in switch settles down. 
@PaulSandoz, i believe you can add "Stable" to the VarHandle API, as a kind of access like plain, opaque, volatile, etc. ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From iklam at openjdk.java.net Tue Mar 8 19:29:33 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 8 Mar 2022 19:29:33 GMT Subject: RFR: 8253495: CDS generates non-deterministic output Message-ID: This patch makes the result of "java -Xshare:dump" deterministic: - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp - Fixed a problem in hashtable ordering in heapShared.cpp - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). Testing under way: - tier1~tier5 - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). 
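The padding point in the list above can be illustrated with a small sketch. It is not the actual `BasicHashtableEntry` layout or the patch itself: `Entry` and `init_entry` are hypothetical names, and the comment about the gap assumes a typical LP64 ABI. Strictly speaking the language does not promise what padding bytes hold after member stores; the sketch relies on the common behavior that compilers leave them alone.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical entry: on a typical 64-bit ABI there are 4 bytes of
// padding between `hash` and `next`.
struct Entry {
    std::uint32_t hash;
    Entry*        next;
};

// Zero the whole entry first so the padding holds known bytes,
// then fill in the real fields.
inline void init_entry(Entry* e, std::uint32_t hash, Entry* next) {
    std::memset(e, 0, sizeof(Entry));
    e->hash = hash;
    e->next = next;
}
```

Without the `memset`, two entries with identical fields could still differ byte for byte when written to an archive, which is the kind of difference a binary comparison of two dumps would flag.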
------------- Commit messages: - Merge branch 'master' into 8253495-cds-generateds-non-deterministic-output-2 - fixed test - more fixes - 8253495: CDS generates non-deterministic output Changes: https://git.openjdk.java.net/jdk/pull/7748/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253495 Stats: 73 lines in 12 files changed: 49 ins; 7 del; 17 mod Patch: https://git.openjdk.java.net/jdk/pull/7748.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748 PR: https://git.openjdk.java.net/jdk/pull/7748 From erikj at openjdk.java.net Tue Mar 8 19:55:05 2022 From: erikj at openjdk.java.net (Erik Joelsson) Date: Tue, 8 Mar 2022 19:55:05 GMT Subject: RFR: 8253495: CDS generates non-deterministic output In-Reply-To: References: Message-ID: On Tue, 8 Mar 2022 19:11:02 GMT, Ioi Lam wrote: > This patch makes the result of "java -Xshare:dump" deterministic: > - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp > - Fixed a problem in hashtable ordering in heapShared.cpp > - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. > - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh > > Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). > > Testing under way: > - tier1~tier5 > - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). compare.sh looks good. Can't comment on the rest. Thank you for fixing this! ------------- Marked as reviewed by erikj (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7748 From lucy at openjdk.java.net Tue Mar 8 20:15:05 2022 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 8 Mar 2022 20:15:05 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v11] In-Reply-To: <8p4ATe7aWCTiQ4umjuDmMO7mNozkF7S9aHvlYW0t7nI=.7f3ba36b-2f11-4a47-947e-f936d5929a0f@github.com> References: <8p4ATe7aWCTiQ4umjuDmMO7mNozkF7S9aHvlYW0t7nI=.7f3ba36b-2f11-4a47-947e-f936d5929a0f@github.com> Message-ID: On Mon, 7 Mar 2022 23:13:36 GMT, Claes Redestad wrote: >> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. >> >> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 >> >> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. >> >> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. 
One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. >> >> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). > > Claes Redestad has updated the pull request incrementally with two additional commits since the last revision: > > - use 32-bit mask to calculate correct remainder value > - ary1 not required to have USE_KILL effect Changes look good. Due to lack of specific expertise, I can't fully approve aarch64 and x86 changes. ------------- Marked as reviewed by lucy (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7231 From dcubed at openjdk.java.net Tue Mar 8 21:22:08 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Tue, 8 Mar 2022 21:22:08 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v2] In-Reply-To: References: Message-ID: On Tue, 8 Mar 2022 05:19:40 GMT, David Holmes wrote: >> Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. >> >> This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. 
>> >> Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Feedback from kbarrett and jrose. The white space changes threw me for a minute and then I remembered to review the content changes via .md file and not via the .html file... Thumbs up. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7720 From dcubed at openjdk.java.net Tue Mar 8 21:22:09 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Tue, 8 Mar 2022 21:22:09 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v2] In-Reply-To: References: <3-Pvy2aLpKxsfbx5NNDrwExhNdtC8-fb2LXqlgLRTzs=.e9790fcd-d5f1-45d5-8d29-b49036b45d3b@github.com> Message-ID: On Tue, 8 Mar 2022 17:09:53 GMT, John R Rose wrote: >> doc/hotspot-style.md line 468: >> >>> 466: (operator new and related functions). Typically, uses of the global >>> 467: operator new are inadvertent and therefore often associated with memory >>> 468: leaks. Use of these functions by HotSpot code is disabled for some platforms. >> >> I don't agree with the new sentence about uses of global operator new. "Normal" C++ use of global operator new is no more associated with memory leaks than are the other allocations we do in HotSpot. The rationale for disallowing use of global operator new in HotSpot code (as I understand it) is that we want all of our heap allocations to be trackable via NMT. Any uses of global operator new would bypass that. > > First, it's not exactly a new sentence, just one moved from elsewhere in our code base (from a file that was deleted in the companion PR to this one). > > Second, it is true; we have seen problems in the (distant) past of exactly the form claimed. 
The problem is that HotSpot is an irregular user of C++, including via assembly code and tortuous stack frame manipulation (deopt handlers etc.). It's easy to accidentally emit a use of global `op new` through ten layers of C++ header file, and in HotSpot it's also easy to break the careful matching of constructors to destructors that C++ relies on. The result is a storage leak. > > Kim, I could see you thinking, also, that this sort of observation doesn't belong in a style guide, and a lot of these nuggets might tend to bloat which obscures the useful parts of the style guide. (An over-long guide is not a useful guide after all.) You might suggest where this rationale information goes, if not here. But I think it fits well enough here. And if it isn't inserted here, or some other new place, it will be lost because of David's file deletion in the other PR related to this one. I don't want it to get lost. I would also like that sentence to not be lost. ------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From dcubed at openjdk.java.net Tue Mar 8 21:36:08 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Tue, 8 Mar 2022 21:36:08 GMT Subject: RFR: 8282469: Allow considered use of C++ thread_local in Hotspot In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 06:12:03 GMT, David Holmes wrote: > This patch provides a means for using C++ `thread_local` when it is essential - see JBS for more details. > > There are three parts: > > 1. Add the new #define for `thread_local` > 2. Remove `operator_new.cpp` as use of C++ `thread_local` with non-trivial cleanup actions requires use of global operators new/delete. These are still excluded for hotspot use via a link-time check. > 3. 
Remove the prohibition on using `thread_local` from the hotspot style guide > > Due to the way hotspot style guide changes must be done, part 3 is being done under a sub-task in PR https://github.com/openjdk/jdk/pull/7720 and the two PRs will integrate at the same time. > > Testing: > - manual testing of the Panama use case as referenced in the JBS issue > - Tiers 1-3 > > Thanks, > David Thumbs up. I had to reread these two sentences a couple of times: 2a) Remove operator_new.cpp as use of C++ thread_local with non-trivial cleanup actions requires use of global operators new/delete. 2b) These are still excluded for hotspot use via a link-time check. I think I've convinced myself that the link-time check won't run afoul of the uses that are planned for Panama. I did that by rephrasing the above into: > Panama's use of thread_local will have non-trivial cleanup actions and those > will not be complained about by the link-time check. Please let me know if I have this correct. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7719 From dcubed at openjdk.java.net Tue Mar 8 21:36:08 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Tue, 8 Mar 2022 21:36:08 GMT Subject: RFR: 8282469: Allow considered use of C++ thread_local in Hotspot In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 21:21:28 GMT, David Holmes wrote: >> src/hotspot/share/memory/operator_new.cpp line 37: >> >>> 35: // a memory leak. Use CHeapObj as the base class of such objects to make it explicit >>> 36: // that they're allocated on the C heap. >>> 37: // Commented out in product version to avoid conflicts with third-party C++ native code. >> >> There's a little bit of policy information here that is being deleted. 
It overlaps with the section `### Memory Allocation` in `hotspot-style.md`, but includes this information which might not be stated elsewhere: >> >>> Typically, uses of the C++ global operator new are inadvertent and therefore often associated with memory leaks. >> >> (This is my rephrasing, perhaps appropriate to the style guide.) >> >> Or is a point like this made in the config file which prevents direct linkage? (I don't know where that file is.) > > The link-time check is expressed in open/make/hotspot/lib/CompileJvm.gmk but simply states: > > > # Hotspot disallows the use of global operators 'new' and 'delete'. This build > # time check helps enforce this requirement. ... > > > The prohibition on using global operator new should definitely be explicitly documented somewhere, so the style guide seems a suitable place. I agree that the prohibition on using global operator new should definitely be explicitly documented somewhere and I'm good with adding that warning to the HotSpot style guide. ------------- PR: https://git.openjdk.java.net/jdk/pull/7719 From iklam at openjdk.java.net Wed Mar 9 05:10:44 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Mar 2022 05:10:44 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: > This patch makes the result of "java -Xshare:dump" deterministic: > - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp > - Fixed a problem in hashtable ordering in heapShared.cpp > - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. > - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh > > Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). 
> > Testing under way: > - tier1~tier5 > - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Fixed zero build ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7748/files - new: https://git.openjdk.java.net/jdk/pull/7748/files/1fb3f830..44db40f1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=00-01 Stats: 5 lines in 2 files changed: 3 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7748.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748 PR: https://git.openjdk.java.net/jdk/pull/7748 From stuefe at openjdk.java.net Wed Mar 9 06:52:58 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 9 Mar 2022 06:52:58 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> Message-ID: On Tue, 8 Mar 2022 13:08:13 GMT, Johannes Bechberger wrote: >> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.hpp line 45: >> >>> 43: #ifdef __APPLE__ >>> 44: >>> 45: class current_thread_wx { >> >> This violates the style guide for class names. It would be CurrentThreadWX - but ThreadWX seems sufficient to me. > > But os is okay? I just use this name for grouping. "os" is okay. Its historical. Its also more of a namespace really. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Wed Mar 9 07:04:00 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Wed, 9 Mar 2022 07:04:00 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> Message-ID: On Wed, 9 Mar 2022 06:49:37 GMT, Thomas Stuefe wrote: >> But os is okay? I just use this name for grouping. > > "os" is okay. Its historical. Its also more of a namespace really. but current_thread_wx would be too? Maybe I could change both to namespaces? ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Wed Mar 9 07:04:00 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Wed, 9 Mar 2022 07:04:00 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> Message-ID: On Wed, 9 Mar 2022 07:00:29 GMT, Johannes Bechberger wrote: >> "os" is okay. Its historical. Its also more of a namespace really. > > but current_thread_wx would be too? Maybe I could change both to namespaces? But the style guide has no opinions on them... 
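The grouping question in this exchange can be made concrete with a short sketch of the namespace variant. Everything here is an assumption for illustration rather than the code under review: the namespace name `thread_wx`, the functions `current` and `change`, the enumerator names, and the initial mode are all hypothetical, and the sketch uses plain C++ `thread_local`, whose use in HotSpot is itself the subject of the style guide discussion elsewhere in this digest.

```cpp
#include <cassert>

// Hypothetical mode names for this sketch.
enum WXMode { WXWrite, WXExec };

// A namespace used purely for grouping: no instances, no constructor,
// just per-thread state and the functions that touch it.
namespace thread_wx {

// One copy per OS thread; the initial value is an assumption.
thread_local WXMode current_mode = WXWrite;

inline WXMode current() {
    return current_mode;
}

// Switch the calling thread's mode and return the previous one.
// Real code would also ask the OS to flip the page protection here;
// that call is deliberately left out of this sketch.
inline WXMode change(WXMode new_mode) {
    const WXMode old_mode = current_mode;
    current_mode = new_mode;
    return old_mode;
}

}  // namespace thread_wx
```

Because the state belongs to the OS thread rather than to a `Thread` object, nothing in the sketch needs a current-thread lookup, which is the property the pull request is after for SafeFetch.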
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From dholmes at openjdk.java.net Wed Mar 9 07:07:58 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 9 Mar 2022 07:07:58 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 05:10:44 GMT, Ioi Lam wrote: >> This patch makes the result of "java -Xshare:dump" deterministic: >> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp >> - Fixed a problem in hashtable ordering in heapShared.cpp >> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. >> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh >> >> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). >> >> Testing under way: >> - tier1~tier5 >> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Fixed zero build I have reservations about contorting things this way just to get "deterministic output". The VM needs to fully initialize and then become quiescent before the dump occurs, and as I say below if you don't start other threads then you potentially remove part of the archive because classes won't be loaded by those threads. I think if you care about the order of dumping classes then you should control that order, you don't try to force the order of loading. Can't you sort things before dumping? ie rehash/rebuild the hashtables etc so it has a canonical ordering? I see this was mentioned in the bug report and is considered a largish/complex fix, but it would be the proper fix IMO. 
Thanks, David src/hotspot/share/prims/jvm.cpp line 2873: > 2871: // execute in parallel, symbols and classes may be loaded in > 2872: // random orders which will make the resulting CDS archive > 2873: // non-deterministic. Yes but by not starting these threads you are potentially excluding a range of classes from the shared archive! ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From iklam at openjdk.java.net Wed Mar 9 07:18:06 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Mar 2022 07:18:06 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 06:49:02 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed zero build > > src/hotspot/share/prims/jvm.cpp line 2873: > >> 2871: // execute in parallel, symbols and classes may be loaded in >> 2872: // random orders which will make the resulting CDS archive >> 2873: // non-deterministic. > > Yes but by not starting these threads you are potentially excluding a range of classes from the shared archive! `java -Xshare:dump` loads all classes specified in a classlist, which is created without this thread-disabling hack. The number of classes in the CDS archive is the same before/after this PR. The size of the CDS archive is identical. ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From dholmes at openjdk.java.net Wed Mar 9 07:29:04 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 9 Mar 2022 07:29:04 GMT Subject: RFR: 8282469: Allow considered use of C++ thread_local in Hotspot In-Reply-To: References: Message-ID: On Tue, 8 Mar 2022 21:33:14 GMT, Daniel D. Daugherty wrote: > Please let me know if I have this correct. @dcubed-ojdk your statement is not wrong but it is an odd way of rephrasing things.
The current change has no effect on the link-time check, so if a panama use would run afoul of that check before, then it will still run afoul of it. But I don't expect it to run afoul in either case. The problem I'm fixing with this part of the change is that the C++ library will currently link to our (booby-trapped) definitions of global operators new and delete, where it needs the real ones, so I deleted our definitions. We still don't want hotspot code to use these global external definitions though and that is where the pre-existing link-time check comes in - we check all libjvm component object files for symbols referring to the disallowed global methods. Hope that clarifies. ------------- PR: https://git.openjdk.java.net/jdk/pull/7719 From dholmes at openjdk.java.net Wed Mar 9 07:34:02 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 9 Mar 2022 07:34:02 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> Message-ID: On Tue, 8 Mar 2022 12:07:08 GMT, Johannes Bechberger wrote: >> Johannes Bechberger has refreshed the contents of this pull request, and previous commits have been removed. Incremental views are not available. > > I don't know why the Linux x86 build fails. > > I tested the current version with code related to #7591 and it seems to fix the remaining problems (I tested it also with NMT enabled). @parttimenerd please never force-push in an active review as it completely destroys the review history and comment context!
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From dholmes at openjdk.java.net Wed Mar 9 07:34:02 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 9 Mar 2022 07:34:02 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> Message-ID: On Wed, 9 Mar 2022 07:01:06 GMT, Johannes Bechberger wrote: >> but current_thread_wx would be too? Maybe I could change both to namespaces? > > But the style guide has no opinions on them... If/when the styleguide has an opinion on namespaces I would expect the same naming style to apply as for Classes. Hotspot is full of historical quirks like "class os" I'm afraid. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From iklam at openjdk.java.net Wed Mar 9 07:36:02 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Mar 2022 07:36:02 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 07:04:56 GMT, David Holmes wrote: > I have reservations about contorting things this way just to get "deterministic output". > > The VM needs to fully initialize and then become quiescent before the dump occurs, and as I say below if you don't start other threads then you potentially remove part of the archive because classes won't be loaded by those threads. > > I think if you care about the order of dumping classes then you should control that order, you don't try to force the order of loading. Can't you sort things before dumping? ie rehash/rebuild the hashtables etc so it has a canonical ordering? I see this was mentioned in the bug report and is considered a largish/complex fix, but it would be the proper fix IMO. 
> > Thanks, David I tried the "proper" approach but it's very complicated. I already have an implementation that sorts all the metadata. However, the CDS archive also contains a heap dump, which includes Java HashMaps. If I allow those 3 Java threads to start, some HashMaps in the module graph will have unstable ordering. I think the reason is concurrent thread execution causes unstable assignment of the identity_hash for objects in the heap dump. I cannot think of a clean way to fix this. The alternative, disabling Java thread starts, is much simpler and much more appealing to me. ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From stuefe at openjdk.java.net Wed Mar 9 08:02:01 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 9 Mar 2022 08:02:01 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 05:10:44 GMT, Ioi Lam wrote: >> This patch makes the result of "java -Xshare:dump" deterministic: >> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp >> - Fixed a problem in hashtable ordering in heapShared.cpp >> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. >> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh >> >> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). >> >> Testing under way: >> - tier1~tier5 >> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Fixed zero build Hi Ioi, some questions, comments inline. Like David in the comments, I am also a bit vague on the usefulness, but I may not know the whole story. 
Is it to enable repackagers like Debian to check the "reproducible" tickbox on their OpenJDK package? Or is there a practical need for this?

Thanks, Thomas

src/hotspot/share/prims/jvm.cpp line 2887:

> 2885: return;
> 2886: }
> 2887: #endif

Should we do this for jni_AttachCurrentThread too?

src/hotspot/share/utilities/hashtable.hpp line 42:

> 40:
> 41: LP64_ONLY(unsigned int _gap;)
> 42:

For 64-bit, you now lose packing potential in the theoretical case the following payload does not have to be aligned to 64 bit. E.g. for T=char, where the whole entry would fit into 8 bytes. Probably does not matter as long as entries are allocated individually from C-heap which is a lot more wasteful anyway.

For 32-bit, I think you may have the same problem if the payload starts with a uint64_t. Would that not be aligned to a 64-bit boundary too? Whether or not you build on 64-bit?

I think setting the memory, or at least the first 8..16 bytes, of the entry to zero in BasicHashtable::new_entry could be more robust: (16 bytes in case the payload starts with a long double but that may be overthinking it :)

template <MEMFLAGS F> BasicHashtableEntry<F>* BasicHashtable<F>::new_entry(unsigned int hashValue) {
  char* p = NEW_C_HEAP_ARRAY(char, this->entry_size(), F);
  ::memset(p, 0, MIN2(this->entry_size(), 16)); // needed for reproducible output
  BasicHashtableEntry<F>* entry = ::new (p) BasicHashtableEntry<F>(hashValue);
  return entry;
}

If you are worried about performance, this may also be controlled by a template parameter, and then you do it just for the system dictionary.
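
To make the padding issue discussed above concrete, here is a minimal standalone sketch of the explicit-gap approach. It is not HotSpot code: `DemoEntry` and `bytes_independent_of_junk` are hypothetical names invented for illustration. The sketch declares the filler as a real member and zeroes it in the constructor, so the raw bytes of an entry do not depend on what the allocator handed back. The alternative of clearing raw storage before placement new relies on the compiler keeping those stores; as far as I understand, some optimizers may treat stores made before an object's constructor runs as dead (GCC documents this under `-flifetime-dse`), which is why the explicit-field variant is shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical stand-in for a hashtable entry; not the HotSpot class.
struct DemoEntry {
  uint32_t   _hash;
  uint32_t   _gap;   // explicit filler where the compiler would otherwise pad on LP64
  DemoEntry* _next;
  explicit DemoEntry(uint32_t hash) : _hash(hash), _gap(0), _next(nullptr) {}
};

// With the filler declared explicitly, no implicit padding should remain.
static_assert(sizeof(DemoEntry) == 2 * sizeof(uint32_t) + sizeof(DemoEntry*),
              "unexpected implicit padding in DemoEntry");

// Build one entry in zero-filled storage and one in storage pre-filled with
// `junk`, then report whether their raw bytes are identical.
bool bytes_independent_of_junk(unsigned char junk, uint32_t hash) {
  alignas(DemoEntry) unsigned char a[sizeof(DemoEntry)];
  alignas(DemoEntry) unsigned char b[sizeof(DemoEntry)];
  std::memset(a, 0, sizeof(a));
  std::memset(b, junk, sizeof(b));
  DemoEntry* ea = ::new (a) DemoEntry(hash);
  DemoEntry* eb = ::new (b) DemoEntry(hash);
  return std::memcmp(ea, eb, sizeof(DemoEntry)) == 0;
}
```

Because every byte of `DemoEntry` belongs to a member that the constructor writes, a byte-for-byte dump of such entries is the same from run to run, at the cost of the lost packing potential mentioned above.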
------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From stuefe at openjdk.java.net Wed Mar 9 08:11:08 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 9 Mar 2022 08:11:08 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v2] In-Reply-To: References: <3-Pvy2aLpKxsfbx5NNDrwExhNdtC8-fb2LXqlgLRTzs=.e9790fcd-d5f1-45d5-8d29-b49036b45d3b@github.com> Message-ID: On Tue, 8 Mar 2022 21:18:02 GMT, Daniel D. Daugherty wrote: >> First, it's not exactly a new sentence, just one moved from elsewhere in our code base (from a file that was deleted in the companion PR to this one). >> >> Second, it is true; we have seen problems in the (distant) past of exactly the form claimed. The problem is that HotSpot is an irregular user of C++, including via assembly code and tortuous stack frame manipulation (deopt handlers etc.). It's easy to accidentally emit a use of of global `op new` through ten layers of C++ header file, and in HotSpot it's also easy to break the careful matching of constructors to destructors that C++ relies on. The result is a storage leak. >> >> Kim, I could see you thinking, also, that this sort of observation doesn't belong in a style guide, and a lot of these nuggets might tend to bloat which obscures the useful parts of the style guide. (An over-long guide is not a useful guide after all.) You might suggest where this rationale information goes, if not here. But I think it fits well enough here. And if it isn't inserted here, or some other new place, it will be lost because of David's file deletion in the other PR related to this one. I don't want it to get lost. > > I would also like that sentence to not be lost. Would this not be something for a footnote or an addendum? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From duke at openjdk.java.net Wed Mar 9 08:20:05 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Wed, 9 Mar 2022 08:20:05 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v5] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: <_MOmqVYDaAuWfkuGPnPfm10pUCWb0nbH7fedPf4llSo=.7288c328-7ce5-4537-9a88-7b1c989f3f55@github.com> On Tue, 8 Mar 2022 13:17:29 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: > > - Move code to os::current_thread_wx > - Small fixes > - Move WX functionality into os specific files > - Minor fixes > - Fix include for threadWXSetters.inline.hpp > - Remove thread parameter from ThreadWXEnable > - Remove thread parameter from os methods > - Remove wx_init and current thread assert in safefetch > - Use os::current_thread_change_wx instead of thread methods Oh, I did not know that. Sorry for that, I just wanted to rebase it and forgot that this would change all the commit ids. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Wed Mar 9 08:35:41 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Wed, 9 Mar 2022 08:35:41 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: > The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. > This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. 
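
As a rough illustration of the idea in the description above (per-OS-thread state that needs no thread object), the following sketch keeps a mode in a `thread_local` variable and switches it with an RAII guard that restores the previous value, so guards can nest. All names (`DemoMode`, `current_mode`, `DemoModeMark`) are hypothetical, the initial mode is an arbitrary assumption, and a real port would additionally make the platform call that actually flips the protection.

```cpp
#include <cassert>

// Hypothetical two-state mode; names chosen for this sketch only.
enum class DemoMode { Write, Exec };

// One mode per OS thread. No thread object is needed to reach it.
// Starting in Exec is an assumption made for the demo.
inline DemoMode& current_mode() {
  thread_local DemoMode mode = DemoMode::Exec;
  return mode;
}

// RAII guard: switches the mode on construction and restores the previous
// mode on destruction, which is what makes nested guards behave correctly.
class DemoModeMark {
  DemoMode _old;
 public:
  explicit DemoModeMark(DemoMode m) : _old(current_mode()) { current_mode() = m; }
  ~DemoModeMark() { current_mode() = _old; }
  DemoModeMark(const DemoModeMark&) = delete;
  DemoModeMark& operator=(const DemoModeMark&) = delete;
};

// After an inner guard has come and gone, the outer guard's mode is back.
DemoMode mode_after_inner_mark() {
  DemoModeMark outer(DemoMode::Write);
  {
    DemoModeMark inner(DemoMode::Exec);
  }
  return current_mode();
}
```

Saving the old value in the guard, rather than hard-coding the mode to return to, is the design choice that lets a caller already in one mode invoke code that sets up its own guard without losing the outer state.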
Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: current_thread_wx -> ThreadWX ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7727/files - new: https://git.openjdk.java.net/jdk/pull/7727/files/21dd0046..f206e6d2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7727&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7727&range=04-05 Stats: 61 lines in 28 files changed: 0 ins; 0 del; 61 mod Patch: https://git.openjdk.java.net/jdk/pull/7727.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7727/head:pull/7727 PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Wed Mar 9 08:35:42 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Wed, 9 Mar 2022 08:35:42 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> Message-ID: <58tuExRUjjGeYwpYnN14FywexGhLde7PLxdaBqyCMfM=.ee1f9800-85a4-441c-b289-fc4653c1843b@github.com> On Wed, 9 Mar 2022 07:28:46 GMT, David Holmes wrote: >> But the style guide has no opinions on them... > > If/when the styleguide has an opinion on namespaces I would expect the same naming style to apply as for Classes. > > Hotspot is full of historical quirks like "class os" I'm afraid. I changed it. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From stuefe at openjdk.java.net Wed Mar 9 08:40:07 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 9 Mar 2022 08:40:07 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v5] In-Reply-To: <_MOmqVYDaAuWfkuGPnPfm10pUCWb0nbH7fedPf4llSo=.7288c328-7ce5-4537-9a88-7b1c989f3f55@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <_MOmqVYDaAuWfkuGPnPfm10pUCWb0nbH7fedPf4llSo=.7288c328-7ce5-4537-9a88-7b1c989f3f55@github.com> Message-ID: On Wed, 9 Mar 2022 08:16:51 GMT, Johannes Bechberger wrote: > Oh, I did not know that. Sorry for that, I just wanted to rebase it and forgot that this would change all the commit ids. You couldn't know. Just merge "master", which achieves the same and leaves the history intact. Additional tip. You can start a PR in draft mode. Reviews won't start until you un-draft. In draft mode, you have plenty of time to flesh out the patch and fix all bugs that show up in GHAs. In that phase, it's still okay to rebase. Once you consider the patch ready, you can un-draft the PR and the reviews start. Only then you should not rebase anymore. Starting out with a draft PR has also the advantage of not using reviewer time on an incomplete patch. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From tschatzl at openjdk.java.net Wed Mar 9 10:14:35 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 9 Mar 2022 10:14:35 GMT Subject: RFR: 8278492: Parameter -XX:MinRAMPercentage has no effect Message-ID: Hi all, can I have reviews for this change that makes the `MinRAMPercentage` flag actually affect minimum heap size? 
Testing: gha, test case Thanks, Thomas ------------- Commit messages: - Initial version Changes: https://git.openjdk.java.net/jdk/pull/7755/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7755&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8278492 Stats: 97 lines in 2 files changed: 96 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7755.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7755/head:pull/7755 PR: https://git.openjdk.java.net/jdk/pull/7755 From stuefe at openjdk.java.net Wed Mar 9 10:15:07 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 9 Mar 2022 10:15:07 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> Message-ID: On Wed, 9 Mar 2022 07:30:43 GMT, David Holmes wrote: >> I don't know why the Linux x86 build fails. >> >> I tested the current version with code related to #7591 and it seems to fix the remaining problems (I tested it also with NMT enabled). > > @parttimenerd please never force-push in an active review as it completely destroys the review history and comment context! Hi @dholmes-ora , @parttimenerd I'd like to argue again for my proposal from before. All this is contrary to how we deal with platform dependencies normally. Normally, we 1) either keep the whole code in platform specific places - including callers. If that is possible. 
2) If there are very few callers in shared places, we keep the implementation in the platform branch and #ifdef the callsites in shared code out to the affected platforms
3) If there are many callers in shared code or if it looks like it may be useful on other platforms too at some point, we usually wrap the logic in a platform generic function behind a descriptive name, which we stub out for the unaffected platforms.

(3) is very common.

My proposal from before would make it possible to really hide all WX logic from shared code:

shared/runtime/os.hpp
```
class os {
...
  // Platform specific hook to prepare the current thread for calling generated code
  void enable_jit_calls_for_current_thread() NOT_MACOS_AARCH64({})
  // Platform specific hook to clean up the current thread after calling into generated code
  void disable_jit_calls_for_current_thread() NOT_MACOS_AARCH64({})

  class ThreadEnableJitCallsMark: public StackObj {
  public:
    ThreadEnableJitCallsMark() { enable_jit_calls_for_current_thread(); }
    ~ThreadEnableJitCallsMark() { disable_jit_calls_for_current_thread(); }
  }
```

(ThreadEnableJitCallsMark could be optionally spread out into separate include)

os.bsd_aarch64.cpp
```
void os::enable_jit_calls_for_current_thread() {
  ... blabla ... pthread_jit_write_protect_np();
}

void os::disable_jit_calls_for_current_thread() {
  ... blabla ... pthread_jit_write_protect_np();
}
```

That's very little code. It effectively hides all platform details where they belong, away from shared code. In shared code, you use either one of `os::(enable|disable)_jit_calls_for_current_thread()` or the companion `os::ThreadEnableJitCallsMark`. Advantages would be:

- Call sites in shared code are now easier to mentally parse. `os::disable_jit_calls_for_current_thread()` is much clearer than `MACOS_AARCH64_ONLY(os::ThreadWX::Enable __wx(os::ThreadWX::Write));`.
- We don't need MACOS_AARCH64_ONLY in shared code
- We don't need the enums in shared code.
Dont need to understand what they do, nor the difference between "Write" and "Exec". Side note, I'd also suggest a different name for the RAII object to something like "xxxMark". That is more according to hotspot customs. ---- Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From dholmes at openjdk.java.net Wed Mar 9 11:03:08 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 9 Mar 2022 11:03:08 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> Message-ID: <0JaqT9Wqyj0RuIFJFGGXOdL6471wKfu8EP0_WjcbwG8=.d50bf928-ce4d-4fea-8d4d-491f1e01b4d7@github.com> On Wed, 9 Mar 2022 10:11:06 GMT, Thomas Stuefe wrote: >> @parttimenerd please never force-push in an active review as it completely destroys the review history and comment context! > > Hi @dholmes-ora , @parttimenerd > > I'd like to argue again for my proposal from before. > > All this is contrary to how we deal with platform dependencies normally. Normally, we > 1) either keep the whole code in platform specific places - including callers. If that is possible. > 2) If there are very few callers in shared places, we keep implementation in platform branch and #ifdef the callsites in shared code out to the affected platforms > 3) If there are many callers in shared code or if it looks like it may be useful on other platforms too at some point, we usually wrap the logic in a platform generic function behind a descriptive name, which we stub out for the unaffected platforms. > > (3) is very common. > > My proposal from before would make it possible to really hide all WX logic from shared code: > > shared/runtime/os.hpp > ``` > class os { > ... 
> // Platform specific hook to prepare the current thread for calling generated code > void enable_jit_calls_for_current_thread() NOT_MACOS_AARCH64({}) > // Platform specific hook to clean up the current thread after calling into generated code > void disable_jit_calls_for_current_thread() NOT_MACOS_AARCH64({}) > > class ThreadEnableJitCallsMark: public StackObj { > public: > ThreadEnableJitCallsMark() { enable_jit_calls_for_current_thread(); } > ~ThreadEnableJitCallsMark() { disable_jit_calls_for_current_thread(); } > } > > > > (ThreadEnableJitCallsMark could be optionally spread out into separate include) > > > os.bsd_aarch64.cpp > > void os::enable_jit_calls_for_current_thread() { > ... blabla ... pthread_jit_write_protect_np(); > } > > void os::disable_jit_calls_for_current_thread() { > ... blabla ... pthread_jit_write_protect_np(); > } > > > > Thats very little code. It effectively hides all platform details where they belong, away from shared code. In shared code, you use either one of `os::(enable|disable)_jit_calls_for_current_thread()` or the companion `os::ThreadEnableJitCallsMark`. Advantages would be: > > - Call sites in shared code are now easier to mentally parse. `os::disable_jit_calls_for_current_thread()` is much clearer than `MACOS_AARCH64_ONLY(os::ThreadWX::Enable __wx(os::ThreadWX::Write));`. > - We don't need MAC_AARCH64_ONLY in shared code > - We don't need the enums in shared code. Dont need to understand what they do, nor the difference between "Write" and "Exec". > > Side note, I'd also suggest a different name for the RAII object to something like "xxxMark". That is more according to hotspot customs. > > ---- > > Cheers, Thomas @tstuefe do you have some examples of (3)? I don't like introducing a fake, seemingly general-purpose, API for something that is very much platform specific. 
I do dislike intensely the way the ThreadWX changes pollute shared code, and as has been said in other reviews of that code, there really should be a much cleaner/clearer place where these transitions occur - if we can find it. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Wed Mar 9 11:21:02 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Wed, 9 Mar 2022 11:21:02 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> Message-ID: On Wed, 9 Mar 2022 10:11:06 GMT, Thomas Stuefe wrote: > Call sites in shared code are now easier to mentally parse. os::disable_jit_calls_for_current_thread() is much clearer than MACOS_AARCH64_ONLY(os::ThreadWX::Enable __wx(os::ThreadWX::Write)); That's not enough, as I wrote before, the RAII object is still needed for nesting. But I agree, that using better names would improve readability (like having two RAII objects for specific purposes). ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From dholmes at openjdk.java.net Wed Mar 9 11:27:01 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 9 Mar 2022 11:27:01 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Wed, 9 Mar 2022 08:35:41 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. 
>> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > current_thread_wx -> ThreadWX A couple of minor nits but otherwise I'm okay with this version. Thanks, David src/hotspot/cpu/aarch64/jniFastGetField_aarch64.cpp line 35: > 33: #include "prims/jvmtiExport.hpp" > 34: #include "runtime/safepoint.hpp" > 35: #include "runtime/thread.inline.hpp" I still don't see why this is needed. src/hotspot/cpu/aarch64/jniFastGetField_aarch64.cpp line 77: > 75: template::jni_type)> > 76: JniType static_fast_get_field_wrapper(JNIEnv *env, jobject obj, jfieldID fieldID) { > 77: JavaThread* thread = JavaThread::thread_from_jni_environment(env); This line is no longer needed. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Wed Mar 9 11:27:02 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Wed, 9 Mar 2022 11:27:02 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Wed, 9 Mar 2022 11:14:09 GMT, David Holmes wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> current_thread_wx -> ThreadWX > > src/hotspot/cpu/aarch64/jniFastGetField_aarch64.cpp line 35: > >> 33: #include "prims/jvmtiExport.hpp" >> 34: #include "runtime/safepoint.hpp" >> 35: #include "runtime/thread.inline.hpp" > > I still don't see why this is needed. I must have forgotten this line during refactoring. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From jbhateja at openjdk.java.net Wed Mar 9 11:38:34 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 9 Mar 2022 11:38:34 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v12] In-Reply-To: References: Message-ID: > Summary of changes: > - Intrinsify Math.round(float) and Math.round(double) APIs. > - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. > - Test creation using new IR testing framework. > > Following are the performance number of a JMH micro included with the patch > > Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) > > > Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio > -- | -- | -- | -- | -- | -- | -- | -- > FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 > FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 > FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: - 8279508: Preventing domain switch-over penalty for Math.round(float) and constraining unrolling to prevent code bloating. - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 - 8279508: Removing +LogCompilation flag. - 8279508: Review comments resolved.` - 8279508: Adding descriptive comments. - 8279508: Review comments resolved. - 8279508: Review comments resolved. - 8279508: Fixing for windows failure. - 8279508: Adding few descriptive comments. 
- Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 - ... and 5 more: https://git.openjdk.java.net/jdk/compare/d07f7c76...547f4e31 ------------- Changes: https://git.openjdk.java.net/jdk/pull/7094/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=11 Stats: 752 lines in 24 files changed: 660 ins; 30 del; 62 mod Patch: https://git.openjdk.java.net/jdk/pull/7094.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094 PR: https://git.openjdk.java.net/jdk/pull/7094 From dholmes at openjdk.java.net Wed Mar 9 11:49:05 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 9 Mar 2022 11:49:05 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 05:10:44 GMT, Ioi Lam wrote: >> This patch makes the result of "java -Xshare:dump" deterministic: >> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp >> - Fixed a problem in hashtable ordering in heapShared.cpp >> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. >> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh >> >> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). >> >> Testing under way: >> - tier1~tier5 >> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Fixed zero build The "heap dump" aspect of this is not something I'm familiar with, but if the threads don't affect the list of classes dumped, they surely must affect what is in the heap dump otherwise their execution would not be an issue. So you must be sacrificing something by not having these threads start. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From stuefe at openjdk.java.net Wed Mar 9 11:55:05 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 9 Mar 2022 11:55:05 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v4] In-Reply-To: <0JaqT9Wqyj0RuIFJFGGXOdL6471wKfu8EP0_WjcbwG8=.d50bf928-ce4d-4fea-8d4d-491f1e01b4d7@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2emw-rdoUshgEurchU32RBqIHeWVqvD9ZCDTxq-QExg=.b20d6f6e-1bd9-4f49-84af-6c6f80ae1c7f@github.com> <0JaqT9Wqyj0RuIFJFGGXOdL6471wKfu8EP0_WjcbwG8=.d50bf928-ce4d-4fea-8d4d-491f1e01b4d7@github.com> Message-ID: <-3iFQkq7PJ6VpX9rCset1xvOiD4UAWRwaaeCX5Uh4r0=.89cd0175-2d12-4f85-ac0a-1b3e6b5a3351@github.com> On Wed, 9 Mar 2022 11:00:15 GMT, David Holmes wrote: > @tstuefe do you have some examples of (3)? I don't like introducing a fake, seemingly general-purpose, API for something that is very much platform specific. I do dislike intensely the way the ThreadWX changes pollute shared code, and as has been said in other reviews of that code, there really should be a much cleaner/clearer place where these transitions occur - if we can find it. Examples for platform specifics hidden behind a common facade with one or two platforms missing are very common, but I assume you mean implementations that are stubbed out almost everywhere, right? "os::os_exception_wrapper" is a very good example, hiding the details of SEH - which only exists on Windows - from generic code. It is an empty stub implementation on all platforms but Windows x86. On all other platforms (including windows aarch) we don't have that facility. - "os::breakpoint" is a bit similar, since native breakpoint support only exists on Windows - "os::set_native_thread_name" used to initially only exist on windows and linux, leaving out aix, mac and solaris. 
I can dig deeper, but I remember generic wrappers with only a few or only one platform implementation being around, especially when we still had Solaris. I remember because, like you, I was never too fond of that pattern, but I often find it the lesser evil. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From jzhu at openjdk.java.net Wed Mar 9 12:48:31 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Wed, 9 Mar 2022 12:48:31 GMT Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap Message-ID: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> I came across a performance issue when using the scatter store Vector API for Integer and Long in the same application. The poor performance was caused by a vector intrinsic inlining failure, which happened because the IntSpecies for a constant VectorShape of IndexMap is not determined in this scenario. As discussed in https://github.com/openjdk/jdk/pull/7721, I changed the code in the Vector API. Please help review.
------------- Commit messages: - 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap Changes: https://git.openjdk.java.net/jdk/pull/7757/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7757&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282874 Stats: 42 lines in 7 files changed: 0 ins; 0 del; 42 mod Patch: https://git.openjdk.java.net/jdk/pull/7757.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7757/head:pull/7757 PR: https://git.openjdk.java.net/jdk/pull/7757 From jzhu at openjdk.java.net Wed Mar 9 13:13:03 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Wed, 9 Mar 2022 13:13:03 GMT Subject: RFR: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Tue, 8 Mar 2022 16:29:56 GMT, Paul Sandoz wrote: > Unfortunately the `@Stable` annotation can only be used by classes in the `java.base` module and other select modules. It cannot be used generally by application code given its unsafe nature. > > We need read-only/frozen/constant arrays to do this properly in code outside of the JDK. > > Perhaps there is a clever alternative strategy javac could use, e.g. using a bootstrap method although nothing specific comes to mind at this moment. I would hold off on that until the code generation strategy for patterns in switch settles down. Got it. Paul, thanks for your detailed explanation. As you suggested, I made a change in VectorAPI. ------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From bulasevich at openjdk.java.net Wed Mar 9 14:13:38 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Wed, 9 Mar 2022 14:13:38 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v4] In-Reply-To: References: Message-ID: > Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. 
It changes nothing for any platform besides AARCH. > > In AARCH the offset limit for a branch instruction is 128MB. Bigger jumps are encoded with three instructions. Most far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between the code segments, the jump distance from a method to a stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. > > As a side effect, the performance of some tests is slightly improved: > ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` > > Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH Boris Ulasevich has updated the pull request incrementally with two additional commits since the last revision: - moving nops out of far_jump - minor renaming ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7517/files - new: https://git.openjdk.java.net/jdk/pull/7517/files/5f0fe37c..91e62888 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=02-03 Stats: 20 lines in 6 files changed: 6 ins; 5 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/7517.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7517/head:pull/7517 PR: https://git.openjdk.java.net/jdk/pull/7517 From yyang at openjdk.java.net Wed Mar 9 15:34:23 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 9 Mar 2022 15:34:23 GMT Subject: RFR: 8282883: Use JVM_LEAF to avoid ThreadStateTransition for some simple JVM entries Message-ID: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com> Some existing JVM_ENTRY routines are behaviorally simple: they do not lock, GC, or throw exceptions. For these we could use JVM_LEAF instead of JVM_ENTRY to avoid ThreadStateTransition and safepoint checks.
------------- Commit messages: - 8282883: Use JVM_LEAF to avoid ThreadStateTransition for some simple JVM entries Changes: https://git.openjdk.java.net/jdk/pull/7760/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7760&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282883 Stats: 8 lines in 1 file changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/7760.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7760/head:pull/7760 PR: https://git.openjdk.java.net/jdk/pull/7760 From duke at openjdk.java.net Wed Mar 9 16:10:31 2022 From: duke at openjdk.java.net (Emanuel Peter) Date: Wed, 9 Mar 2022 16:10:31 GMT Subject: RFR: 8282881: Print exception message in VM crash with -XX:AbortVMOnException Message-ID: In `Exceptions::debug_check_abort`, we crash the VM if the exception matches with `-XX:AbortVMOnException`. For example `-XX:AbortVMOnException=java.lang.RuntimeEx`. Currently, in the VM crash description, we only print the exception name (`value_string`), and not its message (`message`). For completeness and consistency, we should also print the exception message. I tested it with these two exceptions, the first results in `message` being `NULL`: `throw new RuntimeException();` `throw new RuntimeException("some message");` Running tests to make sure nothing else broke. 
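For readers who want to reproduce the two cases mentioned above, here is a minimal Java sketch. The class name `AbortOnExceptionDemo` and the helper `make` are hypothetical and not part of the patch; the sketch only shows that the no-argument constructor leaves the message as `null` while the one-argument constructor sets it. Running it with `-XX:AbortVMOnException=java.lang.RuntimeException` is assumed to trigger the crash path described in the message; depending on the build, the flag may additionally need `-XX:+UnlockDiagnosticVMOptions`.

```java
// Hypothetical reproducer, not taken from the pull request.
final class AbortOnExceptionDemo {

    // Builds one of the two exceptions used in the manual test:
    // without a message (getMessage() returns null) or with "some message".
    static RuntimeException make(boolean withMessage) {
        return withMessage
                ? new RuntimeException("some message")
                : new RuntimeException();
    }

    public static void main(String[] args) {
        // Pass any argument to select the variant that carries a message.
        RuntimeException e = make(args.length > 0);
        System.out.println("message = " + e.getMessage());
        // With the AbortVMOnException flag set, throwing here is where the VM
        // would be expected to abort instead of unwinding normally.
        throw e;
    }
}
```

Invoking the program once without arguments and once with an argument covers both the `NULL` message and the non-`NULL` message case.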
------------- Commit messages: - 8282881: Print exception message in VM crash with -XX:AbortVMOnException Changes: https://git.openjdk.java.net/jdk/pull/7762/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7762&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282881 Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7762.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7762/head:pull/7762 PR: https://git.openjdk.java.net/jdk/pull/7762 From akozlov at openjdk.java.net Wed Mar 9 16:38:07 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Wed, 9 Mar 2022 16:38:07 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Wed, 9 Mar 2022 08:35:41 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > current_thread_wx -> ThreadWX The change proposes to assume WXWrite as the initial state. Have you considered to extend ThreadWXEnable to fix the assert failure? Something like below (I have not tried to compile though). The refactoring looks OK, but it makes sense to separate it from functional change. 
    class ThreadWXEnable {
      Thread* _thread;
      WXMode _old_mode;
    public:
      ThreadWXEnable(WXMode new_mode, Thread* thread) : _thread(thread) {
        if (_thread) {
          _old_mode = _thread->enable_wx(new_mode);
        } else {
          os::current_thread_enable_wx(new_mode);
          _old_mode = WXWrite;
        }
      }
      ~ThreadWXEnable() {
        if (_thread) {
          _thread->enable_wx(_old_mode);
        } else {
          os::current_thread_enable_wx(_old_mode);
        }
      }
    };

-------------

PR: https://git.openjdk.java.net/jdk/pull/7727

From duke at openjdk.java.net Wed Mar 9 16:47:04 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 9 Mar 2022 16:47:04 GMT
Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6]
In-Reply-To:
References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com>
Message-ID:

On Wed, 9 Mar 2022 08:35:41 GMT, Johannes Bechberger wrote:

>> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class.
>> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>
> current_thread_wx -> ThreadWX

This probably does not work as intended, as it would prevent nesting of ThreadWXEnable before a thread exists. I already proposed a version without any refactoring (just moving the current WXMode from Thread into os near the related methods): https://github.com/openjdk/jdk/pull/7727/commits/478ec1a7ca2c72e5916b28613a4875aa2ee1a793 (state at this commit).
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From redestad at openjdk.java.net Wed Mar 9 16:52:54 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 9 Mar 2022 16:52:54 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v12] In-Reply-To: References: Message-ID: > I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. > > Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 > > - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. > > - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. > > - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. 
Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: Restructure encodeUTF8 to reduce code gen issues ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7231/files - new: https://git.openjdk.java.net/jdk/pull/7231/files/934b5b8a..3d155c87 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=10-11 Stats: 21 lines in 1 file changed: 8 ins; 5 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/7231.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231 PR: https://git.openjdk.java.net/jdk/pull/7231 From dcubed at openjdk.java.net Wed Mar 9 16:57:00 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 9 Mar 2022 16:57:00 GMT Subject: RFR: 8282469: Allow considered use of C++ thread_local in Hotspot In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 06:12:03 GMT, David Holmes wrote: > This patch provides a means for using C++ `thread_local` when it is essential - see JBS for more details. > > There are three parts: > > 1. Add the new #define for `thread_local` > 2. Remove `operator_new.cpp` as use of C++ `thread_local` with a non-trivial cleanup action requires use of global operators new/delete. These are still excluded for hotspot use via a link-time check. > 3. Remove the prohibition on using `thread_local` from the hotspot style guide > > Due to the way hotspot style guide changes must be done, part 3 is being done under a sub-task in PR https://github.com/openjdk/jdk/pull/7720 and the two PRs will integrate at the same time.
> > Testing: > - manual testing of the Panama usecase as referenced in the JBS issue > - Tiers 1-3 > > Thanks, > David Thanks for the clarification. It does help. ------------- PR: https://git.openjdk.java.net/jdk/pull/7719 From psandoz at openjdk.java.net Wed Mar 9 18:25:50 2022 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 9 Mar 2022 18:25:50 GMT Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap In-Reply-To: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> Message-ID: <2ymNIjRZYsZBliTF39gfwhCVB8bKka-ejXpD7zZ8y1g=.92e0d4c3-5b5b-480f-87a1-90ffe16a3522@github.com> On Wed, 9 Mar 2022 12:33:49 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > As discussion at https://github.com/openjdk/jdk/pull/7721 , I change the code in VectorAPI. > Please help review. Marked as reviewed by psandoz (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7757 From akozlov at openjdk.java.net Wed Mar 9 19:06:51 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Wed, 9 Mar 2022 19:06:51 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Wed, 9 Mar 2022 08:35:41 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. 
>> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > current_thread_wx -> ThreadWX https://github.com/openjdk/jdk/compare/master...478ec1a7ca2c72e5916b28613a4875aa2ee1a793 touches more places than a targeted change in ThreadWXEnable... I'm not sure the real nesting is required for a thread that is not registered properly with VM. The initial state is always assumed for the NULL Thread. The SafeFetch assembly does not do up-calls to VM. I don't see why we'd need runtime tracking of WX state. The state is either WXExec for SafeFetch assembly, or unknown -- which we assume to be WXWrite regardless of approach taken. Nesting was implemented to reduce the amount of changes in JVM (yes, WX code scattered around the VM less than it could be :)), but it is possible to avoid runtime WX tracking if you always know the state, like we do if Thread == NULL. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From kbarrett at openjdk.java.net Wed Mar 9 20:12:47 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 9 Mar 2022 20:12:47 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v2] In-Reply-To: References: <3-Pvy2aLpKxsfbx5NNDrwExhNdtC8-fb2LXqlgLRTzs=.e9790fcd-d5f1-45d5-8d29-b49036b45d3b@github.com> Message-ID: On Tue, 8 Mar 2022 17:09:53 GMT, John R Rose wrote: >> doc/hotspot-style.md line 468: >> >>> 466: (operator new and related functions). Typically, uses of the global >>> 467: operator new are inadvertent and therefore often associated with memory >>> 468: leaks. 
Use of these functions by HotSpot code is disabled for some platforms. >> >> I don't agree with the new sentence about uses of global operator new. "Normal" C++ use of global operator new is no more associated with memory leaks than are the other allocations we do in HotSpot. The rationale for disallowing use of global operator new in HotSpot code (as I understand it) is that we want all of our heap allocations to be trackable via NMT. Any uses of global operator new would bypass that. > > First, it's not exactly a new sentence, just one moved from elsewhere in our code base (from a file that was deleted in the companion PR to this one). > > Second, it is true; we have seen problems in the (distant) past of exactly the form claimed. The problem is that HotSpot is an irregular user of C++, including via assembly code and tortuous stack frame manipulation (deopt handlers etc.). It's easy to accidentally emit a use of global `op new` through ten layers of C++ header files, and in HotSpot it's also easy to break the careful matching of constructors to destructors that C++ relies on. The result is a storage leak. > > Kim, I could see you thinking, also, that this sort of observation doesn't belong in a style guide, and that a lot of these nuggets might tend to bloat it, which obscures the useful parts of the style guide. (An over-long guide is not a useful guide after all.) You might suggest where this rationale information goes, if not here. But I think it fits well enough here. And if it isn't inserted here, or some other new place, it will be lost because of David's file deletion in the other PR related to this one. I don't want it to get lost. If that sentence (or something like it) were at the end of the following "rationale" paragraph (where I think @rose00 originally suggested) then I'd be okay with it, as that would put it in the scope of the NMT requirement. Maybe even remove "typically" in that location.
------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From duke at openjdk.java.net Wed Mar 9 20:28:41 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Wed, 9 Mar 2022 20:28:41 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Wed, 9 Mar 2022 08:35:41 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > current_thread_wx -> ThreadWX Interesting. But I would nonetheless create an assertion that checks that there is no nesting in the case without a Thread object. I would do this using a thread-local nesting counter in the ThreadWXEnable class (incremented in the constructor and decremented in the destructor). ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From redestad at openjdk.java.net Wed Mar 9 20:56:46 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 9 Mar 2022 20:56:46 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v12] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 16:52:54 GMT, Claes Redestad wrote: >> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`.
The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. >> >> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 >> >> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. >> >> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. >> >> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). 
> > Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: > > Restructure encodeUTF8 to reduce code gen issues The regressions I observe on aarch64 in `encodeLatin1Short` and a few others are not in the intrinsic itself but due to changes to the surrounding code. Reverting the changes to `String.encodeUTF8` removes the regressions (but also the improvements). It seems some loop optimization is not taking place as it should - or just differently. Going back to check, I see that x64 is also affected, meaning this is something that has come in when syncing up with master. I've experimented with adjusting the code to work around this and improve code gen, but with only partial success. I'll back out the changes to `String.encodeUTF8`, see if we can deal with the loop opt regression (separately) and return to do the `encodeUTF8` optimization later on. ------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From redestad at openjdk.java.net Wed Mar 9 21:01:13 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 9 Mar 2022 21:01:13 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v13] In-Reply-To: References: Message-ID: > I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
> > Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 > > - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. > > - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. > > - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 40 commits: - Merge branch 'master' into count_positives - Restructure encodeUTF8 to reduce code gen issues - use 32-bit mask to calculate correct remainder value - ary1 not required to have USE_KILL effect - Better implementation for aarch64 returning roughly the count of positive bytes - Document that it's allowed for implementations to return values less than the exact count (iff there are negative bytes) - Clean out and remove vmIntrinsics::_hasNegatives and all related code - s390 impl provided by @RealLucy - PPC impl provided by @TheRealMDoerr - Narrow the bottom_type of CountPositivesNode (always results in a positive int value) - ... and 30 more: https://git.openjdk.java.net/jdk/compare/ff766204...30739e15 ------------- Changes: https://git.openjdk.java.net/jdk/pull/7231/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=12 Stats: 638 lines in 36 files changed: 288 ins; 62 del; 288 mod Patch: https://git.openjdk.java.net/jdk/pull/7231.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231 PR: https://git.openjdk.java.net/jdk/pull/7231 From redestad at openjdk.java.net Wed Mar 9 23:44:22 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 9 Mar 2022 23:44:22 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v14] In-Reply-To: References: Message-ID: > I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. 
> > Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 > > - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. > > - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. > > - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). 
Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: Revert encodeUTF8 for this PR due issues with fragile optimization ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7231/files - new: https://git.openjdk.java.net/jdk/pull/7231/files/30739e15..58ee73bb Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=13 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=12-13 Stats: 18 lines in 1 file changed: 0 ins; 9 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/7231.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231 PR: https://git.openjdk.java.net/jdk/pull/7231 From redestad at openjdk.java.net Wed Mar 9 23:52:44 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 9 Mar 2022 23:52:44 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v14] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 23:44:22 GMT, Claes Redestad wrote: >> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. >> >> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 >> >> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. 
This pretty much means holding this up until it's implemented on all platforms, which can either be contributed to this PR or done as dependent follow-ups. >> >> - An alternative to holding up until all platforms are on board is to structure `StringCoding.hasNegatives` and `countPositives` so that the non-intrinsified method calls into the intrinsified one. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. >> >> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even sees a minor improvement, which makes me consider those corner-case regressions with little real-world impact (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). > > Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: > > Revert encodeUTF8 for this PR due issues with fragile optimization Reverting changes to `String.encodeUTF8` brought all `encode`-micros down to effectively no change: https://jmh.morethan.io/?gist=b957cb9457c31141ac71d47f2e10486a (which shows that implementing `hasNegatives` using `countPositives != len` has no measurable cost) I consider this the final version for this PR (assuming tests pass). I need someone to review the aarch64 changes in particular, and perhaps someone from the core library team should sign off on the String changes (fewer of those now). 
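For readers following the thread, the relationship between the two methods can be sketched in plain Java. This is only an illustrative scalar version under my own assumptions: the class name `CountPositivesSketch` is hypothetical, and the code is not the JDK source nor the intrinsic, which may behave differently in detail. It shows a count of leading non-negative bytes and `hasNegatives` expressed as `countPositives != len`:

```java
// Hypothetical illustration only; not the StringCoding implementation.
public class CountPositivesSketch {

    // Counts the leading bytes in ba[off, off + len) whose sign bit is clear.
    // Returns len when the whole range is free of negative bytes.
    static int countPositives(byte[] ba, int off, int len) {
        int limit = off + len;
        for (int i = off; i < limit; i++) {
            if (ba[i] < 0) {
                return i - off; // index of first negative byte, relative to off
            }
        }
        return len;
    }

    // hasNegatives expressed through countPositives, as discussed in the mail.
    static boolean hasNegatives(byte[] ba, int off, int len) {
        return countPositives(ba, off, len) != len;
    }

    public static void main(String[] args) {
        // 0xE9 is 'e' with acute accent in latin1; as a Java byte it is negative.
        byte[] mixed = {'a', 'b', (byte) 0xE9, 'c'};
        byte[] ascii = {'a', 'b', 'c'};

        System.out.println(countPositives(mixed, 0, mixed.length)); // 2
        System.out.println(hasNegatives(mixed, 0, mixed.length));   // true
        System.out.println(countPositives(ascii, 0, ascii.length)); // 3
        System.out.println(hasNegatives(ascii, 0, ascii.length));   // false
    }
}
```

A caller that knows the length of the ASCII prefix can copy that prefix in bulk and only fall back to per-byte handling from the first negative byte onward, which is the benefit the PR description refers to.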
------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From redestad at openjdk.java.net Wed Mar 9 23:59:32 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 9 Mar 2022 23:59:32 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v15] In-Reply-To: References: Message-ID: > I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading positive bytes in the `byte[]` range. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. > > Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 > > - Only implemented on x86 for now, but I want to verify that `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding this up until it's implemented on all platforms, which can either be contributed to this PR or done as dependent follow-ups. > > - An alternative to holding up until all platforms are on board is to structure `StringCoding.hasNegatives` and `countPositives` so that the non-intrinsified method calls into the intrinsified one. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. > > - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. 
Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: Fix copyright year in new test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7231/files - new: https://git.openjdk.java.net/jdk/pull/7231/files/58ee73bb..bc5a8c80 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=14 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=13-14 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7231.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231 PR: https://git.openjdk.java.net/jdk/pull/7231 From pli at openjdk.java.net Thu Mar 10 01:22:43 2022 From: pli at openjdk.java.net (Pengfei Li) Date: Thu, 10 Mar 2022 01:22:43 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v4] In-Reply-To: References: Message-ID: On Mon, 21 Feb 2022 06:19:26 GMT, Pengfei Li wrote: >> ### Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop to a 1-iteration vectorized loop with vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. However, it was disabled soon after an issue >> was found. Due to insufficient maintenance in these years, multiple bugs >> have been accumulated inside. But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable >> post loop vectorization. 
>> >> ### Changes in this patch >> >> This patch reworks post loop vectorization. The most significant change >> is removing vector mask support in C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. >> >> Previous implementation hard-codes x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions. To fix incorrect codegen >> and add vector mask support for more platforms, we turn to add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This >> IR form is exactly the same to those which are used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. >> >> **I) C2 crashes with segmentation fault in strip-mined loops** >> >> Previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. >> In C2's strip mined loops, post loop is not the sibling of the main loop >> in ideal loop tree. Instead, it's the sibling of the main loop's parent. >> This patch fixed a SIGSEGV issue caused by NULL pointer when locating >> post loop from strip-mined main loop. 
>> >> **II) Incorrect result issues with post loop vectorization** >> >> We have also fixed five incorrect vectorization issues. Some of them are >> hidden deep and can only be reproduced with corner cases. These issues >> have a common cause that it assumes the post loop can be vectorized if >> the vectorization in corresponding main loop is successful. But in many >> cases this assumption is wrong. Below are details. >> >> - **[Issue-1] Incorrect vectorization for partial vectorizable loops** >> >> This issue can be reproduced by below loop where only some operations in >> the loop body are vectorizable. >> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword can work well if parts of the operations in >> loop body are not vectorizable since those parts can be unrolled only. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we are doing scalars to vectors >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of cases, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to post loop and compared with post loop >> pack count. Vectorization will be bailed out in post loop if it creates >> more vector packs than in the main loop. >> >> - **[Issue-2] Incorrect result in loops with growing-down vectors** >> >> This issue appears with growing-down vectors, that is, vectors that grow >> to smaller memory address as the loop iterates. It can be reproduced by >> below counting-up loop with negative scale value in array index. 
>> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> - **[Issue-3] Incorrect result in manually unrolled loops** >> >> This issue can be reproduced by below manually unrolled loop. >> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> - **[Issue-4] Incorrect result in loops with mixed vector element sizes** >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. 
For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. >> >> - **[Issue-5] Incorrect result in loops with potential data dependence** >> >> This issue can be reproduced by below corner case on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop has the same array index expression. >> >> ### Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". 
The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's forced compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantages of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> need to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". >> >> ### Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: > > - Merge branch 'master' into postloop > > Change-Id: I503edb75f0f626569c776416bfef09651935979c > - Update copyright year and rename a function > > Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb > - Merge branch 'master' into postloop > > Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4 > - Fix issues in newly added test framework > > Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1 > - Merge branch 'master' into postloop > > Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6 > - 8183390: Fix and re-enable post loop vectorization > > ** Background > > Post loop vectorization is a C2 compiler optimization in an experimental > VM feature called PostLoopMultiversioning. It transforms the range-check > eliminated post loop to a 1-iteration vectorized loop with vector mask. > This optimization was contributed by Intel in 2016 to support x86 AVX512 > masked vector instructions. However, it was disabled soon after an issue > was found. Due to insufficient maintenance in these years, multiple bugs > have been accumulated inside. But we (Arm) still think this is a useful > framework for vector mask support in C2 auto-vectorized loops, for both > x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable > post loop vectorization. > > ** Changes in this patch > > This patch reworks post loop vectorization. The most significant change > is removing vector mask support in C2 x86 backend and re-implementing > it in the mid-end. With this, we can re-enable post loop vectorization > for platforms other than x86. > > Previous implementation hard-codes x86 k1 register as a reserved AVX512 > opmask register and defines two routines (setvectmask/restorevectmask) > to set and restore the value of k1. But after JDK-8211251 which encodes > AVX512 instructions as unmasked by default, generated vector masks are > no longer used in AVX512 vector instructions. 
To fix incorrect codegen > and add vector mask support for more platforms, we turn to add a vector > mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode > to generate a mask and replace all Load/Store nodes in the post loop > into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This > IR form is exactly the same to those which are used in VectorAPI mask > support. For now, we only add mask inputs for Load/Store nodes because > we don't have reduction operations supported in post loop vectorization. > After this change, the x86 k1 register is no longer reserved and can be > allocated when PostLoopMultiversioning is enabled. > > Besides this change, we have fixed a compiler crash and five incorrect > result issues with post loop vectorization. > > - 1) C2 crashes with segmentation fault in strip-mined loops > > Previous implementation was done before C2 loop strip-mining was merged > into JDK master so it didn't take strip-mined loops into consideration. > In C2's strip mined loops, post loop is not the sibling of the main loop > in ideal loop tree. Instead, it's the sibling of the main loop's parent. > This patch fixed a SIGSEGV issue caused by NULL pointer when locating > post loop from strip-mined main loop. > > - 2) Incorrect result issues with post loop vectorization > > We have also fixed five incorrect vectorization issues. Some of them are > hidden deep and can only be reproduced with corner cases. These issues > have a common cause that it assumes the post loop can be vectorized if > the vectorization in corresponding main loop is successful. But in many > cases this assumption is wrong. Below are details. > > [Issue-1] Incorrect vectorization for partial vectorizable loops > > This issue can be reproduced by below loop where only some operations in > the loop body are vectorizable. 
> > for (int i = 0; i < 10000; i++) { > res[i] = a[i] * b[i]; > k = 3 * k + 1; > } > > In the main loop, superword can work well if parts of the operations in > loop body are not vectorizable since those parts can be unrolled only. > But for post loops, we don't create vectors through combining scalar IRs > generated from loop unrolling. Instead, we are doing scalars to vectors > replacement for all operations in the loop body. Hence, all operations > should be either vectorized together or not vectorized at all. To fix > this kind of cases, we add an extra field "_slp_vector_pack_count" in > CountedLoopNode to record the eventual count of vector packs in the main > loop. This value is then passed to post loop and compared with post loop > pack count. Vectorization will be bailed out in post loop if it creates > more vector packs than in the main loop. > > [Issue-2] Incorrect result in loops with growing-down vectors > > This issue appears with growing-down vectors, that is, vectors that grow > to smaller memory address as the loop iterates. It can be reproduced by > below counting-up loop with negative scale value in array index. > > for (int i = 0; i < 10000; i++) { > a[MAX - i] = b[MAX - i]; > } > > Cause of this issue is that for a growing-down vector, generated vector > mask value has reversed vector-lane order so it masks incorrect vector > lanes. Note that if negative scale value appears in counting-down loops, > the vector will be growing up. With this rule, we fix the issue by only > allowing positive array index scales in counting-up loops and negative > array index scales in counting-down loops. This check is done with the > help of SWPointer by comparing scale values in each memory access in the > loop with loop stride value. > > [Issue-3] Incorrect result in manually unrolled loops > > This issue can be reproduced by below manually unrolled loop. 
> > for (int i = 0; i < 10000; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] * b[i + 1]; > } > > In this loop, operations in the 2nd statement duplicate those in the 1st > statement with a small memory address offset. Vectorization in the main > loop works well in this case because C2 does further unrolling and pack > combination. But we cannot vectorize the post loop through replacement > from scalars to vectors because it creates duplicated vector operations. > To fix this, we restrict post loop vectorization to loops with stride > values of 1 or -1. > > [Issue-4] Incorrect result in loops with mixed vector element sizes > > This issue is found after we enable post loop vectorization for AArch64. > It's reproducible by multiple array operations with different element > sizes inside a loop. On x86, there is no issue because the values of x86 > AVX512 opmasks only depend on which vector lanes are active. But AArch64 > is different - the values of SVE predicates also depend on lane size of > the vector. Hence, on AArch64 SVE, if a loop has mixed vector element > sizes, we should use different vector masks. For now, we just support > loops with only one vector element size, i.e., "int + float" vectors in > a single loop is ok but "int + double" vectors in a single loop is not > vectorizable. This fix also enables subword vectors support to make all > primitive type array operations vectorizable. > > [Issue-5] Incorrect result in loops with potential data dependence > > This issue can be reproduced by below corner case on AArch64 only. > > for (int i = 0; i < 10000; i++) { > a[i] = x; > a[i + OFFSET] = y; > } > > In this case, two stores in the loop have data dependence if the OFFSET > value is smaller than the vector length. So we cannot do vectorization > through replacing scalars to vectors. But the main loop vectorization > in this case is successful on AArch64 because AArch64 has partial vector > load/store support. 
It splits vector fill with different values in lanes > to several smaller-sized fills. In this patch, we add additional data > dependence check for this kind of cases. The check is also done with the > help of SWPointer class. In this check, we require that every two memory > accesses (with at least one store) of the same element type (or subword > size) in the loop has the same array index expression. > > ** Tests > > So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with > experimental VM option "PostLoopMultiversioning" turned on. We found no > issue in all tests. We notice that those existing cases are not enough > because some of above issues are not spotted by them. We would like to > add some new cases but we found existing vectorization tests are a bit > cumbersome - golden results must be pre-calculated and hard-coded in the > test code for correctness verification. Thus, in this patch, we propose > a new vectorization testing framework. > > Our new framework brings a simpler way to add new cases. For a new test > case, we only need to create a new method annotated with "@Test". The > test runner will invoke each annotated method twice automatically. First > time it runs in the interpreter and second time it's forced compiled by > C2. Then the two return results are compared. So in this framework each > test method should return a primitive value or an array of primitives. > In this way, no extra verification code for vectorization correctness is > required. This test runner is still jtreg-based and takes advantages of > the jtreg WhiteBox API, which enables test methods running at specific > compilation levels. Each test class inside is also jtreg-based. It just > need to inherit from the test runner class and run with two additional > options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". > > ** Summary & Future work > > In this patch, we reworked post loop vectorization. We made it platform > independent and fixed several issues inside. 
We also implemented a new > vectorization testing framework with many test cases inside. Meanwhile, > we did some code cleanups. > > This patch only touches C2 code guarded with PostLoopMultiversioning, > except a few data structure changes. So, there's no behavior change when > experimental VM option PostLoopMultiversioning is off. Also, to reduce > risks, we still propose to keep post loop vectorization experimental for > now. But if it receives positive feedback, we would like to change it to > non-experimental in the future. Since reviewers don't know much about the background of this patch, I have created some slides explaining current problem, the motivation and our plans. You could find my slides at http://cr.openjdk.java.net/~pli/slides/JDK-8183390.pdf ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From dholmes at openjdk.java.net Thu Mar 10 01:31:11 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 10 Mar 2022 01:31:11 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v3] In-Reply-To: References: Message-ID: > Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. > > This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. 
David Holmes has updated the pull request incrementally with one additional commit since the last revision: Move new sentence to 'rationale' paragraph. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7720/files - new: https://git.openjdk.java.net/jdk/pull/7720/files/d0f8343b..c307466c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7720&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7720&range=01-02 Stats: 8 lines in 2 files changed: 2 ins; 1 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/7720.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7720/head:pull/7720 PR: https://git.openjdk.java.net/jdk/pull/7720 From dholmes at openjdk.java.net Thu Mar 10 01:31:12 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 10 Mar 2022 01:31:12 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v2] In-Reply-To: References: Message-ID: On Tue, 8 Mar 2022 05:19:40 GMT, David Holmes wrote: >> Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. >> >> This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Feedback from kbarrett and jrose. 
There are some trailing whitespace corrections also caught up in this as my editor is set to remove trailing whitespace to appease jcheck (which doesn't look at markdown files). ------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From dholmes at openjdk.java.net Thu Mar 10 01:31:14 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 10 Mar 2022 01:31:14 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v2] In-Reply-To: References: <3-Pvy2aLpKxsfbx5NNDrwExhNdtC8-fb2LXqlgLRTzs=.e9790fcd-d5f1-45d5-8d29-b49036b45d3b@github.com> Message-ID: On Wed, 9 Mar 2022 20:09:57 GMT, Kim Barrett wrote: >> First, it's not exactly a new sentence, just one moved from elsewhere in our code base (from a file that was deleted in the companion PR to this one). >> >> Second, it is true; we have seen problems in the (distant) past of exactly the form claimed. The problem is that HotSpot is an irregular user of C++, including via assembly code and tortuous stack frame manipulation (deopt handlers etc.). It's easy to accidentally emit a use of global `op new` through ten layers of C++ header file, and in HotSpot it's also easy to break the careful matching of constructors to destructors that C++ relies on. The result is a storage leak. >> >> Kim, I could see you thinking, also, that this sort of observation doesn't belong in a style guide, and a lot of these nuggets might tend to bloat which obscures the useful parts of the style guide. (An over-long guide is not a useful guide after all.) You might suggest where this rationale information goes, if not here. But I think it fits well enough here. And if it isn't inserted here, or some other new place, it will be lost because of David's file deletion in the other PR related to this one. I don't want it to get lost. 
> > If that sentence (or something like it) were at the end of the following "rationale" paragraph (where I think @rose00 originally suggested) then I'd be okay with it, as that would put it in the scope of the NMT requirement. Maybe even remove "typically" in that location. Moved to the rationale as suggested. ------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From dholmes at openjdk.java.net Thu Mar 10 02:57:37 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 10 Mar 2022 02:57:37 GMT Subject: RFR: 8282881: Print exception message in VM crash with -XX:AbortVMOnException In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 16:04:47 GMT, Emanuel Peter wrote: > In `Exceptions::debug_check_abort`, we crash the VM if the exception matches with `-XX:AbortVMOnException`. For example `-XX:AbortVMOnException=java.lang.RuntimeEx`. > > Currently, in the VM crash description, we only print the exception name (`value_string`), and not its message (`message`). For completeness and consistency, we should also print the exception message. > > I tested it with these two exceptions, the first results in `message` being `NULL`: > `throw new RuntimeException();` > `throw new RuntimeException("some message");` > > Running tests to make sure nothing else broke. Hi Emanuel, This seems fine. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7762 From jiefu at openjdk.java.net Thu Mar 10 04:04:38 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Thu, 10 Mar 2022 04:04:38 GMT Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap In-Reply-To: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> Message-ID: On Wed, 9 Mar 2022 12:33:49 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long in the same application. The poor performance was caused by a vector intrinsic inlining failure, because the IntSpecies is not determined for a constant VectorShape of IndexMap in this scenario. > As discussed at https://github.com/openjdk/jdk/pull/7721 , I changed the code in VectorAPI. > Please help review. I would suggest adding a jtreg test for this fix if it is possible. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7757 From dholmes at openjdk.java.net Thu Mar 10 04:40:46 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 10 Mar 2022 04:40:46 GMT Subject: RFR: 8282883: Use JVM_LEAF to avoid ThreadStateTransition for some simple JVM entries In-Reply-To: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com> References: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com> Message-ID: On Wed, 9 Mar 2022 15:28:29 GMT, Yi Yang wrote: > Some existing JVM_ENTRY routines are behaviorally simple: they do not lock, GC, or throw exceptions, so we could use JVM_LEAF instead of JVM_ENTRY to avoid ThreadStateTransition and safepoint checks. Hi Yi, I agree these all look like they are completely safe to be LEAF functions. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7760 From jzhu at openjdk.java.net Thu Mar 10 05:54:40 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Thu, 10 Mar 2022 05:54:40 GMT Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap In-Reply-To: References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> Message-ID: On Thu, 10 Mar 2022 04:01:39 GMT, Jie Fu wrote: > I would suggest adding a jtreg test for this fix if it is possible. Thanks. @DamonFool thanks for your suggestion. IMO a benchmark would be more suitable for this case. In fact, besides this fix, there exists another issue that will also affect the performance since delay vector inlining. After my initial triage, it may relate to ConstraintCastNode's dependency. I will add a benchmark after figuring out the solution to it. ------------- PR: https://git.openjdk.java.net/jdk/pull/7757 From stuefe at openjdk.java.net Thu Mar 10 05:55:39 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 10 Mar 2022 05:55:39 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: <7MNCDtaPTnKWN2k1DfuEUf8RIcQza2I2Q4IUa7Nm8Qs=.cb1605ee-2fab-4a49-b726-9bcf2b62c76a@github.com> On Wed, 9 Mar 2022 19:03:16 GMT, Anton Kozlov wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> current_thread_wx -> ThreadWX > > https://github.com/openjdk/jdk/compare/master...478ec1a7ca2c72e5916b28613a4875aa2ee1a793 touches more places than a targeted change in ThreadWXEnable... I'm not sure the real nesting is required for a thread that is not registered properly with VM. The initial state is always assumed for the NULL Thread. The SafeFetch assembly does not do up-calls to VM. 
I don't see why we'd need runtime tracking of WX state. The state is either WXExec for SafeFetch assembly, or unknown -- which we assume to be WXWrite regardless of approach taken. > > Nesting was implemented to reduce the amount of changes in JVM (yes, WX code scattered around the VM less than it could be :)), but it is possible to avoid runtime WX tracking if you always know the state, like we do if Thread == NULL. @AntonKozlov can you give us please a bit more background about the wx state stuff? - Why don't we just switch it on once, for a thread that conceivably may call into generated code, and be done with? Why is this fine granular switching even needed? I find it difficult to imagine an attack vector that exploits having this always enabled for a thread. After all, we have to mark code cache with MAP_JIT already, so it is not like we can execute arbitrary memory ranges. - Related to that, how much does this call cost? Is this a runtime call into the pthread library or does it get inlined somehow? Because things like SafeFetch are trimmed to be super cheap if the memory can be accessed. Doing a pthread call on every invocation may throw off the cost benefit ratio. - Why and where do we need nesting? This would be so much easier if we could just not care. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From stuefe at openjdk.java.net Thu Mar 10 06:04:42 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 10 Mar 2022 06:04:42 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Wed, 9 Mar 2022 19:03:16 GMT, Anton Kozlov wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> current_thread_wx -> ThreadWX > > https://github.com/openjdk/jdk/compare/master...478ec1a7ca2c72e5916b28613a4875aa2ee1a793 touches more places than a targeted change in ThreadWXEnable... I'm not sure the real nesting is required for a thread that is not registered properly with VM. The initial state is always assumed for the NULL Thread. The SafeFetch assembly does not do up-calls to VM. I don't see why we'd need runtime tracking of WX state. The state is either WXExec for SafeFetch assembly, or unknown -- which we assume to be WXWrite regardless of approach taken. > > Nesting was implemented to reduce the amount of changes in JVM (yes, WX code scattered around the VM less than it could be :)), but it is possible to avoid runtime WX tracking if you always know the state, like we do if Thread == NULL. Hi @AntonKozlov, > [master...478ec1a](https://github.com/openjdk/jdk/compare/master...478ec1a7ca2c72e5916b28613a4875aa2ee1a793) touches more places than a targeted change in ThreadWXEnable... I'm not sure the real nesting is required for a thread that is not registered properly with VM. Arguably we should restore, upon leaving the VM, the state that has been present before. Because a native thread may already have modified the wx state and overruling it is just rude. But I offhand don't see a way to do this since we have no way (?) to query the current state. 
> The change proposes to assume WXWrite as the initial state. Have you considered extending ThreadWXEnable to fix the assert failure? Something like below (I have not tried to compile though). The refactoring looks OK, but it makes sense to separate it from functional change.
>
> ```
> class ThreadWXEnable {
>   Thread* _thread;
>   WXMode _old_mode;
>
> public:
>   ThreadWXEnable(WXMode new_mode, Thread* thread) :
>     _thread(thread)
>   {
>     if (_thread) {
>       _old_mode = _thread->enable_wx(new_mode);
>     } else {
>       os::current_thread_enable_wx(new_mode);
>       _old_mode = WXWrite;
>     }
>   }
>   ~ThreadWXEnable() {
>     if (_thread) {
>       _thread->enable_wx(_old_mode);
>     } else {
>       os::current_thread_enable_wx(_old_mode);
>     }
>   }
> };
> ```

I honestly don't find this easier than the solution @parttimenerd proposes, using an OS thread local. Using an OS thread local makes this whole system independent from Thread, so you don't need to know about Thread and don't rely on Thread::current being present. It also would be slightly faster. Using Thread, we'd access TLS to get Thread::current, then dereference that to read the wx state. OTOH using OS TLS, we access TLS to get the wx state directly. We save one dereference.

------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From jiefu at openjdk.java.net Thu Mar 10 06:07:37 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Thu, 10 Mar 2022 06:07:37 GMT Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap In-Reply-To: References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> Message-ID: <4Zz6_q1_EbNU71ku9_ZJTJH4MSHeZgkOhw3Spr1tc4w=.db6d2e30-df53-4fda-8236-6505824e281c@github.com> On Thu, 10 Mar 2022 05:50:59 GMT, Joshua Zhu wrote: > IMO a benchmark would be more suitable for this case. To avoid breaking this fix again in the future, I would prefer a jtreg test since the jtreg tests are tested much more frequently and widely.
TBH, I still don't know how to reproduce the problem and verify the fix. ------------- PR: https://git.openjdk.java.net/jdk/pull/7757 From akozlov at openjdk.java.net Thu Mar 10 07:03:41 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 10 Mar 2022 07:03:41 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: <7MNCDtaPTnKWN2k1DfuEUf8RIcQza2I2Q4IUa7Nm8Qs=.cb1605ee-2fab-4a49-b726-9bcf2b62c76a@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7MNCDtaPTnKWN2k1DfuEUf8RIcQza2I2Q4IUa7Nm8Qs=.cb1605ee-2fab-4a49-b726-9bcf2b62c76a@github.com> Message-ID: On Thu, 10 Mar 2022 05:52:48 GMT, Thomas Stuefe wrote: > Why don't we just switch it on once, for a thread that conceivably may call into generated code, and be done with it? Why is this fine-grained switching even needed? I find it difficult to imagine an attack vector that exploits having this always enabled for a thread. After all, we have to mark code cache with MAP_JIT already, so it is not like we can execute arbitrary memory ranges. A Java thread executes the code (interpreter, JIT) and changes the code (e.g. it could make an nmethod non-entrant or change an inline cache). Code modifications are done in a VM (runtime) call. So the WX state is tied to the Java thread state. The WX management is done more to satisfy the platform requirement than to make the system more secure. > Related to that, how much does this call cost? Is this a runtime call into the pthread library or does it get inlined somehow? Because things like SafeFetch are trimmed to be super cheap if the memory can be accessed. Doing a pthread call on every invocation may throw off the cost benefit ratio. SafeFetch is much more expensive compared to direct memory access. So I assume it's used only when there is a real chance that the access may fail, and the average cost of SafeFetch is much higher than the best case.
Yes, WX management is offered via a pthread function call. I haven't investigated if the native compiler can inline the call. The WX management itself is considerably cheap: https://github.com/openjdk/jdk/pull/2200#issuecomment-773382787. > Why and where do we need nesting? This would be so much easier if we could just not care. We switch the state to WXWrite on entry to a VM call, but a VM call may do another VM call. E.g. a runtime VM call calls the JNI GetLongField. So GetLongField could be called from code executing in either the Java (WXExec) or the VM (WXWrite) state, and the WX state should be restored on leaving the JNI function. The original state is tracked in Thread. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From akozlov at openjdk.java.net Thu Mar 10 07:12:40 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 10 Mar 2022 07:12:40 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Wed, 9 Mar 2022 20:25:02 GMT, Johannes Bechberger wrote: > Interesting. But I would nonetheless create an assertion that checks that there is no nesting in the case without a Thread object. I would do this using a thread local nesting counter in the ThreadWXEnable class (incremented in the constructor and decremented in the destructor). Hmm, yes. And the assert would need to check that the native thread still doesn't have a Thread object, i.e. that either Thread or TLS is used for tracking. This looks rather complicated, I agree.
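For readers following the nesting argument, here is a minimal standalone sketch of a save/restore guard with the thread-local nesting counter discussed above. Everything in it is an assumption made for illustration: `ScopedWX`, `tl_wx_mode`, `tl_wx_depth`, `set_current_thread_wx` and `jni_like_entry` are hypothetical names, the real WX switch is replaced by a plain variable, and none of this is the HotSpot `ThreadWXEnable` code.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not the HotSpot definitions.
enum WXMode { WXWrite, WXExec };

// Simulated per-thread WX state (the real one lives in the platform).
thread_local WXMode tl_wx_mode = WXWrite;
// Nesting depth of active guards on this thread.
thread_local int tl_wx_depth = 0;

// Stand-in for the platform call that flips the WX state of the current thread.
inline void set_current_thread_wx(WXMode mode) { tl_wx_mode = mode; }

// RAII guard: remembers the mode that was active, restores it on scope exit.
class ScopedWX {
  WXMode _old_mode;

 public:
  explicit ScopedWX(WXMode new_mode) : _old_mode(tl_wx_mode) {
    ++tl_wx_depth;
    set_current_thread_wx(new_mode);
  }
  ~ScopedWX() {
    set_current_thread_wx(_old_mode);
    --tl_wx_depth;
    assert(tl_wx_depth >= 0 && "unbalanced WX guard");
  }
  ScopedWX(const ScopedWX&) = delete;
  ScopedWX& operator=(const ScopedWX&) = delete;
};

// Simulates an entry point that may be reached from either state,
// like the GetLongField case above: it needs WXWrite while it runs.
inline WXMode jni_like_entry() {
  ScopedWX wx(WXWrite);
  return tl_wx_mode;
}
```

Because the guard restores whatever mode it found, `jni_like_entry()` behaves the same whether the caller was in WXExec or WXWrite. A "no nesting" check for the Thread-less case could then be an assert on `tl_wx_depth` in the constructor, under the same assumptions.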
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From akozlov at openjdk.java.net Thu Mar 10 07:34:40 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 10 Mar 2022 07:34:40 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Thu, 10 Mar 2022 06:00:42 GMT, Thomas Stuefe wrote: > Arguably we should restore, upon leaving the VM, the state that has been present before. Because a native thread may already have modified the wx state and overruling it is just rude. But I offhand don't see a way to do this since we have no way (?) to query the current state. How safe is it, in general, to call SafeFetch on a native thread that has no Thread object? The JVM has not initialized the thread, so there could be no JVM signal handler installed. Or is using libjsig mandatory for this case? I also don't know a good way to query the WX state. > It also would be slightly faster. Using Thread, we'd access TLS to get Thread::current, then dereference that to read the wx state. OTOH using OS TLS, we access TLS to get the wx state directly. We save one dereference. If we compare the approaches in general (not only for SafeFetch), the Thread is already in hand, so we should compare a read of a field from a C++ object with a read of a TLS variable. The former could not be slower than the latter.
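The three access paths being compared can be written down as a small standalone sketch. All names here (`FakeThread`, `tl_current_thread`, `tl_wx_state` and the three accessors) are hypothetical and only model the shape of each lookup; this is not HotSpot code and it makes no claim about measured cost.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not HotSpot types.
enum WXMode { WXWrite, WXExec };

struct FakeThread {
  WXMode wx_state = WXWrite;  // per-Thread copy of the WX state
};

// Storage for the "Thread" approach: a thread-local pointer to the Thread object.
thread_local FakeThread* tl_current_thread = nullptr;
// Storage for the "OS TLS" approach: the WX state kept directly in a thread-local.
thread_local WXMode tl_wx_state = WXWrite;

// Caller already holds the Thread pointer: a single field load.
inline WXMode wx_from_thread_in_hand(const FakeThread* t) {
  assert(t != nullptr);
  return t->wx_state;
}

// Caller has no Thread pointer: one TLS load, then one dereference.
// A missing Thread is treated as WXWrite, as assumed in this discussion.
inline WXMode wx_via_current_thread() {
  FakeThread* t = tl_current_thread;
  return t != nullptr ? t->wx_state : WXWrite;
}

// OS TLS variant: one TLS load, and no Thread object is required at all.
inline WXMode wx_via_tls() {
  return tl_wx_state;
}
```

The first accessor corresponds to the case where the Thread is already in hand, the second to SafeFetch on a thread that has to look it up (or has none), and the third to the OS thread-local proposal.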
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From akozlov at openjdk.java.net Thu Mar 10 07:42:39 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 10 Mar 2022 07:42:39 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: <27NDx3zxCmg_D2R2B3WqrsNRE3UXaO4co3rR5faXLyw=.dea1b127-084c-4bb2-b227-69e3881e6b89@github.com> On Thu, 10 Mar 2022 07:09:27 GMT, Anton Kozlov wrote: > > Interesting. But I would nonetheless create an assertion that checks that there is no nesting in the case without a Thread object. I would do this using a thread local nesting counter in the ThreadWXEnable class (incremented in the constructor and decremented in the destructor). > > Hmm, yes. And the assert would need to check that the native thread still doesn't have a Thread object, i.e. that either Thread or TLS is used for tracking. This looks rather complicated, I agree. Is it possible to change SafeFetch only? Switch to WXExec before calling the stub and switch back to WXWrite unconditionally? We won't need to provide an assert in ThreadWXEnable. But SafeFetch can check the assumption with an assert via Thread, if it exists. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From yyang at openjdk.java.net Thu Mar 10 09:04:43 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Thu, 10 Mar 2022 09:04:43 GMT Subject: RFR: 8282883: Use JVM_LEAF to avoid ThreadStateTransition for some simple JVM entries In-Reply-To: References: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com> Message-ID: On Thu, 10 Mar 2022 04:37:11 GMT, David Holmes wrote: > Hi Yi, > > I agree these all look like they are completely safe to be LEAF functions. > > Thanks, David Thanks for your review, David!
------------- PR: https://git.openjdk.java.net/jdk/pull/7760 From shade at openjdk.java.net Thu Mar 10 09:21:44 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 10 Mar 2022 09:21:44 GMT Subject: RFR: 8282883: Use JVM_LEAF to avoid ThreadStateTransition for some simple JVM entries In-Reply-To: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com> References: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com> Message-ID: On Wed, 9 Mar 2022 15:28:29 GMT, Yi Yang wrote: > Some existing JVM_ENTRY routines are behavioral simple, they do not lock, GC or throw exceptions, we could use JVM_LEAF instead of JVM_ENTRY, to avoid ThreadStateTransition and safepoint checks. I am a bit concerned about `JVM_Yield`, though. AFAICS, `JVM_LEAF` should not run for too long, because that would interfere with safepointing, and yielding the thread does look like an invitation to de-schedule the thread for unknown amount of time. @dholmes-ora -- you looked at it, have you figured this is okay? ------------- PR: https://git.openjdk.java.net/jdk/pull/7760 From yyang at openjdk.java.net Thu Mar 10 09:59:42 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Thu, 10 Mar 2022 09:59:42 GMT Subject: RFR: 8282883: Use JVM_LEAF to avoid ThreadStateTransition for some simple JVM entries In-Reply-To: References: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com> Message-ID: On Thu, 10 Mar 2022 04:37:11 GMT, David Holmes wrote: >> Some existing JVM_ENTRY routines are behavioral simple, they do not lock, GC or throw exceptions, we could use JVM_LEAF instead of JVM_ENTRY, to avoid ThreadStateTransition and safepoint checks. > > Hi Yi, > > I agree these all look like they are completely safe to be LEAF functions. > > Thanks, > David > I am a bit concerned about `JVM_Yield`, though. 
AFAICS, `JVM_LEAF` should not run for too long, because that would interfere with safepointing, and yielding the thread does look like an invitation to de-schedule the thread for an unknown amount of time. @dholmes-ora -- you looked at it, have you figured this is okay? As I understand it, JVM_LEAF can run for a long time. For operations that are simple, or that take a long time but are safe, we don't need to check for a safepoint (and block the thread), so we can use JVM_LEAF; these entries work as if they were part of native code, no matter how long they take to execute. Otherwise, we need JVM_ENTRY to check for a safepoint and block the thread if one is requested. https://github.com/openjdk/jdk/blob/6a3a7b94a4c342ce12ad553f1ba2818ca3a77f36/src/hotspot/share/prims/unsafe.cpp#L393-L411 ------------- PR: https://git.openjdk.java.net/jdk/pull/7760 From tschatzl at openjdk.java.net Thu Mar 10 10:18:58 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 10 Mar 2022 10:18:58 GMT Subject: RFR: 8282893: Remove MacroAssembler::push/pop_callee_saved_registers Message-ID: Hi all, can I have reviews for this trivial removal of some dead code?
Testing: gha builds, internal x86 builds Thanks, Thomas ------------- Commit messages: - Initial commit Changes: https://git.openjdk.java.net/jdk/pull/7772/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7772&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282893 Stats: 19 lines in 2 files changed: 0 ins; 19 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7772.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7772/head:pull/7772 PR: https://git.openjdk.java.net/jdk/pull/7772 From dholmes at openjdk.java.net Thu Mar 10 10:20:42 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 10 Mar 2022 10:20:42 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Thu, 10 Mar 2022 07:31:47 GMT, Anton Kozlov wrote: > How in general safe to call SafeFetch on a native thread that has no Thread object? The JVM has not initialized the thread, so there could be no JVM signal handler installed. @AntonKozlov Signal handlers are per-process not per-thread, so a thread need not be attached to the VM for our signal handlers to get involved - that is why they call `Thread::current_or_null_safe()`. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From redestad at openjdk.java.net Thu Mar 10 10:24:37 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 10 Mar 2022 10:24:37 GMT Subject: RFR: 8282893: Remove MacroAssembler::push/pop_callee_saved_registers In-Reply-To: References: Message-ID: <0kUcEk72L_mnPDtXSVEseeoltX7_jZuGVuoc5GdfCeA=.6dc521f4-5138-471f-b31c-617445ecada4@github.com> On Thu, 10 Mar 2022 10:11:16 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this trivial removal of some dead code. > > Testing: gha builds, internal x86 builds > > Thanks, > Thomas Marked as reviewed by redestad (Reviewer). Looks good and trivial. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7772 From ihse at openjdk.java.net Thu Mar 10 11:23:44 2022 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Thu, 10 Mar 2022 11:23:44 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 07:58:51 GMT, Thomas Stuefe wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed zero build > > Hi Ioi, > > some questions, comments inline. > > Like David in the comments, I am also a bit vague on the usefulness, but I may not know the whole story. Is it to enable repackagers like Debian to check the "reproducible" tickbox on their OpenJDK package? Or is there a practical need for this? > > Thanks, Thomas @tstuefe Without commenting on Ioi's actual implementation, let me explain a bit about the importance of this fix. Reproducible builds are not just a "checkbox", any more than "does not crash on startup" is a checkbox. They are an important security tool. See e.g. https://reproducible-builds.org/ for more information. The problem with CDS generating non-deterministic output is that during the build process we generate the file classes.jsa (and classes_nocoops.jsa). These files in turn are included in the java.base jmod, which in turn is included in the entire jlinked image. So if classes.jsa gets random bits, these random bits propagate to java.base.jmod and finally, to the entire jimage. This means that it is impossible to get bit-by-bit reproducibility verification of the entire JDK build. For several years, we have relentlessly (albeit with an unfortunately low priority) addressed and fixed indeterminism in the build of the JDK. We are now at the point where the only major issue is the randomness of classes.jsa and classes_nocoops.jsa.
------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From tschatzl at openjdk.java.net Thu Mar 10 11:31:42 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 10 Mar 2022 11:31:42 GMT Subject: RFR: 8282893: Remove MacroAssembler::push/pop_callee_saved_registers In-Reply-To: <0kUcEk72L_mnPDtXSVEseeoltX7_jZuGVuoc5GdfCeA=.6dc521f4-5138-471f-b31c-617445ecada4@github.com> References: <0kUcEk72L_mnPDtXSVEseeoltX7_jZuGVuoc5GdfCeA=.6dc521f4-5138-471f-b31c-617445ecada4@github.com> Message-ID: On Thu, 10 Mar 2022 10:21:53 GMT, Claes Redestad wrote: >> Hi all, >> >> can I have reviews for this trivial removal of some dead code. >> >> Testing: gha builds, internal x86 builds >> >> Thanks, >> Thomas > > Looks good and trivial. Thanks @cl4es for your review. ------------- PR: https://git.openjdk.java.net/jdk/pull/7772 From tschatzl at openjdk.java.net Thu Mar 10 11:31:42 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 10 Mar 2022 11:31:42 GMT Subject: Integrated: 8282893: Remove MacroAssembler::push/pop_callee_saved_registers In-Reply-To: References: Message-ID: On Thu, 10 Mar 2022 10:11:16 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this trivial removal of some dead code. > > Testing: gha builds, internal x86 builds > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: 83d77186 Author: Thomas Schatzl URL: https://git.openjdk.java.net/jdk/commit/83d771869046c2a2bf251ee5aebaceba60555e65 Stats: 19 lines in 2 files changed: 0 ins; 19 del; 0 mod 8282893: Remove MacroAssembler::push/pop_callee_saved_registers Reviewed-by: redestad ------------- PR: https://git.openjdk.java.net/jdk/pull/7772 From ihse at openjdk.java.net Thu Mar 10 12:14:44 2022 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Thu, 10 Mar 2022 12:14:44 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 11:45:59 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed zero build > > The "heap dump" aspect of this is not something I'm familiar with, but if the threads don't affect the list of classes dumped, they surely must affect what is in the heap dump otherwise their execution would not be an issue. So you must be sacrificing something by not having these threads start. @dholmes-ora That something is "sacrificed" does not follow from the fact that something is "different". The list of classes to dump is specified in the lib/classlist file, which is generated during the build. The process of creating this involves running a suitable "exercise most important parts" Java program, and logging the classes loaded. This classlist file is then post-processed (sorted) to make sure it is reproducible for the same JDK code base. As Ioi says, in this case threads are started freely, and may run in any non-deterministic order. At the next stage, we take this file (which is just done implicitly by -Xshare:dump), and generate the actual CDS archive, classes.jsa. Now it turns out this generation is non-deterministic. And Ioi's analysis is that this is due to thread non-determinism.
So if we just disable threads during the dump process (where we are not really running the JVM "actually" -- it's a special mode, where we don't even have a Java program to run!), there's no harm in that. ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From roland at openjdk.java.net Thu Mar 10 12:30:53 2022 From: roland at openjdk.java.net (Roland Westrelin) Date: Thu, 10 Mar 2022 12:30:53 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v4] In-Reply-To: References: Message-ID: On Mon, 21 Feb 2022 06:19:26 GMT, Pengfei Li wrote: >> ### Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop to a 1-iteration vectorized loop with vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. However, it was disabled soon after an issue >> was found. Due to insufficient maintenance in these years, multiple bugs >> have been accumulated inside. But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable >> post loop vectorization. >> >> ### Changes in this patch >> >> This patch reworks post loop vectorization. The most significant change >> is removing vector mask support in C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. >> >> Previous implementation hard-codes x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions. 
To fix incorrect codegen >> and add vector mask support for more platforms, we turn to add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This >> IR form is exactly the same to those which are used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. >> >> **I) C2 crashes with segmentation fault in strip-mined loops** >> >> Previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. >> In C2's strip mined loops, post loop is not the sibling of the main loop >> in ideal loop tree. Instead, it's the sibling of the main loop's parent. >> This patch fixed a SIGSEGV issue caused by NULL pointer when locating >> post loop from strip-mined main loop. >> >> **II) Incorrect result issues with post loop vectorization** >> >> We have also fixed five incorrect vectorization issues. Some of them are >> hidden deep and can only be reproduced with corner cases. These issues >> have a common cause that it assumes the post loop can be vectorized if >> the vectorization in corresponding main loop is successful. But in many >> cases this assumption is wrong. Below are details. >> >> - **[Issue-1] Incorrect vectorization for partial vectorizable loops** >> >> This issue can be reproduced by below loop where only some operations in >> the loop body are vectorizable. 
>> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword can work well if parts of the operations in >> loop body are not vectorizable since those parts can be unrolled only. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we are doing scalars to vectors >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of cases, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to post loop and compared with post loop >> pack count. Vectorization will be bailed out in post loop if it creates >> more vector packs than in the main loop. >> >> - **[Issue-2] Incorrect result in loops with growing-down vectors** >> >> This issue appears with growing-down vectors, that is, vectors that grow >> to smaller memory address as the loop iterates. It can be reproduced by >> below counting-up loop with negative scale value in array index. >> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> - **[Issue-3] Incorrect result in manually unrolled loops** >> >> This issue can be reproduced by below manually unrolled loop. 
>> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> - **[Issue-4] Incorrect result in loops with mixed vector element sizes** >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. >> >> - **[Issue-5] Incorrect result in loops with potential data dependence** >> >> This issue can be reproduced by below corner case on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. 
But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop has the same array index expression. >> >> ### Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's forced compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantages of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> need to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." 
and "-XX:+WhiteBoxAPI". >> >> ### Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - Merge branch 'master' into postloop > > Change-Id: I503edb75f0f626569c776416bfef09651935979c > - Update copyright year and rename a function > > Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb > - Merge branch 'master' into postloop > > Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4 > - Fix issues in newly added test framework > > Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1 > - Merge branch 'master' into postloop > > Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6 > - 8183390: Fix and re-enable post loop vectorization > > ** Background > > Post loop vectorization is a C2 compiler optimization in an experimental > VM feature called PostLoopMultiversioning. It transforms the range-check > eliminated post loop to a 1-iteration vectorized loop with vector mask. > This optimization was contributed by Intel in 2016 to support x86 AVX512 > masked vector instructions. However, it was disabled soon after an issue > was found. 
Due to insufficient maintenance in these years, multiple bugs > have been accumulated inside. But we (Arm) still think this is a useful > framework for vector mask support in C2 auto-vectorized loops, for both > x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable > post loop vectorization. > > ** Changes in this patch > > This patch reworks post loop vectorization. The most significant change > is removing vector mask support in C2 x86 backend and re-implementing > it in the mid-end. With this, we can re-enable post loop vectorization > for platforms other than x86. > > Previous implementation hard-codes x86 k1 register as a reserved AVX512 > opmask register and defines two routines (setvectmask/restorevectmask) > to set and restore the value of k1. But after JDK-8211251 which encodes > AVX512 instructions as unmasked by default, generated vector masks are > no longer used in AVX512 vector instructions. To fix incorrect codegen > and add vector mask support for more platforms, we turn to add a vector > mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode > to generate a mask and replace all Load/Store nodes in the post loop > into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This > IR form is exactly the same to those which are used in VectorAPI mask > support. For now, we only add mask inputs for Load/Store nodes because > we don't have reduction operations supported in post loop vectorization. > After this change, the x86 k1 register is no longer reserved and can be > allocated when PostLoopMultiversioning is enabled. > > Besides this change, we have fixed a compiler crash and five incorrect > result issues with post loop vectorization. > > - 1) C2 crashes with segmentation fault in strip-mined loops > > Previous implementation was done before C2 loop strip-mining was merged > into JDK master so it didn't take strip-mined loops into consideration. 
> In C2's strip mined loops, post loop is not the sibling of the main loop > in ideal loop tree. Instead, it's the sibling of the main loop's parent. > This patch fixed a SIGSEGV issue caused by NULL pointer when locating > post loop from strip-mined main loop. > > - 2) Incorrect result issues with post loop vectorization > > We have also fixed five incorrect vectorization issues. Some of them are > hidden deep and can only be reproduced with corner cases. These issues > have a common cause that it assumes the post loop can be vectorized if > the vectorization in corresponding main loop is successful. But in many > cases this assumption is wrong. Below are details. > > [Issue-1] Incorrect vectorization for partial vectorizable loops > > This issue can be reproduced by below loop where only some operations in > the loop body are vectorizable. > > for (int i = 0; i < 10000; i++) { > res[i] = a[i] * b[i]; > k = 3 * k + 1; > } > > In the main loop, superword can work well if parts of the operations in > loop body are not vectorizable since those parts can be unrolled only. > But for post loops, we don't create vectors through combining scalar IRs > generated from loop unrolling. Instead, we are doing scalars to vectors > replacement for all operations in the loop body. Hence, all operations > should be either vectorized together or not vectorized at all. To fix > this kind of cases, we add an extra field "_slp_vector_pack_count" in > CountedLoopNode to record the eventual count of vector packs in the main > loop. This value is then passed to post loop and compared with post loop > pack count. Vectorization will be bailed out in post loop if it creates > more vector packs than in the main loop. > > [Issue-2] Incorrect result in loops with growing-down vectors > > This issue appears with growing-down vectors, that is, vectors that grow > to smaller memory address as the loop iterates. 
It can be reproduced by > below counting-up loop with negative scale value in array index. > > for (int i = 0; i < 10000; i++) { > a[MAX - i] = b[MAX - i]; > } > > Cause of this issue is that for a growing-down vector, generated vector > mask value has reversed vector-lane order so it masks incorrect vector > lanes. Note that if negative scale value appears in counting-down loops, > the vector will be growing up. With this rule, we fix the issue by only > allowing positive array index scales in counting-up loops and negative > array index scales in counting-down loops. This check is done with the > help of SWPointer by comparing scale values in each memory access in the > loop with loop stride value. > > [Issue-3] Incorrect result in manually unrolled loops > > This issue can be reproduced by below manually unrolled loop. > > for (int i = 0; i < 10000; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] * b[i + 1]; > } > > In this loop, operations in the 2nd statement duplicate those in the 1st > statement with a small memory address offset. Vectorization in the main > loop works well in this case because C2 does further unrolling and pack > combination. But we cannot vectorize the post loop through replacement > from scalars to vectors because it creates duplicated vector operations. > To fix this, we restrict post loop vectorization to loops with stride > values of 1 or -1. > > [Issue-4] Incorrect result in loops with mixed vector element sizes > > This issue is found after we enable post loop vectorization for AArch64. > It's reproducible by multiple array operations with different element > sizes inside a loop. On x86, there is no issue because the values of x86 > AVX512 opmasks only depend on which vector lanes are active. But AArch64 > is different - the values of SVE predicates also depend on lane size of > the vector. Hence, on AArch64 SVE, if a loop has mixed vector element > sizes, we should use different vector masks. 
For now, we just support > loops with only one vector element size, i.e., "int + float" vectors in > a single loop is ok but "int + double" vectors in a single loop is not > vectorizable. This fix also enables subword vectors support to make all > primitive type array operations vectorizable. > > [Issue-5] Incorrect result in loops with potential data dependence > > This issue can be reproduced by below corner case on AArch64 only. > > for (int i = 0; i < 10000; i++) { > a[i] = x; > a[i + OFFSET] = y; > } > > In this case, two stores in the loop have data dependence if the OFFSET > value is smaller than the vector length. So we cannot do vectorization > through replacing scalars to vectors. But the main loop vectorization > in this case is successful on AArch64 because AArch64 has partial vector > load/store support. It splits vector fill with different values in lanes > to several smaller-sized fills. In this patch, we add additional data > dependence check for this kind of cases. The check is also done with the > help of SWPointer class. In this check, we require that every two memory > accesses (with at least one store) of the same element type (or subword > size) in the loop has the same array index expression. > > ** Tests > > So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with > experimental VM option "PostLoopMultiversioning" turned on. We found no > issue in all tests. We notice that those existing cases are not enough > because some of above issues are not spotted by them. We would like to > add some new cases but we found existing vectorization tests are a bit > cumbersome - golden results must be pre-calculated and hard-coded in the > test code for correctness verification. Thus, in this patch, we propose > a new vectorization testing framework. > > Our new framework brings a simpler way to add new cases. For a new test > case, we only need to create a new method annotated with "@Test". 
The > test runner will invoke each annotated method twice automatically. First > time it runs in the interpreter and second time it's forced compiled by > C2. Then the two return results are compared. So in this framework each > test method should return a primitive value or an array of primitives. > In this way, no extra verification code for vectorization correctness is > required. This test runner is still jtreg-based and takes advantages of > the jtreg WhiteBox API, which enables test methods running at specific > compilation levels. Each test class inside is also jtreg-based. It just > need to inherit from the test runner class and run with two additional > options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". > > ** Summary & Future work > > In this patch, we reworked post loop vectorization. We made it platform > independent and fixed several issues inside. We also implemented a new > vectorization testing framework with many test cases inside. Meanwhile, > we did some code cleanups. > > This patch only touches C2 code guarded with PostLoopMultiversioning, > except a few data structure changes. So, there's no behavior change when > experimental VM option PostLoopMultiversioning is off. Also, to reduce > risks, we still propose to keep post loop vectorization experimental for > now. But if it receives positive feedback, we would like to change it to > non-experimental in the future. Aren't those 2 issues: > * **[Issue-3] Incorrect result in manually unrolled loops** > > > This issue can be reproduced by below manually unrolled loop. > > ``` > for (int i = 0; i < 10000; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] * b[i + 1]; > } > ``` > * **[Issue-5] Incorrect result in loops with potential data dependence** > > > This issue can be reproduced by below corner case on AArch64 only. > > ``` > for (int i = 0; i < 10000; i++) { > a[i] = x; > a[i + OFFSET] = y; > } > ``` fairly similar. Doesn't/couldn't the logic from issue 5 protect from issue 3? 
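For readers following the Issue-5 case quoted above, here is a minimal plain-Java sketch of why a small OFFSET matters. It is not code from the patch: `StoreOrderSketch`, `scalar`, `vectorLike` and the `lanes` parameter are hypothetical names, and the inner loops only emulate what replacing each scalar store with one wider store would do. Under those assumptions, the two versions agree when OFFSET is at least the emulated vector length and diverge when it is smaller.

```java
import java.util.Arrays;

class StoreOrderSketch {
    // Reference semantics: the Issue-5 loop executed one iteration at a time.
    static int[] scalar(int n, int offset, int x, int y) {
        int[] a = new int[n + offset];
        for (int i = 0; i < n; i++) {
            a[i] = x;
            a[i + offset] = y;
        }
        return a;
    }

    // Emulation (assumption, not the C2 transformation itself): each scalar
    // store is replaced by one store that covers `lanes` consecutive elements.
    static int[] vectorLike(int n, int offset, int x, int y, int lanes) {
        int[] a = new int[n + offset];
        for (int i = 0; i < n; i += lanes) {
            int end = Math.min(i + lanes, n);
            for (int j = i; j < end; j++) a[j] = x;          // "vector" store of x
            for (int j = i; j < end; j++) a[j + offset] = y; // "vector" store of y
        }
        return a;
    }

    public static void main(String[] args) {
        // OFFSET (1) smaller than the emulated vector length (4): results differ.
        System.out.println(Arrays.equals(scalar(8, 1, 1, 2), vectorLike(8, 1, 1, 2, 4)));
        // OFFSET (4) equal to the emulated vector length (4): results match.
        System.out.println(Arrays.equals(scalar(8, 4, 1, 2), vectorLike(8, 4, 1, 2, 4)));
    }
}
```

The same emulation does not by itself answer whether the Issue-5 check would also cover Issue-3; it only shows the store-ordering hazard for one loop shape.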
I see no reason not to proceed with this change as it attempts to fix something that's broken and nobody uses, doesn't enable it by default and doesn't affect code other than post loop vectorization. I haven't looked at tests closely but I don't think there are IR tests. Maybe you want to consider adding some as a follow-up to this PR. Another review (ideally by someone familiar with the superword code) is required. src/hotspot/share/opto/loopnode.cpp line 4443: > 4441: CountedLoopNode *cl = lpt->_head->as_CountedLoop(); > 4442: > 4443: if (cl->is_rce_post_loop() && !cl->is_vectorized_loop()) { Maybe assert that PostLoopMultiversioning is true? src/hotspot/share/opto/superword.cpp line 114: > 112: if (post_loop_allowed) { > 113: if (cl->is_reduction_loop()) return; // no predication mapping > 114: Node *limit = cl->limit(); Why was this required but no longer is? src/hotspot/share/opto/superword.hpp line 616: > 614: //------------------------------SWPointer--------------------------- > 615: // Information about an address for dependence checking and vector alignment > 616: class SWPointer : public ResourceObj { Why is this required? ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From akozlov at openjdk.java.net Thu Mar 10 12:34:43 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 10 Mar 2022 12:34:43 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Thu, 10 Mar 2022 10:17:31 GMT, David Holmes wrote: > Signal handlers are per-process not per-thread, so a thread need not be attached to the VM for our signal handlers to get involved - that is why they call Thread::current_or_null_safe(). Oh, right, thanks. I was concentrating too much on other platforms like Windows and missed that the example will work for *nix. A general question about the bug.
The signal mask is per-thread, and a native thread may block the JVM signal. I think safefetch will fail, if somehow we manage to call it on this thread (without jsig). So I'm not sure the safefetch is really safe on all platforms and in all contexts, that is, it always recovers from the read failure if called on a random thread. Is there a crash that is fixed by the change? I just spotted it is an enhancement, not a bug. Just trying to understand the problem. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From stuefe at openjdk.java.net Thu Mar 10 12:46:49 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 10 Mar 2022 12:46:49 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Wed, 9 Mar 2022 08:35:41 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > current_thread_wx -> ThreadWX Hi Anton, thanks a lot for your explanations. You made some things clearer to me. My answers are inline. > > Why don't we just switch it on once, for a thread that conceivably may call into generated code, and be done with? Why is this fine granular switching even needed? I find it difficult to imagine an attack vector that exploits having this always enabled for a thread. 
After all, we have to mark code cache with MAP_JIT already, so it is not like we can execute arbitrary memory ranges. > > A java thread executes the code (interpreter, JIT) and changes the code (e.g. it could make a nmethod nonentrant, change inline cache). Code modifications are done in VM (runtime) call. So WX state is tied to the java thread state. The WX management is done more to satisfy the platform requirement than to make the system more secure. Okay, that was the piece I was missing. I was assuming that we have either executing or modifying threads and that a thread was either one or the other (compiler threads compile, java threads run). You are saying that Java Threads may write too. And CompilerThreads, in the case of SafeFetch at least, may run generated code. So switching has to be done as a stack mark. Have I gotten this right? > > > Related to that, how much does this call cost? Is this a runtime call into the pthread library or does it get inlined somehow? Because things like SafeFetch are trimmed to be super cheap if the memory can be accessed. Doing a pthread call on every invocation may throw off the cost-benefit ratio. > > SafeFetch is much more expensive compared to direct memory access. So I assume it's used only when the real chance exists the access may fail, and the average cost of SafeFetch is much higher than the best case. Yes, we only do this when necessary, but it is supposed to be reasonably cheap if memory is accessible. It's: load (the safefetch blob) -> unconditional jump to the blob -> load target memory -> jump back. Depending on what the pthread library call does, and if it's a real function call into a library, it would be more expensive than that. > > Yes, WX management is offered via a pthread function call. I haven't investigated if the native compiler can inline the call. The WX management itself is considerably cheap [#2200 (comment)](https://github.com/openjdk/jdk/pull/2200#issuecomment-773382787).
> > > Why and where do we need nesting? This would be so much easier if we could just not care. > > We switch the state to WXWrite at the entry to a VM call, but a VM call may do another VM call. E.g. a runtime VM calls the JNI GetLongField. So GetLongField could be called from code executing in Java (WXExec) and VM (WXWrite) states, and the WX state should be restored on leaving the JNI function. The original state is tracked in Thread. Okay, thanks again for clarifying. > > Arguably we should restore, upon leaving the VM, the state that has been present before. Because a native thread may already have modified the wx state and overruling it is just rude. But I offhand don't see a way to do this since we have no way (?) to query the current state. > > How safe is it in general to call SafeFetch on a native thread that has no Thread object? SafeFetch is safe to call if the stub routine exists. So it is safe after VM initialization. And that can be tested too. Callers, when in doubt, are encouraged to use `CanUseSafeFetch` to check if VM is still in pre-initialization time. `CanUseSafeFetch` + `SafeFetch` should never crash, regardless of when and by whom it was called. We also have `os::is_readable_pointer()`, which wraps these two calls for convenience. > The JVM has not initialized the thread, so there could be no JVM signal handler installed. Or using libjsig is mandatory for this case? As David wrote, the Signal handler is per process. It is set as part of VM initialization before SafeFetch blob is generated. Foreign threads still enter the signal handler. So crashes in foreign threads still generate hs-err reports. Depending on how your support is organized that's either a bug or a feature :) > > I also don't know a good way to query the WX state. > > > It also would be slightly faster. Using Thread, we'd access TLS to get Thread::current, then dereference that to read the wx state. OTOH using OS TLS, we access TLS to get the wx state directly.
We save one dereference. > > If we compare approaches in general (not only SafeFetch), the Thread is already in hand, so we should compare a read of the field from a C++ object, and the read of a TLS variable. The former could not be slower than the latter. You lost me here. To me most of the invocations of `ThreadWXEnable` seem to use `Thread::current()`. Only those who retrieve the thread from the JNI environment don't. IIRC, TLS, at least on Linux, lives at the front of the thread stack, so accessing it should be quite cheap. I see the performance point of an option to pass in Thread* in case one already has it. I dislike it a bit because it gives the illusion that you could pass in arbitrary threads when in fact you must pass in Thread::current. But an assertion could help clarify that. > > Is it possible to change SafeFetch only? Switch to WXExec before calling the stub and switch WXWrite back unconditionally? We won't need to provide assert in ThreadWXEnable. But SafeFetch can check the assumption with assert via Thread, if it exists. But SafeFetch could be used from outside code as well as VM code. In case of the latter, prior state can either be WXWrite or WXExec. It needs to restore the prior state after the call. --- To summarize the different proposals: - you propose to use Thread* when available and assume WXWrite as prior state when not. You argue that if there is no Thread::current, we are not a VM thread and we should need no nesting, so a simple switchback to wxwrite should suffice after leaving SafeFetch, right? - Johannes proposes to use TLS, and just always support nesting, regardless of who calls. What I like about Johannes' proposal is that it's simple. It has fewer dependencies on VM infrastructure and we can mostly just hide it in the platform tree.
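The "always support nesting, keep the state in TLS" idea summarized above can be modelled in a few lines of plain Java. This is only an analogy, not HotSpot code: the real `ThreadWXEnable` is a C++ stack object, and `WxNestingSketch`, `WxMode`, `WxScope`, `safeFetchLike` and the initial `EXEC` value are all hypothetical choices made for illustration. The sketch shows the save-and-restore pattern: each scope remembers whatever mode was active before and puts it back on exit, so the caller's prior state survives no matter who calls.

```java
class WxNestingSketch {
    enum WxMode { WRITE, EXEC }

    // Per-thread mode, analogous to keeping the WX state in OS thread-local
    // storage instead of in a Thread object. The initial value is arbitrary.
    private static final ThreadLocal<WxMode> CURRENT =
            ThreadLocal.withInitial(() -> WxMode.EXEC);

    // Scope guard: saves the previous mode on entry, restores it on close().
    static final class WxScope implements AutoCloseable {
        private final WxMode previous;

        WxScope(WxMode wanted) {
            previous = CURRENT.get();
            CURRENT.set(wanted);
        }

        @Override
        public void close() {
            CURRENT.set(previous);
        }
    }

    static WxMode current() {
        return CURRENT.get();
    }

    // Stand-in for a routine that must run in EXEC mode regardless of caller.
    static WxMode safeFetchLike() {
        try (WxScope scope = new WxScope(WxMode.EXEC)) {
            return current();
        }
    }

    public static void main(String[] args) {
        try (WxScope outer = new WxScope(WxMode.WRITE)) {
            System.out.println(safeFetchLike()); // mode seen inside the nested scope
            System.out.println(current());       // mode after the nested scope closed
        }
        System.out.println(current());           // mode after the outer scope closed
    }
}
```

A variant that assumed WXWrite as the prior state would replace `previous` with a constant; the sketch above does not need that assumption, which is the simplicity being argued for.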
Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From dholmes at openjdk.java.net Thu Mar 10 12:56:43 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 10 Mar 2022 12:56:43 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: On Thu, 10 Mar 2022 12:11:06 GMT, Magnus Ihse Bursie wrote: >> The "heap dump" aspect of this is not something I'm familiar with, but if the threads don't affect the list of classes dumped, they surely must affect what is in the heap dump otherwise their execution would not be an issue. So you must be sacrificing something by not having these threads start. > > @dholmes-ora That something is "sacrificed" does not follow from that something is "different". The list of classes to dump is specified in the lib/classlist file, which is generated during the build. > > The process of creating this involves running a suitable "exercise most important parts" java program, and logging the classes loaded. This class file is then post-processed (sorted) to make sure it is reproducible for the same JDK code base. > > As Ioi says, in this case threads are started freely, and may run in any non-deterministic order. > > At the next stage, we take this file (which is just done implicitly by -Xshare:dump), and generate the actual CDS archive, classes.jsa. Now it turns out this generation is non-deterministic. And Ioi's analysis is that this is due to thread non-determinism. So if we just disable threads during the dump process (where we are not really running the JVM "actually" -- it's a special mode, where we don't even have a Java program to run!), there's no harm in that. @magicus the issue is not the list of classes dumped, or their format in the dump. As Ioi indicated that list is fixed. The issue is with the heap dump part of the archive. Running these other threads affects the heap so by not running them we end up with a different heap.
So the question is whether there is anything about having a different heap dumped that we need to be concerned about. We dump the heap into the archive for a reason and this changes what we dump, ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From ihse at openjdk.java.net Thu Mar 10 12:56:44 2022 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Thu, 10 Mar 2022 12:56:44 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: <_gEQbuQSg9m85mgGK8CTapGQ8lHNcaZk6swKv_h3b2c=.22aee41e-3faa-432f-aa9b-e1b21f40a63c@github.com> On Wed, 9 Mar 2022 05:10:44 GMT, Ioi Lam wrote: >> This patch makes the result of "java -Xshare:dump" deterministic: >> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp >> - Fixed a problem in hashtable ordering in heapShared.cpp >> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. >> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh >> >> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). >> >> Testing under way: >> - tier1~tier5 >> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Fixed zero build Well, previously we'd get different dumps on different runs. If that was an issue, surely it would have manifested itself by now? With this change, we'll just get the same dump each run. I fail to see how that could be a risk. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From stuefe at openjdk.java.net Thu Mar 10 12:56:44 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 10 Mar 2022 12:56:44 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: <3rBXZ8EImtvDdKrC0ORD6PLNvXA46aPXo4l8hKiRUkQ=.461eac24-d5e3-483a-a567-1ed22aa1dade@github.com> On Wed, 9 Mar 2022 07:58:51 GMT, Thomas Stuefe wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed zero build > > Hi Ioi, > > some questions, comments inline. > > Like David in the comments, I am also a bit vague on the usefulness, but I may not know the whole story. Is it to enable repackagers like Debian to check the "reproducable" tickbox on their OpenJDK package? Or is there a practical need for this? > > Thanks, Thomas > @tstuefe Without commenting on Ioi's actual implementation, let me explain a bit on the importance of this fix. > > Reproducible builds is not just a "checkbox", any more than "does not crash on startup" is a checkbox. It is an important security tool. See e.g. https://reproducible-builds.org/ for more information. > Hi @magicus, thanks for explaining, and for the link. That one was a good explanation. I had no idea, but I'm convinced now. Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From magnus.ihse.bursie at oracle.com Thu Mar 10 12:58:58 2022 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Thu, 10 Mar 2022 13:58:58 +0100 Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: <_gEQbuQSg9m85mgGK8CTapGQ8lHNcaZk6swKv_h3b2c=.22aee41e-3faa-432f-aa9b-e1b21f40a63c@github.com> References: <_gEQbuQSg9m85mgGK8CTapGQ8lHNcaZk6swKv_h3b2c=.22aee41e-3faa-432f-aa9b-e1b21f40a63c@github.com> Message-ID: <30b40101-1af6-8d6f-cbfe-758001d588a8@oracle.com> The Skara bots messed up this one badly. 
It was a reply to David's comment, not Ioi's latest push. /Magnus On 2022-03-10 13:56, Magnus Ihse Bursie wrote: > On Wed, 9 Mar 2022 05:10:44 GMT, Ioi Lam wrote: > >>> This patch makes the result of "java -Xshare:dump" deterministic: >>> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp >>> - Fixed a problem in hashtable ordering in heapShared.cpp >>> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. >>> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh >>> >>> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). >>> >>> Testing under way: >>> - tier1~tier5 >>> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed zero build > Well, previously we'd get different dumps on different runs. If that was an issue, surely it would have manifested itself by now? With this change, we'll just get the same dump each run. I fail to see how that could be a risk. > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/7748 From dholmes at openjdk.java.net Thu Mar 10 13:21:43 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 10 Mar 2022 13:21:43 GMT Subject: RFR: 8282883: Use JVM_LEAF to avoid ThreadStateTransition for some simple JVM entries In-Reply-To: References: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com> Message-ID: On Thu, 10 Mar 2022 09:18:49 GMT, Aleksey Shipilev wrote: >> Some existing JVM_ENTRY routines are behavioral simple, they do not lock, GC or throw exceptions, we could use JVM_LEAF instead of JVM_ENTRY, to avoid ThreadStateTransition and safepoint checks. 
> > I am a bit concerned about `JVM_Yield`, though. AFAICS, `JVM_LEAF` should not run for too long, because that would interfere with safepointing, and yielding the thread does look like an invitation to de-schedule the thread for unknown amount of time. @dholmes-ora -- you looked at it, have you figured this is okay? @shipilev JVM_LEAF keeps the thread `_thread_in_native` and so has no impact on safepoints etc (in contrast JRT_LEAF can be called while `_thread_in_java` which does impact safepointing). So your concern about yield is actually the wrong way round: the current JVM_ENTRY means we yield while `_thread_in_vm` which means that a safepoint cannot be reached until that thread gets rescheduled. ------------- PR: https://git.openjdk.java.net/jdk/pull/7760 From shade at openjdk.java.net Thu Mar 10 13:25:54 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 10 Mar 2022 13:25:54 GMT Subject: RFR: 8282883: Use JVM_LEAF to avoid ThreadStateTransition for some simple JVM entries In-Reply-To: References: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com> Message-ID: On Thu, 10 Mar 2022 09:18:49 GMT, Aleksey Shipilev wrote: >> Some existing JVM_ENTRY routines are behavioral simple, they do not lock, GC or throw exceptions, we could use JVM_LEAF instead of JVM_ENTRY, to avoid ThreadStateTransition and safepoint checks. > > I am a bit concerned about `JVM_Yield`, though. AFAICS, `JVM_LEAF` should not run for too long, because that would interfere with safepointing, and yielding the thread does look like an invitation to de-schedule the thread for unknown amount of time. @dholmes-ora -- you looked at it, have you figured this is okay? > @shipilev JVM_LEAF keeps the thread `_thread_in_native` and so has no impact on safepoints etc (in contrast JRT_LEAF can be called while `_thread_in_java` which does impact safepointing). 
So your concern about yield is actually the wrong way round: the current JVM_ENTRY means we yield while `_thread_in_vm` which means that a safepoint cannot be reached until that thread gets rescheduled. Ah, yes. I confused myself! ------------- PR: https://git.openjdk.java.net/jdk/pull/7760 From shade at openjdk.java.net Thu Mar 10 13:25:54 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 10 Mar 2022 13:25:54 GMT Subject: RFR: 8282883: Use JVM_LEAF to avoid ThreadStateTransition for some simple JVM entries In-Reply-To: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com> References: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com> Message-ID: On Wed, 9 Mar 2022 15:28:29 GMT, Yi Yang wrote: > Some existing JVM_ENTRY routines are behavioral simple, they do not lock, GC or throw exceptions, we could use JVM_LEAF instead of JVM_ENTRY, to avoid ThreadStateTransition and safepoint checks. Marked as reviewed by shade (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7760 From dholmes at openjdk.java.net Thu Mar 10 13:42:40 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 10 Mar 2022 13:42:40 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: <_gEQbuQSg9m85mgGK8CTapGQ8lHNcaZk6swKv_h3b2c=.22aee41e-3faa-432f-aa9b-e1b21f40a63c@github.com> References: <_gEQbuQSg9m85mgGK8CTapGQ8lHNcaZk6swKv_h3b2c=.22aee41e-3faa-432f-aa9b-e1b21f40a63c@github.com> Message-ID: <5aktAeTElag7Ggnr0HIfRVDSf6NQtaEpjQLjqeyBIeg=.62824610-1602-4853-91d2-beb28e18faa4@github.com> On Thu, 10 Mar 2022 12:50:58 GMT, Magnus Ihse Bursie wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed zero build > > Well, previously we'd get different dumps on different runs. If that was an issue, surely it would have manifested itself by now? 
With this change, we'll just get the same dump each run. I fail to see how that could be a risk. @magicus I think we need @iklam to weigh in here and explain exactly what the "heap dump" consists of and how not running those threads affects its contents. Presently the heap dump is potentially different on each run, IIUC, only due to the order of its contents, not the contents themselves. ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From ihse at openjdk.java.net Thu Mar 10 13:54:48 2022 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Thu, 10 Mar 2022 13:54:48 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 05:10:44 GMT, Ioi Lam wrote: >> This patch makes the result of "java -Xshare:dump" deterministic: >> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp >> - Fixed a problem in hashtable ordering in heapShared.cpp >> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. >> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh >> >> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). >> >> Testing under way: >> - tier1~tier5 >> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Fixed zero build I think he already did. I'm quoting: > However, the CDS archive also contains a heap dump, which includes Java HashMaps. If I allow those 3 Java threads to start, some HashMaps in the module graph will have unstable ordering. I think the reason is concurrent thread execution causes unstable assignment of the identity_hash for objects in the heap dump. 
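The quoted explanation can be illustrated with a minimal sketch, assuming the identity hash is drawn from a per-thread Marsaglia xor-shift generator. The names `ThreadHashState`, `Object`, `identity_hash` and `hashes_depend_on_thread`, as well as the seed values, are hypothetical and are not HotSpot's. The only point is that whichever thread asks first decides the value, and the value then sticks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-thread generator state (not HotSpot's Thread class).
struct ThreadHashState {
  uint32_t x, y, z, w;
};

// One step of a Marsaglia xor-shift generator over the given thread's state.
inline uint32_t next_hash(ThreadHashState& s) {
  uint32_t t = s.x;
  t ^= (t << 11);
  s.x = s.y;
  s.y = s.z;
  s.z = s.w;
  uint32_t v = s.w;
  v = (v ^ (v >> 19)) ^ (t ^ (t >> 8));
  s.w = v;
  return v;
}

// Hypothetical object header: 0 means "no hash assigned yet".
struct Object {
  uint32_t hash = 0;
};

// The first caller's thread state decides the hash; later callers see the same value.
inline uint32_t identity_hash(Object& o, ThreadHashState& current) {
  if (o.hash == 0) {
    o.hash = next_hash(current);
  }
  // Simplification: this sketch does not remap a generated value of 0.
  assert(o.hash != 0 && "sketch does not remap a generated 0");
  return o.hash;
}

// Returns true if the hash is sticky and depends on which thread asked first.
inline bool hashes_depend_on_thread() {
  ThreadHashState main_thread{1, 2, 3, 4};   // illustrative seeds only
  ThreadHashState other_thread{5, 6, 7, 8};  // illustrative seeds only
  Object a, b;
  uint32_t ha = identity_hash(a, main_thread);
  uint32_t hb = identity_hash(b, other_thread);
  // Asking again from a different thread does not change an assigned hash.
  bool sticky = identity_hash(a, other_thread) == ha;
  return sticky && ha != hb;
}
```

If a hash-ordered container is later filled with such objects, its iteration order inherits whatever thread interleaving happened when the hashes were first requested, which is one plausible reading of the unstable HashMap ordering described above.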
------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From aph at openjdk.java.net Thu Mar 10 14:05:49 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 10 Mar 2022 14:05:49 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v12] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 11:38:34 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. >> - Test creation using new IR testing framework. >> >> Following are the performance number of a JMH micro included with the patch >> >> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) >> >> >> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio >> -- | -- | -- | -- | -- | -- | -- | -- >> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 >> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 >> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 >> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 >> >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: > > - 8279508: Preventing domain switch-over penalty for Math.round(float) and constraining unrolling to prevent code bloating. > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 > - 8279508: Removing +LogCompilation flag. > - 8279508: Review comments resolved.` > - 8279508: Adding descriptive comments. > - 8279508: Review comments resolved. > - 8279508: Review comments resolved. 
> - 8279508: Fixing for windows failure. > - 8279508: Adding few descriptive comments. > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 > - ... and 5 more: https://git.openjdk.java.net/jdk/compare/d07f7c76...547f4e31 test/micro/org/openjdk/bench/java/math/FpRoundingBenchmark.java line 114: > 112: for (int i = 0; i < TESTSIZE; i++) { > 113: ResF[i] = Math.round(FargV1[i]); > 114: } I think that this is wrong: you should not be storing the result into a float array because it requires an extra integer->float conversion, which distorts the timings. You need `resI` and `resL` for the results of `Math.round`. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From aph at openjdk.java.net Thu Mar 10 14:10:52 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 10 Mar 2022 14:10:52 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v12] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 11:38:34 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. >> - Test creation using new IR testing framework. 
>> >> Following are the performance number of a JMH micro included with the patch >> >> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) >> >> >> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio >> -- | -- | -- | -- | -- | -- | -- | -- >> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 >> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 >> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 >> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 >> >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: > > - 8279508: Preventing domain switch-over penalty for Math.round(float) and constraining unrolling to prevent code bloating. > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 > - 8279508: Removing +LogCompilation flag. > - 8279508: Review comments resolved.` > - 8279508: Adding descriptive comments. > - 8279508: Review comments resolved. > - 8279508: Review comments resolved. > - 8279508: Fixing for windows failure. > - 8279508: Adding few descriptive comments. > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 > - ... and 5 more: https://git.openjdk.java.net/jdk/compare/d07f7c76...547f4e31 test/micro/org/openjdk/bench/java/math/FpRoundingBenchmark.java line 70: > 68: } > 69: > 70: FargV1 = new float[TESTSIZE]; `FargV1` is not initialized. You need to set `i = 0;` here. 
test/micro/org/openjdk/bench/java/math/FpRoundingBenchmark.java line 78: > 76: > 77: for (; i < TESTSIZE; i++) { > 78: FargV1[i] = r.nextFloat()*TESTSIZE; This is an unrealistically narrow range of values. I'd use Suggestion: FargV1[i] = Float.intBitsToFloat(r.nextInt()); ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From dholmes at openjdk.java.net Thu Mar 10 14:12:51 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 10 Mar 2022 14:12:51 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Thu, 10 Mar 2022 12:31:06 GMT, Anton Kozlov wrote: > The signal mask is per-thread, and a native thread may block the JVM signal. @AntonKozlov the signal from safefetch (if it is not safe) is a SIGSEGV or SIGBUS. If these signals happen to be blocked and we raise the signal synchronously then we are in undefined behaviour territory. So I guess in that sense yes safefetch is not guaranteed to be safe. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From akozlov at openjdk.java.net Thu Mar 10 14:12:51 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 10 Mar 2022 14:12:51 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Thu, 10 Mar 2022 12:41:11 GMT, Thomas Stuefe wrote: > You are saying that Java Threads may write too. And CompilerThreads, in the case of SafeFetch at least, may run generated code. So switching has to be done as a stack mark. Have I gotten this right? Right. > Depending on what the pthread library call does, and if it's a real function call into a library, it would be more expensive than that. Yes, unfortunately we need something like this. 
> > If we compare approaches in general (not only SafeFetch), the Thread is already in hands, so we should to compare a read of the field from a C++ object, and the read of a TLS variable. The former could not be slower than the latter. > > To me most of the invocations of `ThreadWXEnable` seem to use `Thread::current()`. Only those who retrieve the thread from the JNI environment don't. Right, JNI env is used e.g. in interfaceSupport.hpp, where most VM entries are defined. I found only a few instances of ThreadWXEnable that receive Thread::current() directly as the argument. In the rest, the Thread is there somewhere in the context. > > IIRC, TLS, at least on Linux, lives at the front of the thread stack, so accessing it should be quite cheap. > > I see the performance point of an option to pass in Thread* in case one already has it. I dislike it a bit because it gives the illusion that you could pass in arbitrary threads when in fact you must pass in Thread::current. But an assertion could help clarifying here. There is an assert in Thread::enable_wx, whose implementation is actually unable to handle anything except the current thread. > > Is it possible to change SafeFetch only? Switch to WXExec before calling the stub and switch WXWrite back unconditionally? We won't need to provide assert in ThreadWXEnable. But SafeFetch can check the assumption with assert via Thread, if it exists. > > But SafeFetch could be used from outside code as well as VM code. In case of the latter, prior state can either be WXWrite or WXExec. It needs to restore the prior state after the call. I'm not sure I understand what the "outside code" is. SafeFetch is a private HotSpot function; it cannot be linked with non-JVM code, can it? > To summarize the different proposals: > > * you propose to use Thread* when available and assume WXWrite as prior state when not.
You argue that if there is no Thread::current, we are not a VM thread and we should need no nesting, so a simple switchback to wxwrite should suffice after leaving SafeFetch, right? So far I like the another approach more, that is, always assume WXWrite and use the Thread only to assert the state. But I did not understand your concern above. > * Johannes proposes to use TLS, and just always support nesting, regardless of who calls. > > What I like about Johannes proposal is that its simple. It has fewer dependencies on VM infrastructure and we can mostly just hide it in the platform tree. I'm not opposing the refactoring, it has some advantages, but I'd want to separate functional change from the refactoring. This would aid backporting, at least. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From darcy at openjdk.java.net Thu Mar 10 14:32:46 2022 From: darcy at openjdk.java.net (Joe Darcy) Date: Thu, 10 Mar 2022 14:32:46 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v9] In-Reply-To: <2jFjnftd7VluGsxgp8BK0vgHA68VrgGREj0fk7F6Dhk=.e40ddcaa-5a31-4115-976d-5f43e94b8ccf@github.com> References: <2jFjnftd7VluGsxgp8BK0vgHA68VrgGREj0fk7F6Dhk=.e40ddcaa-5a31-4115-976d-5f43e94b8ccf@github.com> Message-ID: <7lwsCvdSjkvDYJNwuA7fVPrWFUbzchuwx0Z3IID5VZw=.0c00c3d7-2106-40df-88bc-38bf7e2655f9@github.com> On Fri, 4 Mar 2022 19:04:40 GMT, Jatin Bhateja wrote: >> IMO RoundTests should have a explicit @run tag without any VM options as well. >> >> Do the added VM options run on all platforms in question? What is the approximate time to run the test run compared to before? > > Hi @jddarcy , > > Test has been modified on the same lines using generic options which manipulate compilation thresholds and agnostic to target platforms. > > * @run main/othervm -XX:Tier3CompileThreshold=100 -XX:CompileThresholdScaling=0.01 -XX:+TieredCompilation RoundTests > > Verified that RoundTests::test* methods gets compiled by c2. 
> Test execution time with and without change is almost same ~7.80sec over Skylake-server. > > Regards To be more explicit, the existing RoundTests.java test runs in a fraction of a second. The updated test runs many times slower, even if now under 10 second, at least on some platforms. Can something closer to the original performance be restored? As a tier 1 library test, these tests are run quite frequently. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From duke at openjdk.java.net Thu Mar 10 14:58:45 2022 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Thu, 10 Mar 2022 14:58:45 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v4] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 14:13:38 GMT, Boris Ulasevich wrote: >> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH. >> >> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. >> >> As a side effect, the performance of some tests is slightly improved: >> ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` >> >> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH > > Boris Ulasevich has updated the pull request incrementally with two additional commits since the last revision: > > - moving nops out of far_jump > - minor renaming Can we have tests for this? 
src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1078: > 1076: static bool is_trampoline_needed() { > 1077: return ReservedCodeCacheSize > branch_range; > 1078: } What about to keep `far_branches` and to use it in `is_trampoline_needed` and other places? ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From dcubed at openjdk.java.net Thu Mar 10 16:02:43 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 10 Mar 2022 16:02:43 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v3] In-Reply-To: References: Message-ID: On Thu, 10 Mar 2022 01:31:11 GMT, David Holmes wrote: >> Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. >> >> This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Move new sentence to 'rationale' paragraph. Still thumbs up. ------------- Marked as reviewed by dcubed (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7720 From kbarrett at openjdk.java.net Thu Mar 10 16:14:57 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Mar 2022 16:14:57 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v3] In-Reply-To: References: Message-ID: On Thu, 10 Mar 2022 01:31:11 GMT, David Holmes wrote: >> Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. >> >> This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Move new sentence to 'rationale' paragraph. doc/hotspot-style.md line 474: > 472: VM's Native Memory Tracking (NMT) feature. Typically, uses of the global > 473: operator new are inadvertent and therefore often associated with memory > 474: leaks. Thanks for moving this. doc/hotspot-style.md line 669: > 667: initialization. Dynamic initialization and destruction of > 668: namespace-scoped thread local variables also has the same ordering > 669: problems as for ordinary namespace-scoped variables. So we avoid use of Should be s/namespace-scoped/non-local/ (two places). This was overlooked by JDK-8272691. (Mea culpa.) And yes, "non-local thread local" (or "non-local thread-local") looks a little weird. 
The standard (almost) doesn't use "thread-local" as a term, instead talking about variables with "thread storage duration". doc/hotspot-style.md line 671: > 669: problems as for ordinary namespace-scoped variables. So we avoid use of > 670: `thread_local` in general, limiting its use to only those cases where dynamic > 671: initialization and destruction are essential. See Consider s/and/or/. ------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From psandoz at openjdk.java.net Thu Mar 10 16:35:44 2022 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Thu, 10 Mar 2022 16:35:44 GMT Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap In-Reply-To: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> Message-ID: On Wed, 9 Mar 2022 12:33:49 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > As discussion at https://github.com/openjdk/jdk/pull/7721 , I change the code in VectorAPI. > Please help review. I think its OK, to follow up after this with some tests for "polluted" profiles of vectors (which may expose more issues). Given the scope of the fix i would recommend adding a comment in each place as to why we don't switch over the enum constant itself (note we are very careful in other performance critical areas of the JDK to avoid this e.g. in VarHandle code). 
------------- PR: https://git.openjdk.java.net/jdk/pull/7757 From stuefe at openjdk.java.net Thu Mar 10 18:07:40 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 10 Mar 2022 18:07:40 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: <2VVnQ4RiNCtAuWXQ_d-vgj-8uejqKTdAWXwxKJUNix4=.6d88041c-2332-452d-9e70-b9429940d1f0@github.com> On Thu, 10 Mar 2022 12:31:06 GMT, Anton Kozlov wrote: > > Signal handlers are per-process not per-thread, so a thread need not be attached to the VM for our signal handlers to get involved - that is why they call Thread::current_or_null_safe(). > > Oh, right, thanks. I was too concentrated on thinking another platforms like windows, that missed the example will work for *nix. > > A general question to the bug. The signal mask is per-thread, and a native thread may block the JVM signal. I think safefetch will fail, if somehow we manage to call it on this thread (without jsig). So I'm not sure the safefetch is really safe on all platforms and in all contexts, that is, it always recovers from the read failure if called on a random thread. To expand on @dholmes-ora answer: blocking SIGSEGV and SIGBUS - or other synchronous error signals like SIGFPE - and then triggering said signal is UB. What happens is OS-dependent. I saw processes vanishing, or hang, or core. It makes sense, since what is the kernel supposed to do. It cannot deliver the signal, and deferring it would require returning to the faulting instruction, that would just re-fault. For some more details see e.g. https://bugs.openjdk.java.net/browse/JDK-8252533 > > Is there a crash that is fixed by the change? I just spotted it is an enhancement, not a bug. Just trying to understand the problem. 
Yes, this issue is a breakout from https://bugs.openjdk.java.net/browse/JDK-8282306, where we'd like to use SafeFetch to make stack walking in AsyncGetCallTrace more robust. AGCT is called from the signal handler, and it may run in any number of situations (e.g. in foreign threads, or threads which are in the process of getting dismantled, etc). Another situation is error handling itself. When writing an hs-err file, we use SafeFetch to do carefully tiptoe around the possibly corrupt VM state. If the original crash happened in a foreign thread, we still want some of these reports to work (e.g. dumping register content or printing stacks). So SafeFetch should be as robust as possible. > I'm not opposing the refactoring, it has some advantages, but I'd want to separate functional change from the refactoring. This would aid backporting, at least. I agree that a minimal patch would be good. I feel partly guilty for this expanding discussion. I'm fine with a minimal change, without any refactoring, in whatever form @parttimenerd choses - be it OS thread local or the approach proposed by you, @AntonKozlov. Cheers Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From stuefe at openjdk.java.net Thu Mar 10 18:10:44 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 10 Mar 2022 18:10:44 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: <9abtTyWumeHahJhxZnL_GX3s9_TdDAZ_e8b7OffYfoI=.c3b0b765-d9f4-4db3-bbd0-48c3598c7aa5@github.com> On Thu, 10 Mar 2022 14:09:24 GMT, Anton Kozlov wrote: > > > Is it possible to change SafeFetch only? Switch to WXExec before calling the stub and switch WXWrite back unconditionally? We won't need to provide assert in ThreadWXEnable. But SafeFetch can check the assumption with assert via Thread, if it exists. 
> > > > > > But SafeFetch could be used from outside code as well as VM code. In case of the latter, prior state can either be WXWrite or WXExec. It needs to restore the prior state after the call. > > I'm not sure I understand what is the "outside code". The SafeFetch is the private hotspot function, it cannot be linked with non-JVM code, isn't it? Sorry for being imprecise. I meant SafeFetch is triggered from within a signal handler that runs on a foreign thread. E.g. AGCT or error handling. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From jbhateja at openjdk.java.net Thu Mar 10 18:21:31 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Thu, 10 Mar 2022 18:21:31 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v13] In-Reply-To: References: Message-ID: <4_HYBZmpNv3PRkPgYlo51m_YAyKeLB6m1hlg_jX9fMY=.69c82d73-0cd4-419c-96d0-99be51e58e15@github.com> > Summary of changes: > - Intrinsify Math.round(float) and Math.round(double) APIs. > - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. > - Test creation using new IR testing framework. > > Following are the performance number of a JMH micro included with the patch > > Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) > > > Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio > -- | -- | -- | -- | -- | -- | -- | -- > FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 > FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 > FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. 
> > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8279508: Review comments resolution. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7094/files - new: https://git.openjdk.java.net/jdk/pull/7094/files/547f4e31..fcb73212 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=11-12 Stats: 13 lines in 3 files changed: 6 ins; 3 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7094.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094 PR: https://git.openjdk.java.net/jdk/pull/7094 From iklam at openjdk.java.net Thu Mar 10 19:29:43 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 10 Mar 2022 19:29:43 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: On Thu, 10 Mar 2022 13:51:56 GMT, Magnus Ihse Bursie wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed zero build > > I think he already did. I'm quoting: > >> However, the CDS archive also contains a heap dump, which includes Java HashMaps. If I allow those 3 Java threads to start, some HashMaps in the module graph will have unstable ordering. I think the reason is concurrent thread execution causes unstable assignment of the identity_hash for objects in the heap dump. > @magicus I think we need @iklam to weigh in here and explain exactly what the "heap dump" consists of and how not running those threads affects its contents. Presently the heap dump is potentially different on each run, IIUC, only due to the order of its contents, not the contents themselves. CDS doesn't dump the entire Java heap. Instead, it dumps only a selected portion of the Java heap. For example, the module graph. 
The contents of the dumped objects are always the same, except that the identity hashcode may be different if multiple threads are executed. The identity hashcode is computed here, and its value is "sticky" to the first thread that tries to get the hashcode for an object.

    static inline intptr_t get_next_hash(Thread* current, oop obj) {
      intptr_t value = 0;
      if (hashCode == 0) {
        ...
      } else if (hashCode == 4) {
        ...
      } else {
        // default hashCode is 5:
        // Marsaglia's xor-shift scheme with thread-specific state
        // This is probably the best overall implementation -- we'll
        // likely make this the default in future releases.
        unsigned t = current->_hashStateX;
        t ^= (t << 11);
        current->_hashStateX = current->_hashStateY;
        current->_hashStateY = current->_hashStateZ;
        current->_hashStateZ = current->_hashStateW;
        unsigned v = current->_hashStateW;
        v = (v ^ (v >> 19)) ^ (t ^ (t >> 8));
        current->_hashStateW = v;
        value = v;
      }

So, when the main Java thread tries to store an object `O` into a hashtable inside the module graph, if the hashcode of `O` has already been computed by a non-main thread, then the module graph will have unstable contents.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7748

From iklam at openjdk.java.net Thu Mar 10 19:37:40 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Thu, 10 Mar 2022 19:37:40 GMT
Subject: RFR: 8253495: CDS generates non-deterministic output [v2]
In-Reply-To:
References:
Message-ID: <1E_zgqBQtByT4cXyk_dlbXGRVAQpCI6jlKXFIYovvVU=.3597bbf1-2ae7-4d41-9bef-79f77c90e8d3@github.com>

On Wed, 9 Mar 2022 07:51:46 GMT, Thomas Stuefe wrote:

>> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fixed zero build
>
> src/hotspot/share/prims/jvm.cpp line 2887:
>
>> 2885:     return;
>> 2886:   }
>> 2887: #endif
>
> Should we do this for jni_AttachCurrentThread too?
This hasn't been necessary for me because jni_AttachCurrentThread is not called during "java -Xshare:dump", which executes under a very strict condition and doesn't normally allow arbitrary JNI libraries to be loaded. ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From iklam at openjdk.java.net Thu Mar 10 19:44:39 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 10 Mar 2022 19:44:39 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: On Thu, 10 Mar 2022 13:51:56 GMT, Magnus Ihse Bursie wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed zero build > > I think he already did. I'm quoting: > >> However, the CDS archive also contains a heap dump, which includes Java HashMaps. If I allow those 3 Java threads to start, some HashMaps in the module graph will have unstable ordering. I think the reason is concurrent thread execution causes unstable assignment of the identity_hash for objects in the heap dump. > @magicus the issue is not the list of classes dumped, or their format in the dump. As Ioi indicated that list is fixed. The issue is with the heap dump part of the archive. Running these other threads affects the heap so by not running them with end up with a different heap. So the question is whether there is anything about having a different heap dumped that we need to be concerned about. We dump the heap into the archive for a reason and this changes what we dump, To be clear, if multiple threads are running, classes could be loaded in a different order and the symbols will have different orders. This would cause the vtables in Klass objects to be laid out differently. Trying to fix this is very difficult. I have a different patch that makes it work but it's just too complicated. So, disabling the threads is also necessary for (easily) sorting the metaspace objects. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/7748

From dholmes at openjdk.java.net  Thu Mar 10 21:28:43 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 10 Mar 2022 21:28:43 GMT
Subject: RFR: 8253495: CDS generates non-deterministic output [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 10 Mar 2022 19:41:03 GMT, Ioi Lam wrote:

>> I think he already did. I'm quoting:
>>
>>> However, the CDS archive also contains a heap dump, which includes Java HashMaps. If I allow those 3 Java threads to start, some HashMaps in the module graph will have unstable ordering. I think the reason is concurrent thread execution causes unstable assignment of the identity_hash for objects in the heap dump.
>
>> @magicus the issue is not the list of classes dumped, or their format in the dump. As Ioi indicated that list is fixed. The issue is with the heap dump part of the archive. Running these other threads affects the heap so by not running them we end up with a different heap. So the question is whether there is anything about having a different heap dumped that we need to be concerned about. We dump the heap into the archive for a reason and this changes what we dump.
>
> To be clear, if multiple threads are running, classes could be loaded in a different order and the symbols will have different orders. This would cause the vtables in Klass objects to be laid out differently. Trying to fix this is very difficult. I have a different patch that makes it work but it's just too complicated.
>
> So, disabling the threads is also necessary for (easily) sorting the metaspace objects.

@iklam thanks for clarifying.
-------------

PR: https://git.openjdk.java.net/jdk/pull/7748

From iklam at openjdk.java.net  Thu Mar 10 23:24:41 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Thu, 10 Mar 2022 23:24:41 GMT
Subject: RFR: 8253495: CDS generates non-deterministic output [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 9 Mar 2022 07:47:19 GMT, Thomas Stuefe wrote:

>> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Fixed zero build
>
> src/hotspot/share/utilities/hashtable.hpp line 42:
>
>> 40:
>> 41:   LP64_ONLY(unsigned int _gap;)
>> 42:
>
> For 64-bit, you now lose packing potential in the theoretical case the following payload does not have to be aligned to 64 bit. E.g. for T=char, where the whole entry would fit into 8 bytes. Probably does not matter as long as entries are allocated individually from C-heap which is a lot more wasteful anyway.
>
> For 32-bit, I think you may have the same problem if the payload starts with a uint64_t. Would that not be aligned to a 64-bit boundary too? Whether or not you build on 64-bit?
>
> I think setting the memory, or at least the first 8..16 bytes, of the entry to zero in BasicHashtable::new_entry could be more robust:
>
> (16 bytes in case the payload starts with a long double but that may be overthinking it :)
>
>
> template <MEMFLAGS F> BasicHashtableEntry<F>* BasicHashtable<F>::new_entry(unsigned int hashValue) {
>   char* p = NEW_C_HEAP_ARRAY(char, this->entry_size(), F);
>   ::memset(p, 0, MIN2(this->entry_size(), 16)); // needed for reproducibility
>   BasicHashtableEntry<F>* entry = ::new (p) BasicHashtableEntry<F>(hashValue);
>   return entry;
> }
>
> If you are worried about performance, this may also be controlled by a template parameter, and then you do it just for the system dictionary.

Thanks for pointing this out. I ran more tests and found that on certain platforms, there are other structures that have problems with uninitialized gaps.

I ended up changing `os::malloc()` to zero the buffer when running with -Xshare:dump. Hopefully one extra check of `if (DumpSharedSpaces)` doesn't matter too much for regular VM executions because `os::malloc()` already has a high overhead.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7748

From iklam at openjdk.java.net  Thu Mar 10 23:20:26 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Thu, 10 Mar 2022 23:20:26 GMT
Subject: RFR: 8253495: CDS generates non-deterministic output [v3]
In-Reply-To:
References:
Message-ID: <0N6pu8Jh3q0V0N7cxQtV7GWcszvjiQzX5W_4ZryJPQY=.4b615784-dd08-426a-8a3e-11bad1843ce8@github.com>

> This patch makes the result of "java -Xshare:dump" deterministic:
> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp
> - Fixed a problem in hashtable ordering in heapShared.cpp
> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it.
> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh
>
> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).
>
> Testing under way:
> - tier1~tier5
> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc).
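Why an alignment gap matters for a byte-for-byte reproducible dump can be shown with a small sketch. This is not what the patch does (the patch zeroes memory at allocation time); it uses a different, fully portable technique, copying each field into a zeroed image. `Entry` and `write_image` are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical entry layout: on common 64-bit ABIs the compiler inserts
// 4 padding bytes between `hash` and `payload`.
struct Entry {
  std::uint32_t hash;
  std::uint64_t payload;
};

// Write a deterministic image of `e` into `out` (at least sizeof(Entry) bytes).
// The buffer is zeroed first, then each field is copied to its own offset,
// so padding bytes in the image are always 0 regardless of what `out` held.
void write_image(const Entry& e, unsigned char* out) {
  std::memset(out, 0, sizeof(Entry));
  std::memcpy(out + offsetof(Entry, hash), &e.hash, sizeof(e.hash));
  std::memcpy(out + offsetof(Entry, payload), &e.payload, sizeof(e.payload));
}
```

Copying the whole struct with a single `memcpy` would instead carry over whatever bytes happened to be in the gap, which is the source of the "random bits" mentioned in the description above.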
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:

  improvement zeroing of alignment gaps

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7748/files
  - new: https://git.openjdk.java.net/jdk/pull/7748/files/44db40f1..6974021f

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=01-02

  Stats: 8 lines in 2 files changed: 3 ins; 2 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7748.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748

PR: https://git.openjdk.java.net/jdk/pull/7748

From jiefu at openjdk.java.net  Fri Mar 11 00:09:43 2022
From: jiefu at openjdk.java.net (Jie Fu)
Date: Fri, 11 Mar 2022 00:09:43 GMT
Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap
In-Reply-To:
References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com>
Message-ID:

On Thu, 10 Mar 2022 16:32:19 GMT, Paul Sandoz wrote:

> I think it's OK to follow up after this with some tests for "polluted" profiles of vectors (which may expose more issues). Given the scope of the fix I would recommend adding a comment in each place as to why we don't switch over the enum constant itself (note we are very careful in other performance critical areas of the JDK to avoid this e.g. in VarHandle code).

Hi @PaulSandoz,

JDK-8282874 is labeled as a Bug.
But there is no description of how to reproduce it and verify the fix.

I think providing a jtreg test would also help to find more bugs of this kind.
That's why I strongly recommend adding a jtreg test for this fix.

Thanks.
-------------

PR: https://git.openjdk.java.net/jdk/pull/7757

From yyang at openjdk.java.net  Fri Mar 11 02:30:42 2022
From: yyang at openjdk.java.net (Yi Yang)
Date: Fri, 11 Mar 2022 02:30:42 GMT
Subject: Integrated: 8282883: Use JVM_LEAF to avoid ThreadStateTransition for some simple JVM entries
In-Reply-To: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com>
References: <5EN1NGCQbWHzmXlngVdIHUQfkTupEAVQ9CTNo3GHCRw=.9280b440-1519-4ba3-9c0e-4780d83b1269@github.com>
Message-ID:

On Wed, 9 Mar 2022 15:28:29 GMT, Yi Yang wrote:

> Some existing JVM_ENTRY routines are behavioral simple, they do not lock, GC or throw exceptions, we could use JVM_LEAF instead of JVM_ENTRY, to avoid ThreadStateTransition and safepoint checks.

This pull request has now been integrated.

Changeset: a5a1a32d
Author:    Yi Yang
URL:       https://git.openjdk.java.net/jdk/commit/a5a1a32db65b98f0d7bae20cf054be2fbbf2cf3a
Stats:     8 lines in 1 file changed: 0 ins; 0 del; 8 mod

8282883: Use JVM_LEAF to avoid ThreadStateTransition for some simple JVM entries

Reviewed-by: dholmes, shade

-------------

PR: https://git.openjdk.java.net/jdk/pull/7760

From pli at openjdk.java.net  Fri Mar 11 02:34:50 2022
From: pli at openjdk.java.net (Pengfei Li)
Date: Fri, 11 Mar 2022 02:34:50 GMT
Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v4]
In-Reply-To:
References:
Message-ID:

On Mon, 21 Feb 2022 06:19:26 GMT, Pengfei Li wrote:

>> ### Background
>>
>> Post loop vectorization is a C2 compiler optimization in an experimental
>> VM feature called PostLoopMultiversioning. It transforms the range-check
>> eliminated post loop to a 1-iteration vectorized loop with vector mask.
>> This optimization was contributed by Intel in 2016 to support x86 AVX512
>> masked vector instructions. However, it was disabled soon after an issue
>> was found. Due to insufficient maintenance in these years, multiple bugs
>> have been accumulated inside. But we (Arm) still think this is a useful
>> framework for vector mask support in C2 auto-vectorized loops, for both
>> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable
>> post loop vectorization.
>>
>> ### Changes in this patch
>>
>> This patch reworks post loop vectorization. The most significant change
>> is removing vector mask support in C2 x86 backend and re-implementing
>> it in the mid-end. With this, we can re-enable post loop vectorization
>> for platforms other than x86.
>>
>> Previous implementation hard-codes x86 k1 register as a reserved AVX512
>> opmask register and defines two routines (setvectmask/restorevectmask)
>> to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes
>> AVX512 instructions as unmasked by default, generated vector masks are
>> no longer used in AVX512 vector instructions. To fix incorrect codegen
>> and add vector mask support for more platforms, we turn to add a vector
>> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode
>> to generate a mask and replace all Load/Store nodes in the post loop
>> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This
>> IR form is exactly the same to those which are used in VectorAPI mask
>> support. For now, we only add mask inputs for Load/Store nodes because
>> we don't have reduction operations supported in post loop vectorization.
>> After this change, the x86 k1 register is no longer reserved and can be
>> allocated when PostLoopMultiversioning is enabled.
>>
>> Besides this change, we have fixed a compiler crash and five incorrect
>> result issues with post loop vectorization.
>>
>> **I) C2 crashes with segmentation fault in strip-mined loops**
>>
>> Previous implementation was done before C2 loop strip-mining was merged
>> into JDK master so it didn't take strip-mined loops into consideration.
>> In C2's strip mined loops, post loop is not the sibling of the main loop
>> in ideal loop tree. Instead, it's the sibling of the main loop's parent.
>> This patch fixed a SIGSEGV issue caused by NULL pointer when locating
>> post loop from strip-mined main loop.
>>
>> **II) Incorrect result issues with post loop vectorization**
>>
>> We have also fixed five incorrect vectorization issues. Some of them are
>> hidden deep and can only be reproduced with corner cases. These issues
>> have a common cause that it assumes the post loop can be vectorized if
>> the vectorization in corresponding main loop is successful. But in many
>> cases this assumption is wrong. Below are details.
>>
>> - **[Issue-1] Incorrect vectorization for partial vectorizable loops**
>>
>> This issue can be reproduced by below loop where only some operations in
>> the loop body are vectorizable.
>>
>>     for (int i = 0; i < 10000; i++) {
>>       res[i] = a[i] * b[i];
>>       k = 3 * k + 1;
>>     }
>>
>> In the main loop, superword can work well if parts of the operations in
>> loop body are not vectorizable since those parts can be unrolled only.
>> But for post loops, we don't create vectors through combining scalar IRs
>> generated from loop unrolling. Instead, we are doing scalars to vectors
>> replacement for all operations in the loop body. Hence, all operations
>> should be either vectorized together or not vectorized at all. To fix
>> this kind of cases, we add an extra field "_slp_vector_pack_count" in
>> CountedLoopNode to record the eventual count of vector packs in the main
>> loop. This value is then passed to post loop and compared with post loop
>> pack count. Vectorization will be bailed out in post loop if it creates
>> more vector packs than in the main loop.
>>
>> - **[Issue-2] Incorrect result in loops with growing-down vectors**
>>
>> This issue appears with growing-down vectors, that is, vectors that grow
>> to smaller memory address as the loop iterates. It can be reproduced by
>> below counting-up loop with negative scale value in array index.
>>
>>     for (int i = 0; i < 10000; i++) {
>>       a[MAX - i] = b[MAX - i];
>>     }
>>
>> Cause of this issue is that for a growing-down vector, generated vector
>> mask value has reversed vector-lane order so it masks incorrect vector
>> lanes. Note that if negative scale value appears in counting-down loops,
>> the vector will be growing up. With this rule, we fix the issue by only
>> allowing positive array index scales in counting-up loops and negative
>> array index scales in counting-down loops. This check is done with the
>> help of SWPointer by comparing scale values in each memory access in the
>> loop with loop stride value.
>>
>> - **[Issue-3] Incorrect result in manually unrolled loops**
>>
>> This issue can be reproduced by below manually unrolled loop.
>>
>>     for (int i = 0; i < 10000; i += 2) {
>>       c[i] = a[i] + b[i];
>>       c[i + 1] = a[i + 1] * b[i + 1];
>>     }
>>
>> In this loop, operations in the 2nd statement duplicate those in the 1st
>> statement with a small memory address offset. Vectorization in the main
>> loop works well in this case because C2 does further unrolling and pack
>> combination. But we cannot vectorize the post loop through replacement
>> from scalars to vectors because it creates duplicated vector operations.
>> To fix this, we restrict post loop vectorization to loops with stride
>> values of 1 or -1.
>>
>> - **[Issue-4] Incorrect result in loops with mixed vector element sizes**
>>
>> This issue is found after we enable post loop vectorization for AArch64.
>> It's reproducible by multiple array operations with different element
>> sizes inside a loop. On x86, there is no issue because the values of x86
>> AVX512 opmasks only depend on which vector lanes are active. But AArch64
>> is different - the values of SVE predicates also depend on lane size of
>> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element
>> sizes, we should use different vector masks. For now, we just support
>> loops with only one vector element size, i.e., "int + float" vectors in
>> a single loop is ok but "int + double" vectors in a single loop is not
>> vectorizable. This fix also enables subword vectors support to make all
>> primitive type array operations vectorizable.
>>
>> - **[Issue-5] Incorrect result in loops with potential data dependence**
>>
>> This issue can be reproduced by below corner case on AArch64 only.
>>
>>     for (int i = 0; i < 10000; i++) {
>>       a[i] = x;
>>       a[i + OFFSET] = y;
>>     }
>>
>> In this case, two stores in the loop have data dependence if the OFFSET
>> value is smaller than the vector length. So we cannot do vectorization
>> through replacing scalars to vectors. But the main loop vectorization
>> in this case is successful on AArch64 because AArch64 has partial vector
>> load/store support. It splits vector fill with different values in lanes
>> to several smaller-sized fills. In this patch, we add additional data
>> dependence check for this kind of cases. The check is also done with the
>> help of SWPointer class. In this check, we require that every two memory
>> accesses (with at least one store) of the same element type (or subword
>> size) in the loop has the same array index expression.
>>
>> ### Tests
>>
>> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with
>> experimental VM option "PostLoopMultiversioning" turned on. We found no
>> issue in all tests. We notice that those existing cases are not enough
>> because some of above issues are not spotted by them. We would like to
>> add some new cases but we found existing vectorization tests are a bit
>> cumbersome - golden results must be pre-calculated and hard-coded in the
>> test code for correctness verification. Thus, in this patch, we propose
>> a new vectorization testing framework.
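The data dependence described under Issue-5 can be checked with a small simulation. This is only a sketch that models "replace each scalar statement with one vector operation" as two inner lane loops; it does not use real vector instructions, and the function names and the parameters `n`, `offset` and `vl` (lanes per vector) are made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Scalar semantics of: for (i = 0; i < n; i++) { a[i] = x; a[i + offset] = y; }
std::vector<int> scalar_stores(std::size_t n, std::size_t offset, int x, int y) {
  std::vector<int> a(n + offset, 0);
  for (std::size_t i = 0; i < n; i++) {
    a[i] = x;
    a[i + offset] = y;
  }
  return a;
}

// Statement-by-statement replacement with vl-lane "vectors": per block, all
// lanes of the first store are written before any lane of the second store.
std::vector<int> blocked_stores(std::size_t n, std::size_t offset,
                                std::size_t vl, int x, int y) {
  std::vector<int> a(n + offset, 0);
  for (std::size_t i = 0; i < n; i += vl) {
    for (std::size_t lane = 0; lane < vl && i + lane < n; lane++) {
      a[i + lane] = x;            // models one vector store of x
    }
    for (std::size_t lane = 0; lane < vl && i + lane < n; lane++) {
      a[i + offset + lane] = y;   // models one vector store of y
    }
  }
  return a;
}
```

With `n = 8`, `vl = 4`, `x = 1`, `y = 2`, an `offset` of 1 makes the two versions disagree (the blocked version leaves `y` in elements the scalar loop later overwrites with `x`), while an `offset` of 4 gives identical arrays, matching the statement that the dependence exists when OFFSET is smaller than the vector length.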
>>
>> Our new framework brings a simpler way to add new cases. For a new test
>> case, we only need to create a new method annotated with "@Test". The
>> test runner will invoke each annotated method twice automatically. First
>> time it runs in the interpreter and second time it's forced compiled by
>> C2. Then the two return results are compared. So in this framework each
>> test method should return a primitive value or an array of primitives.
>> In this way, no extra verification code for vectorization correctness is
>> required. This test runner is still jtreg-based and takes advantages of
>> the jtreg WhiteBox API, which enables test methods running at specific
>> compilation levels. Each test class inside is also jtreg-based. It just
>> need to inherit from the test runner class and run with two additional
>> options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI".
>>
>> ### Summary & Future work
>>
>> In this patch, we reworked post loop vectorization. We made it platform
>> independent and fixed several issues inside. We also implemented a new
>> vectorization testing framework with many test cases inside. Meanwhile,
>> we did some code cleanups.
>>
>> This patch only touches C2 code guarded with PostLoopMultiversioning,
>> except a few data structure changes. So, there's no behavior change when
>> experimental VM option PostLoopMultiversioning is off. Also, to reduce
>> risks, we still propose to keep post loop vectorization experimental for
>> now. But if it receives positive feedback, we would like to change it to
>> non-experimental in the future.
>
> Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
> The pull request contains six additional commits since the last revision:
>
>  - Merge branch 'master' into postloop
>
>    Change-Id: I503edb75f0f626569c776416bfef09651935979c
>  - Update copyright year and rename a function
>
>    Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb
>  - Merge branch 'master' into postloop
>
>    Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4
>  - Fix issues in newly added test framework
>
>    Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1
>  - Merge branch 'master' into postloop
>
>    Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6
>  - 8183390: Fix and re-enable post loop vectorization

Thanks Roland for looking at this. Firstly I'd like to answer your general questions.

> fairly similar. Doesn't/couldn't the logic from issue 5 protect from issue 3?

True, the SWPointer logic can also protect from issue 3. But I believe keeping loop stride check for issue 3 has no harm and post loop vectorization can bail out earlier with this additional check.

> I see no reason not to proceed with this change as it attempts to fix something that's broken and nobody uses, doesn't enable it by default and doesn't affect code other than post loop vectorization.

Yes, this feature has been broken for years. And as I mentioned in the [slides](http://cr.openjdk.java.net/~pli/slides/JDK-8183390.pdf), fixing this is our first trial and initial step of adding vector mask support in C2 vectorized loops.

> I haven't looked at tests closely but I don't think there are IR tests. Maybe you want to consider adding some as a follow up to this PR.

Your guess is right. Current test code in this PR only focuses on the **correctness** of vectorization. We have been using it internally for over 1 year and we have found several superword and AArch64 vector matching rule issues with its help. And internally based on this framework we have some test logic focuses on **vectorizability** as well as **correctness**.
But currently it is **not** based on IR tests because our test runner logic (it invokes each test method twice, first in the interpreter and then compiled by C2) is not compatible with the IR test driver. We are currently working on a better **vectorizability** check (IR tests are already under consideration). As we think the **correctness** check part is quite useful for adding new cases for C2 vectorization fixes and improvements, we propose to contribute this part first in this PR.

Thanks again for your review. I will address your comments, rebase and re-test this patch, and reply to you later.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6828

From dholmes at openjdk.java.net  Fri Mar 11 04:41:42 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 11 Mar 2022 04:41:42 GMT
Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6]
In-Reply-To:
References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com>
Message-ID:

On Wed, 9 Mar 2022 08:35:41 GMT, Johannes Bechberger wrote:

>> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class.
>> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>
>   current_thread_wx -> ThreadWX

The refactoring is a consequence of the initial TLS change as it pollutes the shared os "namespace" unnecessarily. So I'd want to see this neatly packaged where it belongs please.
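The idea of keeping per-OS-thread state in a thread-local variable rather than in a thread object, packaged behind a small scoped helper, might look roughly like the sketch below. All names here (`WXMode`, `current_wx_mode`, `ScopedWXMode`) and the assumed initial mode are hypothetical; this is not the HotSpot code, and the platform call that actually flips the page permissions is deliberately left out.

```cpp
// Hypothetical two-state mode, standing in for the W^X state of a thread.
enum class WXMode { Write, Exec };

// One independent value per OS thread; no Thread object is needed to read it.
// The initial value is an assumption made for this sketch.
thread_local WXMode current_wx_mode = WXMode::Write;

// RAII helper: switch the calling thread's mode for the lifetime of the guard
// and restore the previous mode afterwards.
class ScopedWXMode {
 public:
  explicit ScopedWXMode(WXMode new_mode) : _old_mode(current_wx_mode) {
    current_wx_mode = new_mode;
  }
  ~ScopedWXMode() { current_wx_mode = _old_mode; }

  ScopedWXMode(const ScopedWXMode&) = delete;
  ScopedWXMode& operator=(const ScopedWXMode&) = delete;

 private:
  WXMode _old_mode;
};
```

Keeping the variable and the guard together in one place (a class or a dedicated header) is one way to avoid adding loose helpers to a shared namespace, which is the packaging concern raised above.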
-------------

PR: https://git.openjdk.java.net/jdk/pull/7727

From iklam at openjdk.java.net  Fri Mar 11 04:56:23 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Fri, 11 Mar 2022 04:56:23 GMT
Subject: RFR: 8253495: CDS generates non-deterministic output [v4]
In-Reply-To:
References:
Message-ID: <10c63f_4lXQrZZumBkYQzjpKoaD_BICnzEIn27u9eyI=.9107ac65-6882-4b79-9f35-d6997a25fca1@github.com>

> This patch makes the result of "java -Xshare:dump" deterministic:
> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp
> - Fixed a problem in hashtable ordering in heapShared.cpp
> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it.
> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh
>
> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).
>
> Testing under way:
> - tier1~tier5
> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc).
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: zero GC heap filler arrays ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7748/files - new: https://git.openjdk.java.net/jdk/pull/7748/files/6974021f..be7673af Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=02-03 Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7748.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748 PR: https://git.openjdk.java.net/jdk/pull/7748 From david.holmes at oracle.com Fri Mar 11 05:57:20 2022 From: david.holmes at oracle.com (David Holmes) Date: Fri, 11 Mar 2022 15:57:20 +1000 Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: Message-ID: <2b212659-d920-171c-867a-7dbea8a1a69b@oracle.com> I can't find this comment in the PR so replying via email ... On 11/03/2022 9:24 am, Ioi Lam wrote: > On Wed, 9 Mar 2022 07:47:19 GMT, Thomas Stuefe wrote: > >>> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Fixed zero build >> >> src/hotspot/share/utilities/hashtable.hpp line 42: >> >>> 40: >>> 41: LP64_ONLY(unsigned int _gap;) >>> 42: >> >> For 64-bit, you now lose packing potential in the theoretical case the following payload does not have to be aligned to 64 bit. E.g. for T=char, where the whole entry would fit into 8 bytes. Probably does not matter as long as entries are allocated individually from C-heap which is a lot more wasteful anyway. >> >> For 32-bit, I think you may have the same problem if the payload starts with a uint64_t. Would that not be aligned to a 64-bit boundary too? Whether or not you build on 64-bit? 
>> >> I think setting the memory, or at least the first 8..16 bytes, of the entry to zero in BasicHashtable::new_entry could be more robust: >> >> (16 bytes in case the payload starts with a long double but that may be overthinking it :) >> >> >> template BasicHashtableEntry* BasicHashtable::new_entry(unsigned int hashValue) { >> char* p = :new (NEW_C_HEAP_ARRAY(char, this->entry_size(), F); >> ::memset(p, 0, MIN2(this->entry_size(), 16)); // needs reproducable >> BasicHashtableEntry* entry = ::new (p) BasicHashtableEntry(hashValue); >> return entry; >> } >> >> If you are worried about performance, this may also be controlled by a template parameter, and then you do it just for the system dictionary. > > Thanks for pointing this out. I ran more tests and found that on certain platforms, there are other structures that have problems with uninitialized gaps. I ended up changing `os::malloc()` to zero the buffer when running with -Xshare:dump. Hopefully one extra check of `if (DumpSharedSpaces)` doesn't matter too much for regular VM executions because `os::malloc()` already has a high overhead. This is raising red flags for me sorry. Every user of the JDK is now paying a penalty because of something only needed when dumping the shared archive. It might not be much but it is the old "death by a thousand cuts". Is there any way to tell the OS to pre-zero all memory provided to the current process, such that we could set that when dumping and not have to check on each allocation? And I have to wonder how easy it would be to re-introduce non-deterministic values in these data structures that are being dumped. Does malloc itself even guarantee to return the same set of addresses for the same sequence of requests in different executions of a program? 
Cheers, David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/7748 From kbarrett at openjdk.java.net Fri Mar 11 05:58:42 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 11 Mar 2022 05:58:42 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v4] In-Reply-To: <10c63f_4lXQrZZumBkYQzjpKoaD_BICnzEIn27u9eyI=.9107ac65-6882-4b79-9f35-d6997a25fca1@github.com> References: <10c63f_4lXQrZZumBkYQzjpKoaD_BICnzEIn27u9eyI=.9107ac65-6882-4b79-9f35-d6997a25fca1@github.com> Message-ID: On Fri, 11 Mar 2022 04:56:23 GMT, Ioi Lam wrote: >> This patch makes the result of "java -Xshare:dump" deterministic: >> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp >> - Fixed a problem in hashtable ordering in heapShared.cpp >> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. >> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh >> >> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). >> >> Testing under way: >> - tier1~tier5 >> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > zero GC heap filler arrays Changes requested by kbarrett (Reviewer). src/hotspot/share/gc/shared/collectedHeap.cpp line 449: > 447: allocator.initialize(start); > 448: DEBUG_ONLY(zap_filler_array(start, words, zap);) > 449: if (DumpSharedSpaces) { Probably shouldn't both zap and clear for dumping, to avoid wasting time. 
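The suggestion amounts to doing at most one fill per filler array. A rough sketch under stated assumptions: `fill_filler_payload`, `FillMode` and `kZapValue` are hypothetical names, plain 32-bit words stand in for heap words, and two booleans stand in for the `DumpSharedSpaces` flag and the debug-build check.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical zap pattern; any recognizable non-zero value works here.
constexpr std::uint32_t kZapValue = 0xDEAFBABEu;

enum class FillMode { Untouched, Zapped, Cleared };

// Performs at most one pass over the payload: clear it when dumping (so the
// bytes are deterministic), otherwise zap it in debug builds only, otherwise
// leave it alone.
FillMode fill_filler_payload(std::uint32_t* start, std::size_t words,
                             bool dumping, bool debug_build) {
  if (dumping) {
    for (std::size_t i = 0; i < words; i++) start[i] = 0;
    return FillMode::Cleared;
  }
  if (debug_build) {
    for (std::size_t i = 0; i < words; i++) start[i] = kZapValue;
    return FillMode::Zapped;
  }
  return FillMode::Untouched;
}
```

Checking the dump case first means a debug build that is dumping never writes the zap pattern only to overwrite it with zeros.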
------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From iklam at openjdk.java.net Fri Mar 11 06:07:27 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 11 Mar 2022 06:07:27 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v5] In-Reply-To: References: Message-ID: > This patch makes the result of "java -Xshare:dump" deterministic: > - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp > - Fixed a problem in hashtable ordering in heapShared.cpp > - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. > - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh > > Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). > > Testing under way: > - tier1~tier5 > - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @kimbarrett comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7748/files - new: https://git.openjdk.java.net/jdk/pull/7748/files/be7673af..584c6572 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=03-04 Stats: 3 lines in 1 file changed: 2 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7748.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748 PR: https://git.openjdk.java.net/jdk/pull/7748 From iklam at openjdk.java.net Fri Mar 11 06:07:28 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 11 Mar 2022 06:07:28 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v4] In-Reply-To: References: <10c63f_4lXQrZZumBkYQzjpKoaD_BICnzEIn27u9eyI=.9107ac65-6882-4b79-9f35-d6997a25fca1@github.com> Message-ID: On Fri, 11 Mar 2022 05:55:20 GMT, Kim Barrett wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> zero GC heap filler arrays > > src/hotspot/share/gc/shared/collectedHeap.cpp line 449: > >> 447: allocator.initialize(start); >> 448: DEBUG_ONLY(zap_filler_array(start, words, zap);) >> 449: if (DumpSharedSpaces) { > > Probably shouldn't both zap and clear for dumping, to avoid wasting time. Fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From dholmes at openjdk.java.net Fri Mar 11 06:35:31 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 11 Mar 2022 06:35:31 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v4] In-Reply-To: References: Message-ID: > Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. 
> > This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. David Holmes has updated the pull request incrementally with one additional commit since the last revision: Additional tweaks requested by @kbarrett ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7720/files - new: https://git.openjdk.java.net/jdk/pull/7720/files/c307466c..5e2913e8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7720&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7720&range=02-03 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7720.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7720/head:pull/7720 PR: https://git.openjdk.java.net/jdk/pull/7720 From dholmes at openjdk.java.net Fri Mar 11 06:35:32 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 11 Mar 2022 06:35:32 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v3] In-Reply-To: References: Message-ID: <28eF-aKfL0Fc6TTcOYOBuP79VPgp5azN3npZ8kkqICQ=.78d5471d-f529-459d-959b-483d82cc7a98@github.com> On Thu, 10 Mar 2022 16:02:01 GMT, Kim Barrett wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Move new sentence to 'rationale' paragraph. > > doc/hotspot-style.md line 669: > >> 667: initialization. 
Dynamic initialization and destruction of >> 668: namespace-scoped thread local variables also has the same ordering >> 669: problems as for ordinary namespace-scoped variables. So we avoid use of > > Should be s/namespace-scoped/non-local/ (two places). This was overlooked by JDK-8272691. (Mea culpa.) And yes, "non-local thread local" (or "non-local thread-local") looks a little weird. The standard (almost) doesn't use "thread-local" as a term, instead talking about variables with "thread storage duration". Fixed. And I changed "thread local" to "`thread_local`" for consistency and to avoid the weirdness. > doc/hotspot-style.md line 671: > >> 669: problems as for ordinary namespace-scoped variables. So we avoid use of >> 670: `thread_local` in general, limiting its use to only those cases where dynamic >> 671: initialization and destruction are essential. See > > Consider s/and/or/. Changed ------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From xgong at openjdk.java.net Fri Mar 11 06:37:00 2022 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Fri, 11 Mar 2022 06:37:00 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API Message-ID: The current vector `"NEG"` is implemented by subtracting the vector from zero in case the architecture does not support the negation instruction. And to fit the predicate feature for architectures that support it, the masked vector `"NEG"` is implemented with the pattern `"v.not(m).add(1, m)"`. Both can be optimized to a single negation instruction for ARM SVE. The same applies to the non-masked "NEG" for NEON. Besides, implementing the masked "NEG" with subtraction for architectures that support neither the negation instruction nor the predicate feature also saves several instructions compared with the current pattern. To optimize the VectorAPI negation, this patch moves the implementation from the Java side to HotSpot.
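As a scalar illustration of the identities mentioned above (not code from the patch), the sketch below models a 4-lane vector with `std::array` and checks that the masked "not, then add 1" pattern and the masked "subtract from zero" pattern agree lane by lane. The names `Lanes`, `Mask`, `neg_by_sub`, `neg_masked_not_add` and `neg_masked_sub` are made up for this example, and unsigned lanes are used so that wrap-around is well defined in C++.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<std::uint32_t, 4>;
using Mask  = std::array<bool, 4>;

// Non-masked negation as subtraction from zero: 0 - v in every lane.
Lanes neg_by_sub(const Lanes& v) {
  Lanes r{};
  for (std::size_t i = 0; i < v.size(); i++) r[i] = 0u - v[i];
  return r;
}

// Masked negation as "not, then add 1": (~v) + 1 in lanes where the mask is
// set; lanes where it is unset keep their input value.
Lanes neg_masked_not_add(const Lanes& v, const Mask& m) {
  Lanes r = v;
  for (std::size_t i = 0; i < v.size(); i++) {
    if (m[i]) r[i] = ~v[i] + 1u;
  }
  return r;
}

// Masked negation as subtraction from zero in the selected lanes only.
Lanes neg_masked_sub(const Lanes& v, const Mask& m) {
  Lanes r = v;
  for (std::size_t i = 0; i < v.size(); i++) {
    if (m[i]) r[i] = 0u - v[i];
  }
  return r;
}
```

Both masked variants compute the two's-complement negation, which is why a backend is free to pick whichever form maps to the fewest instructions.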
The compiler will generate different nodes according to the architecture:
 - Generate the (predicated) negation node if the architecture supports it, otherwise generate the `zero.sub(v)` pattern for the non-masked operation.
 - Generate `zero.sub(v, m)` for the masked operation if the architecture does not have the predicate feature, otherwise generate the original pattern `v.xor(-1, m).add(1, m)`.

So with this patch, the following transformations are applied:

For non-masked negation with NEON:

    movi  v16.4s, #0x0
    sub   v17.4s, v16.4s, v17.4s      ==>  neg  v17.4s, v17.4s

and with SVE:

    mov   z16.s, #0
    sub   z18.s, z16.s, z17.s         ==>  neg  z16.s, p7/m, z16.s

For masked negation with NEON:

    movi  v17.4s, #0x1
    mvn   v19.16b, v18.16b
    mov   v20.16b, v16.16b            ==>  neg  v18.4s, v17.4s
    bsl   v20.16b, v19.16b, v18.16b        bsl  v19.16b, v18.16b, v17.16b
    add   v19.4s, v20.4s, v17.4s
    mov   v18.16b, v16.16b
    bsl   v18.16b, v19.16b, v20.16b

and with SVE:

    mov   z16.s, #-1
    mov   z17.s, #1                   ==>  neg  z16.s, p0/m, z16.s
    eor   z18.s, p0/m, z18.s, z16.s
    add   z18.s, p0/m, z18.s, z17.s

Here are the performance gains for benchmarks (see [1][2]) on ARM and x86 machines (note that the non-masked negation benchmarks do not show any improvement on x86 since no instructions are changed):

NEON:

    Benchmark                  Gain
    Byte128Vector.NEG          1.029
    Byte128Vector.NEGMasked    1.757
    Short128Vector.NEG         1.041
    Short128Vector.NEGMasked   1.659
    Int128Vector.NEG           1.005
    Int128Vector.NEGMasked     1.513
    Long128Vector.NEG          1.003
    Long128Vector.NEGMasked    1.878

SVE with 512-bits:

    Benchmark                  Gain
    ByteMaxVector.NEG          1.10
    ByteMaxVector.NEGMasked    1.165
    ShortMaxVector.NEG         1.056
    ShortMaxVector.NEGMasked   1.195
    IntMaxVector.NEG           1.002
    IntMaxVector.NEGMasked     1.239
    LongMaxVector.NEG          1.031
    LongMaxVector.NEGMasked    1.191

x86 (non AVX-512):

    Benchmark                  Gain
    ByteMaxVector.NEGMasked    1.254
    ShortMaxVector.NEGMasked   1.359
    IntMaxVector.NEGMasked     1.431
    LongMaxVector.NEGMasked    1.989

[1]
https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1881 [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1896 ------------- Commit messages: - 8282162: [vector] Optimize vector negation API Changes: https://git.openjdk.java.net/jdk/pull/7782/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7782&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282162 Stats: 308 lines in 15 files changed: 267 ins; 25 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/7782.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7782/head:pull/7782 PR: https://git.openjdk.java.net/jdk/pull/7782 From iklam at openjdk.java.net Fri Mar 11 06:40:47 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 11 Mar 2022 06:40:47 GMT Subject: RFR: 8253495: CDS generates non-deterministic output In-Reply-To: <2b212659-d920-171c-867a-7dbea8a1a69b@oracle.com> References: <2b212659-d920-171c-867a-7dbea8a1a69b@oracle.com> Message-ID: On Fri, 11 Mar 2022 05:59:00 GMT, David Holmes wrote: > > I ended up changing `os::malloc()` to zero the buffer when running with -Xshare:dump. Hopefully one extra check of `if (DumpSharedSpaces)` doesn't matter too much for regular VM executions because `os::malloc()` already has a high overhead. > > This is raising red flags for me sorry. Every user of the JDK is now paying a penalty because of something only needed when dumping the shared archive. It might not be much but it is the old "death by a thousand cuts". Is there any way to tell the OS to pre-zero all memory provided to the current process, such that we could set that when dumping and not have to check on each allocation? I don't know how to tell the OS (or C library) to zero out the buffer returned by malloc. 
However, in the current code path, we already have a test for an uncommon condition when `os::malloc()` calls `MemTracker::record_malloc()` which calls `MallocTracker::record_malloc()` void* MallocTracker::record_malloc(void* malloc_base, size_t size, MEMFLAGS flags, const NativeCallStack& stack) { if (MemTracker::tracking_level() == NMT_detail) { MallocSiteTable::allocation_at(stack, size, &mst_marker, flags); } I can combine the tests for `MemTracker::tracking_level()` and `DumpSharedSpaces` into a single test and do more work only when the uncommon path is taken. This would require some refactoring of the MemTracker/MallocTracker code. I'd rather do that in a separate RFE. In fact, `MemTracker::_tracking_level` is tested twice in the current implementation. We can change it to do a single test in the most common case (NMT_summary) if we really want to cut down the number of tests. But honestly I don't think this makes any difference. > And I have to wonder how easy it would be to re-introduce non-deterministic values in these data structures that are being dumped. Does malloc itself even guarantee to return the same set of addresses for the same sequence of requests in different executions of a program? The malloc'ed objects are copied into the CDS archive at deterministic addresses. Any pointers inside such objects will be relocated. ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From david.holmes at oracle.com Fri Mar 11 06:48:07 2022 From: david.holmes at oracle.com (David Holmes) Date: Fri, 11 Mar 2022 16:48:07 +1000 Subject: RFR: 8253495: CDS generates non-deterministic output In-Reply-To: References: <2b212659-d920-171c-867a-7dbea8a1a69b@oracle.com> Message-ID: <3948464a-0a50-0155-3b42-28998b97801c@oracle.com> On 11/03/2022 4:40 pm, Ioi Lam wrote: > On Fri, 11 Mar 2022 05:59:00 GMT, David Holmes wrote: > >>> I ended up changing `os::malloc()` to zero the buffer when running with -Xshare:dump. 
Hopefully one extra check of `if (DumpSharedSpaces)` doesn't matter too much for regular VM executions because `os::malloc()` already has a high overhead. >> >> This is raising red flags for me sorry. Every user of the JDK is now paying a penalty because of something only needed when dumping the shared archive. It might not be much but it is the old "death by a thousand cuts". Is there any way to tell the OS to pre-zero all memory provided to the current process, such that we could set that when dumping and not have to check on each allocation? > > I don't know how to tell the OS (or C library) to zero out the buffer returned by malloc. However, in the current code path, we already have a test for an uncommon condition when `os::malloc()` calls `MemTracker::record_malloc()` which calls `MallocTracker::record_malloc()` > > > void* MallocTracker::record_malloc(void* malloc_base, size_t size, MEMFLAGS flags, > const NativeCallStack& stack) > { > if (MemTracker::tracking_level() == NMT_detail) { > MallocSiteTable::allocation_at(stack, size, &mst_marker, flags); > } > > > I can combine the tests for `MemTracker::tracking_level()` and `DumpSharedSpaces` into a single test and do more work only when the uncommon path is taken. This would require some refactoring of the MemTracker/MallocTracker code. I'd rather do that in a separate RFE. > > In fact, `MemTracker::_tracking_level` is tested twice in the current implementation. We can change it to do a single test in the most common case (NMT_summary) if we really want to cut down the number of tests. But honestly I don't think this makes any difference. > >> And I have to wonder how easy it would be to re-introduce non-deterministic values in these data structures that are being dumped. Does malloc itself even guarantee to return the same set of addresses for the same sequence of requests in different executions of a program? > > The malloc'ed objects are copied into the CDS archive at deterministic addresses. 
Any pointers inside such objects will be relocated. Okay. I won't object further but I really don't like it - c'est la vie! I'll let others review the actual code changes in detail. Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/7748 From stuefe at openjdk.java.net Fri Mar 11 06:50:43 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 11 Mar 2022 06:50:43 GMT Subject: RFR: 8253495: CDS generates non-deterministic output In-Reply-To: <2b212659-d920-171c-867a-7dbea8a1a69b@oracle.com> References: <2b212659-d920-171c-867a-7dbea8a1a69b@oracle.com> Message-ID: <3joACrACnlZ1rQDb6XV2MUdfdtPCq5ZnhXzWT4fi_MA=.a02524ef-9488-4e71-bfa1-58acf25ff080@github.com> On Fri, 11 Mar 2022 05:59:00 GMT, David Holmes wrote: > > Thanks for pointing this out. I ran more tests and found that on certain platforms, there are other structures that have problems with uninitialized gaps. I ended up changing `os::malloc()` to zero the buffer when running with -Xshare:dump. Hopefully one extra check of `if (DumpSharedSpaces)` doesn't matter too much for regular VM executions because `os::malloc()` already has a high overhead. > > This is raising red flags for me sorry. Every user of the JDK is now paying a penalty because of something only needed when dumping the shared archive. It might not be much but it is the old "death by a thousand cuts". Well, he does it for `DumpSharedSpaces` only. Are you really worried about that one load+conditional jump? @iklam: I dislike the fact that CDS terminology is now in os::malloc. I would give this another flag, "ZapMalloc" or similar, and maybe merge it with the `DEBUG_ONLY(memset(..uninitBlockPad))` above. > Is there any way to tell the OS to pre-zero all memory provided to the current process, such that we could set that when dumping and not have to check on each allocation? No. 
Malloced memory is not provided by the OS but by glibc, and it may be polluted with whatever the allocator did with it (e.g. pointers to chain free blocks), or by prior user payload. Glibc has a specific tunable, `glibc.malloc.perturb`, that initializes malloc memory to a given value, but you cannot directly set the value. It is very handy to check for uninitialized memory. Always wanted to add some tests that used it, but never got around to it. But obviously it's nothing you could do in production. > > And I have to wonder how easy it would be to re-introduce non-deterministic values in these data structures that are being dumped. Does malloc itself even guarantee to return the same set of addresses for the same sequence of requests in different executions of a program? No, of course not. Non-Java threads may still run concurrently, no? System libraries do malloc too. But are pointer *values* even the problem? ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From iklam at openjdk.java.net Fri Mar 11 06:55:23 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 11 Mar 2022 06:55:23 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v6] In-Reply-To: References: Message-ID: > This patch makes the result of "java -Xshare:dump" deterministic: > - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp > - Fixed a problem in hashtable ordering in heapShared.cpp > - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. > - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh > > Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). > > Testing under way: > - tier1~tier5 > - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc).
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Added helper function CollectedHeap::zap_filler_array_with ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7748/files - new: https://git.openjdk.java.net/jdk/pull/7748/files/584c6572..47e0238a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=04-05 Stats: 10 lines in 2 files changed: 6 ins; 2 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7748.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748 PR: https://git.openjdk.java.net/jdk/pull/7748 From stuefe at openjdk.java.net Fri Mar 11 06:55:24 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 11 Mar 2022 06:55:24 GMT Subject: RFR: 8253495: CDS generates non-deterministic output In-Reply-To: <3948464a-0a50-0155-3b42-28998b97801c@oracle.com> References: <3948464a-0a50-0155-3b42-28998b97801c@oracle.com> Message-ID: <6vk6z5DuSwcSGtsGueROK1tL108ymh60lIi0SXL_Eh4=.ea5accf2-836f-49a0-b90d-1dbc77b827e3@github.com> On Fri, 11 Mar 2022 06:50:00 GMT, David Holmes wrote: > I can combine the tests for `MemTracker::tracking_level()` and `DumpSharedSpaces` into a single test and do more work only when the uncommon path is taken. This would require some refactoring of the MemTracker/MallocTracker code. I'd rather do that in a separate RFE. > > In fact, `MemTracker::_tracking_level` is tested twice in the current implementation. We can change it to do a single test in the most common case (NMT_summary) if we really want to cut down the number of tests. But honestly I don't think this makes any difference. > Before going down that road, I would really like to see some measurements, whether this really matters. malloc is not blindingly fast. The malloc code in glibc does test a lot of conditions too. 
If you need fast allocation (or well packed, for that matter), you need another allocator. Thats why we have Arenas. ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From kbarrett at openjdk.java.net Fri Mar 11 07:00:42 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 11 Mar 2022 07:00:42 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v4] In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 06:35:31 GMT, David Holmes wrote: >> Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. >> >> This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Additional tweaks requested by @kbarrett Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7720 From david.holmes at oracle.com Fri Mar 11 07:01:06 2022 From: david.holmes at oracle.com (David Holmes) Date: Fri, 11 Mar 2022 17:01:06 +1000 Subject: RFR: 8253495: CDS generates non-deterministic output In-Reply-To: <3joACrACnlZ1rQDb6XV2MUdfdtPCq5ZnhXzWT4fi_MA=.a02524ef-9488-4e71-bfa1-58acf25ff080@github.com> References: <2b212659-d920-171c-867a-7dbea8a1a69b@oracle.com> <3joACrACnlZ1rQDb6XV2MUdfdtPCq5ZnhXzWT4fi_MA=.a02524ef-9488-4e71-bfa1-58acf25ff080@github.com> Message-ID: On 11/03/2022 4:50 pm, Thomas Stuefe wrote: > On Fri, 11 Mar 2022 05:59:00 GMT, David Holmes wrote: > >>> Thanks for pointing this out. I ran more tests and found that on certain platforms, there are other structures that have problems with uninitialized gaps. I ended up changing `os::malloc()` to zero the buffer when running with -Xshare:dump. Hopefully one extra check of `if (DumpSharedSpaces)` doesn't matter too much for regular VM executions because `os::malloc()` already has a high overhead. >> >> This is raising red flags for me sorry. Every user of the JDK is now paying a penalty because of something only needed when dumping the shared archive. It might not be much but it is the old "death by a thousand cuts". > > Well, he does it for `DumpSharedSpaces` only. Are you really worried about that one load+conditional jump? As I said (and I'm not the only one who says this :) ) "death by a thousand cuts". > @iklam: I dislike the fact that CDS terminology is now in os::malloc. I would give this another flag, "ZapMalloc" or similar, and maybe merge it with the `DEBUG_ONLY(memset(..uninitBlockPad))` above. > >> Is there any way to tell the OS to pre-zero all memory provided to the current process, such that we could set that when dumping and not have to check on each allocation? > > No. 
Malloced memory is not provided by the OS but by the glibc, and it may be polluted with whatever the allocator did with it (eg pointers to chain free blocks), or by prior user payload. > > Glibc has a specific tunable, `glibc.malloc.perturb`, that initializes malloc memory to a given value, but you cannot directly set the value. It is very handy to check for uninitialized memory. Always wanted to add some tests that used it, but never got around. But obviously its nothing you could do in production. > >> >> And I have to wonder how easy it would be to re-introduce non-deterministic values in these data structures that are being dumped. Does malloc itself even guarantee to return the same set of addresses for the same sequence of requests in different executions of a program? > > No, of course not. Non-Java threads may still run concurrently, no? System libraries do malloc too. But are pointer *values* even the problem? I assumed that pointer values could be the problem, even if not presently, but apparently all the pointers get rewritten to use offsets from the known base of the shared archive - so not an issue. Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/7748 From kbarrett at openjdk.java.net Fri Mar 11 07:06:40 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 11 Mar 2022 07:06:40 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v6] In-Reply-To: References: Message-ID: <5ODb6bJ8c7yanEh6JuuOTxH5Wh7L_CAzi6n17VtOINM=.db57ef37-6138-4c6a-84f4-f2371244248e@github.com> On Fri, 11 Mar 2022 06:55:23 GMT, Ioi Lam wrote: >> This patch makes the result of "java -Xshare:dump" deterministic: >> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp >> - Fixed a problem in hashtable ordering in heapShared.cpp >> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. 
>> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh >> >> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). >> >> Testing under way: >> - tier1~tier5 >> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Added helper function CollectedHeap::zap_filler_array_with GC changes look good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7748 From duke at openjdk.java.net Fri Mar 11 07:11:41 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Fri, 11 Mar 2022 07:11:41 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: <7g4Pfmc7A2hkvw5kQe7Wgj0EzpwH_OdEHPc-O-IxEVM=.60b394b4-c00b-4c66-b262-6e7eec43289b@github.com> On Wed, 9 Mar 2022 08:35:41 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > current_thread_wx -> ThreadWX I'm also personally in favour of packaging it in the OS specific header. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From stuefe at openjdk.java.net Fri Mar 11 07:16:41 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 11 Mar 2022 07:16:41 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: <1E_zgqBQtByT4cXyk_dlbXGRVAQpCI6jlKXFIYovvVU=.3597bbf1-2ae7-4d41-9bef-79f77c90e8d3@github.com> References: <1E_zgqBQtByT4cXyk_dlbXGRVAQpCI6jlKXFIYovvVU=.3597bbf1-2ae7-4d41-9bef-79f77c90e8d3@github.com> Message-ID: On Thu, 10 Mar 2022 19:34:29 GMT, Ioi Lam wrote: >> src/hotspot/share/prims/jvm.cpp line 2887: >> >>> 2885: return; >>> 2886: } >>> 2887: #endif >> >> Should we do this for jni_AttachCurrentThread too? > > This hasn't been necessary for me because jni_AttachCurrentThread is not called during "java -Xshare:dump", which executes under a very strict condition and doesn't normally allow arbitrary JNI libraries to be loaded. Is reproducibility also a topic for users calling -Xdump with custom JNI coding? Or maybe having the VM instrumented somehow? Since it seems such an easy fix, I would prevent attaching too. At least the user would get a clear error message. ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From stuefe at openjdk.java.net Fri Mar 11 07:21:49 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 11 Mar 2022 07:21:49 GMT Subject: RFR: 8253495: CDS generates non-deterministic output In-Reply-To: References: Message-ID: <_qq1q2olwT_11EF476cn5DsQHdl-b8P-L2nJkGn3Aqk=.dfb932f7-a43e-4730-b4a1-7356efd7defa@github.com> On Fri, 11 Mar 2022 07:03:00 GMT, David Holmes wrote: > > Well, he does it for `DumpSharedSpaces` only. Are you really worried about that one load+conditional jump? > > As I said (and I'm not the only one who says this :) ) "death by a thousand cuts". Thank you, that is a good expression. I just wanted to clarify what you meant. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From pli at openjdk.java.net Fri Mar 11 07:48:06 2022 From: pli at openjdk.java.net (Pengfei Li) Date: Fri, 11 Mar 2022 07:48:06 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v5] In-Reply-To: References: Message-ID: > ### Background > > Post loop vectorization is a C2 compiler optimization in an experimental > VM feature called PostLoopMultiversioning. It transforms the range-check > eliminated post loop to a 1-iteration vectorized loop with vector mask. > This optimization was contributed by Intel in 2016 to support x86 AVX512 > masked vector instructions. However, it was disabled soon after an issue > was found. Due to insufficient maintenance in these years, multiple bugs > have been accumulated inside. But we (Arm) still think this is a useful > framework for vector mask support in C2 auto-vectorized loops, for both > x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable > post loop vectorization. > > ### Changes in this patch > > This patch reworks post loop vectorization. The most significant change > is removing vector mask support in C2 x86 backend and re-implementing > it in the mid-end. With this, we can re-enable post loop vectorization > for platforms other than x86. > > Previous implementation hard-codes x86 k1 register as a reserved AVX512 > opmask register and defines two routines (setvectmask/restorevectmask) > to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes > AVX512 instructions as unmasked by default, generated vector masks are > no longer used in AVX512 vector instructions. To fix incorrect codegen > and add vector mask support for more platforms, we turn to add a vector > mask input to C2 mid-end IRs. 
Specifically, we use a VectorMaskGenNode > to generate a mask and replace all Load/Store nodes in the post loop > into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This > IR form is exactly the same to those which are used in VectorAPI mask > support. For now, we only add mask inputs for Load/Store nodes because > we don't have reduction operations supported in post loop vectorization. > After this change, the x86 k1 register is no longer reserved and can be > allocated when PostLoopMultiversioning is enabled. > > Besides this change, we have fixed a compiler crash and five incorrect > result issues with post loop vectorization. > > **I) C2 crashes with segmentation fault in strip-mined loops** > > Previous implementation was done before C2 loop strip-mining was merged > into JDK master so it didn't take strip-mined loops into consideration. > In C2's strip mined loops, post loop is not the sibling of the main loop > in ideal loop tree. Instead, it's the sibling of the main loop's parent. > This patch fixed a SIGSEGV issue caused by NULL pointer when locating > post loop from strip-mined main loop. > > **II) Incorrect result issues with post loop vectorization** > > We have also fixed five incorrect vectorization issues. Some of them are > hidden deep and can only be reproduced with corner cases. These issues > have a common cause that it assumes the post loop can be vectorized if > the vectorization in corresponding main loop is successful. But in many > cases this assumption is wrong. Below are details. > > - **[Issue-1] Incorrect vectorization for partial vectorizable loops** > > This issue can be reproduced by below loop where only some operations in > the loop body are vectorizable. > > for (int i = 0; i < 10000; i++) { > res[i] = a[i] * b[i]; > k = 3 * k + 1; > } > > In the main loop, superword can work well if parts of the operations in > loop body are not vectorizable since those parts can be unrolled only. 
> But for post loops, we don't create vectors through combining scalar IRs > generated from loop unrolling. Instead, we are doing scalars to vectors > replacement for all operations in the loop body. Hence, all operations > should be either vectorized together or not vectorized at all. To fix > this kind of cases, we add an extra field "_slp_vector_pack_count" in > CountedLoopNode to record the eventual count of vector packs in the main > loop. This value is then passed to post loop and compared with post loop > pack count. Vectorization will be bailed out in post loop if it creates > more vector packs than in the main loop. > > - **[Issue-2] Incorrect result in loops with growing-down vectors** > > This issue appears with growing-down vectors, that is, vectors that grow > to smaller memory address as the loop iterates. It can be reproduced by > below counting-up loop with negative scale value in array index. > > for (int i = 0; i < 10000; i++) { > a[MAX - i] = b[MAX - i]; > } > > Cause of this issue is that for a growing-down vector, generated vector > mask value has reversed vector-lane order so it masks incorrect vector > lanes. Note that if negative scale value appears in counting-down loops, > the vector will be growing up. With this rule, we fix the issue by only > allowing positive array index scales in counting-up loops and negative > array index scales in counting-down loops. This check is done with the > help of SWPointer by comparing scale values in each memory access in the > loop with loop stride value. > > - **[Issue-3] Incorrect result in manually unrolled loops** > > This issue can be reproduced by below manually unrolled loop. > > for (int i = 0; i < 10000; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] * b[i + 1]; > } > > In this loop, operations in the 2nd statement duplicate those in the 1st > statement with a small memory address offset. 
Vectorization in the main > loop works well in this case because C2 does further unrolling and pack > combination. But we cannot vectorize the post loop through replacement > from scalars to vectors because it creates duplicated vector operations. > To fix this, we restrict post loop vectorization to loops with stride > values of 1 or -1. > > - **[Issue-4] Incorrect result in loops with mixed vector element sizes** > > This issue is found after we enable post loop vectorization for AArch64. > It's reproducible by multiple array operations with different element > sizes inside a loop. On x86, there is no issue because the values of x86 > AVX512 opmasks only depend on which vector lanes are active. But AArch64 > is different - the values of SVE predicates also depend on lane size of > the vector. Hence, on AArch64 SVE, if a loop has mixed vector element > sizes, we should use different vector masks. For now, we just support > loops with only one vector element size, i.e., "int + float" vectors in > a single loop is ok but "int + double" vectors in a single loop is not > vectorizable. This fix also enables subword vectors support to make all > primitive type array operations vectorizable. > > - **[Issue-5] Incorrect result in loops with potential data dependence** > > This issue can be reproduced by below corner case on AArch64 only. > > for (int i = 0; i < 10000; i++) { > a[i] = x; > a[i + OFFSET] = y; > } > > In this case, two stores in the loop have data dependence if the OFFSET > value is smaller than the vector length. So we cannot do vectorization > through replacing scalars to vectors. But the main loop vectorization > in this case is successful on AArch64 because AArch64 has partial vector > load/store support. It splits vector fill with different values in lanes > to several smaller-sized fills. In this patch, we add additional data > dependence check for this kind of cases. The check is also done with the > help of SWPointer class. 
In this check, we require that every two memory > accesses (with at least one store) of the same element type (or subword > size) in the loop have the same array index expression. > > ### Tests > > So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with > experimental VM option "PostLoopMultiversioning" turned on. We found no > issues in any of the tests. We noticed that the existing cases are not enough > because some of the above issues are not spotted by them. We would like to > add some new cases but we found that existing vectorization tests are a bit > cumbersome - golden results must be pre-calculated and hard-coded in the > test code for correctness verification. Thus, in this patch, we propose > a new vectorization testing framework. > > Our new framework brings a simpler way to add new cases. For a new test > case, we only need to create a new method annotated with "@Test". The > test runner will invoke each annotated method twice automatically. The first > time it runs in the interpreter and the second time it is force-compiled by > C2. Then the two return results are compared. So in this framework each > test method should return a primitive value or an array of primitives. > In this way, no extra verification code for vectorization correctness is > required. This test runner is still jtreg-based and takes advantage of > the jtreg WhiteBox API, which enables test methods to run at specific > compilation levels. Each test class inside is also jtreg-based. It just > needs to inherit from the test runner class and run with two additional > options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". > > ### Summary & Future work > > In this patch, we reworked post loop vectorization. We made it platform > independent and fixed several issues inside. We also implemented a new > vectorization testing framework with many test cases inside. Meanwhile, > we did some code cleanups.
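
The run-twice-and-compare idea described under "Tests" can be sketched roughly as below. This is only an illustration, not the framework from the patch: `RunTwiceSketch`, its nested `@Test` annotation and the `Cases` class are hypothetical names, and both invocations here simply run in whatever mode the JVM chooses. The runner described above is said to use the WhiteBox API to force one interpreted and one C2-compiled execution, which this sketch leaves out.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Objects;

public class RunTwiceSketch {

    // Hypothetical marker annotation; stands in for the framework's "@Test".
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Test {}

    // Example cases: each test method returns a primitive or a primitive array.
    public static class Cases {
        static final int SIZE = 1000;
        final int[] a = new int[SIZE];
        final int[] b = new int[SIZE];

        public Cases() {
            for (int i = 0; i < SIZE; i++) {
                a[i] = i;
                b[i] = 3 * i + 1;
            }
        }

        @Test
        public int[] multiply() {
            int[] res = new int[SIZE];
            for (int i = 0; i < SIZE; i++) {
                res[i] = a[i] * b[i];
            }
            return res;
        }

        @Test
        public int sum() {
            int s = 0;
            for (int i = 0; i < SIZE; i++) {
                s += a[i];
            }
            return s;
        }
    }

    // Invokes every @Test method twice and compares the two results.
    // Returns the number of methods checked and throws on a mismatch.
    public static int runAll(Class<?> testClass) throws Exception {
        Object instance = testClass.getDeclaredConstructor().newInstance();
        int checked = 0;
        for (Method m : testClass.getDeclaredMethods()) {
            if (!m.isAnnotationPresent(Test.class)) {
                continue;
            }
            Object first = m.invoke(instance);   // described runner: interpreter
            Object second = m.invoke(instance);  // described runner: forced C2
            // deepEquals compares primitive arrays element by element.
            if (!Objects.deepEquals(first, second)) {
                throw new AssertionError("Result mismatch in " + m.getName());
            }
            checked++;
        }
        return checked;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("checked " + runAll(Cases.class) + " test methods");
    }
}
```

Because the comparison is between two runs of the same method, a test case needs no hard-coded golden values; it only has to be deterministic and return something comparable.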
> > This patch only touches C2 code guarded with PostLoopMultiversioning, > except a few data structure changes. So, there's no behavior change when > experimental VM option PostLoopMultiversioning is off. Also, to reduce > risks, we still propose to keep post loop vectorization experimental for > now. But if it receives positive feedback, we would like to change it to > non-experimental in the future. Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: - Add assertion of PostLoopMultiversioning - Merge branch 'master' into postloop - Merge branch 'master' into postloop Change-Id: I503edb75f0f626569c776416bfef09651935979c - Update copyright year and rename a function Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb - Merge branch 'master' into postloop Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4 - Fix issues in newly added test framework Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1 - Merge branch 'master' into postloop Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6 - 8183390: Fix and re-enable post loop vectorization ------------- Changes: https://git.openjdk.java.net/jdk/pull/6828/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6828&range=04 Stats: 4832 lines in 39 files changed: 4509 ins; 284 del; 39 mod Patch: https://git.openjdk.java.net/jdk/pull/6828.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6828/head:pull/6828 PR: https://git.openjdk.java.net/jdk/pull/6828 From duke at openjdk.java.net Fri Mar 11 07:52:16 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Fri, 11 Mar 2022 07:52:16 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> > The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. > This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode.
This pull request therefore removes the dependency, making SafeFetch usable in more contexts. Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Remove two unnecessary lines ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7727/files - new: https://git.openjdk.java.net/jdk/pull/7727/files/f206e6d2..cb1255f5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7727&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7727&range=05-06 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7727.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7727/head:pull/7727 PR: https://git.openjdk.java.net/jdk/pull/7727 From amenkov at openjdk.java.net Fri Mar 11 08:30:36 2022 From: amenkov at openjdk.java.net (Alex Menkov) Date: Fri, 11 Mar 2022 08:30:36 GMT Subject: RFR: 8282241: Invalid generic signature for redefined classes [v2] In-Reply-To: References: Message-ID: <6DaYBw9QGTWnMoEkibkc3QQJSG652mvhs9tLJz-tXH0=.232fe2fe-8cbe-484e-84a9-5d686712c564@github.com> On Fri, 4 Mar 2022 17:12:51 GMT, Alex Menkov wrote: >> JDK-8238048 (fixed in jdk15) moved major_version, minor_version, generic_signature_index and source_file_name_index from InstanceKlass to ConstantPool. >> We still have some incorrect code in CP merge during class redefinition. >> >> rewrite_cp_refs(scratch_class) updates generic_signature_index and source_file_name_index in the scratch_cp, so we need to copy the attributes (merge_cp->copy_fields(scratch_cp())) after rewrite_cp_refs. >> >> In redefine_single_class we don't need to copy source_file_name_index because it's a CP property and we swap CPs. So this copying actually sets the value from old class. 
>> >> tested: >> - test/jdk/java/lang/instrument >> - test/hotspot/jtreg/serviceability/jvmti/RedefineClasses >> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses >> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/RetransformClasses > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > Reworked the test Ping. Need 2nd reviewer. ------------- PR: https://git.openjdk.java.net/jdk/pull/7676 From iklam at openjdk.java.net Fri Mar 11 08:31:45 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 11 Mar 2022 08:31:45 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: <1E_zgqBQtByT4cXyk_dlbXGRVAQpCI6jlKXFIYovvVU=.3597bbf1-2ae7-4d41-9bef-79f77c90e8d3@github.com> Message-ID: On Fri, 11 Mar 2022 07:13:35 GMT, Thomas Stuefe wrote: > Is reproducibility also a topic for users calling -Xdump with custom JNI coding? Or maybe having the VM instrumented somehow? Since it seems such an easy fix, I would prevent attaching too. At least the user would get a clear error message. It's impossible to execute arbitrary Java code when running "java -Xshare:dump", so this means there's no way to load a JNI library when creating a *static* CDS archive. The loading of JVMTI agents is also not supported. So this is not a case we need to handle. During *dynamic* CDS dumps, arbitrary Java code can execute, but we don't have a requirement for the *dynamic* CDS archive to be deterministic (at least not for now). 
------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From jzhu at openjdk.java.net Fri Mar 11 08:48:40 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Fri, 11 Mar 2022 08:48:40 GMT Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap In-Reply-To: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> Message-ID: On Wed, 9 Mar 2022 12:33:49 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > As discussed at https://github.com/openjdk/jdk/pull/7721 , I changed the code in VectorAPI. > Please help review. As I described at the beginning of both PRs, this issue can be reproduced easily by using the gather/scatter VectorAPI for Integer and Long in the same application. Check a simple reproducer at http://cr.openjdk.java.net/~jzhu/8282874/CheckAssembly.java. We can verify its performance via execution time. This change fixes the non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. But even with this fix, the performance still cannot reach the optimum. As mentioned before in this PR, besides this fix, there exists another issue that will also affect the performance because of delay vector inlining. The 2nd issue can be worked around by disabling delay vector inlining manually. I made an initial triage for the 2nd issue. I had thought that forcing GVN via IncrementalInlineForceCleanup would help solve it. But ConstraintCastNode's StrongDependency made its own identity() have no effect. That's why I propose to add a benchmark after we figure out the solution to the 2nd issue.
I agree with Paul that we could follow up with more tests afterward. ------------- PR: https://git.openjdk.java.net/jdk/pull/7757 From jiefu at openjdk.java.net Fri Mar 11 09:05:44 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 11 Mar 2022 09:05:44 GMT Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap In-Reply-To: References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> Message-ID: On Fri, 11 Mar 2022 08:45:12 GMT, Joshua Zhu wrote: > Check a simple reproducer at http://cr.openjdk.java.net/~jzhu/8282874/CheckAssembly.java. We can verify its performance via execution time. Thanks @JoshuaZhuwj for your clarification. So we can create a jtreg test based on CheckAssembly.java, right? If you are too busy to do so, we'd like to give it a try. What do you think? ------------- PR: https://git.openjdk.java.net/jdk/pull/7757 From jzhu at openjdk.java.net Fri Mar 11 09:14:42 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Fri, 11 Mar 2022 09:14:42 GMT Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap In-Reply-To: References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> Message-ID: On Fri, 11 Mar 2022 09:02:18 GMT, Jie Fu wrote: > > Check a simple reproducer at http://cr.openjdk.java.net/~jzhu/8282874/CheckAssembly.java. We can verify its performance via execution time. > > Thanks @JoshuaZhuwj for your clarification. So we can create a jtreg test based on CheckAssembly.java, right? > > If you are too busy to do so, we'd like to give it a try. What do you think? Of course. Thanks @DamonFool. I think it would be better to sync with Paul on how to design tests for "polluted" profiles of vectors mentioned before instead of a single jtreg test for this case.
-------------

PR: https://git.openjdk.java.net/jdk/pull/7757

From tschatzl at openjdk.java.net Fri Mar 11 09:33:52 2022
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Fri, 11 Mar 2022 09:33:52 GMT
Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5]
In-Reply-To:
References:
Message-ID:

On Mon, 28 Feb 2022 16:22:25 GMT, Christian Hagedorn wrote:

>> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method:
>>
>>     Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k
>>     Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>     V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64
>>     V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec
>>     V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899
>>     V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df
>>     V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69
>>     V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d
>>     V [libjvm.so+0x12091c9] JavaThread::run()+0x167
>>     V [libjvm.so+0x1206ada] Thread::call_run()+0x180
>>     V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f
>>
>> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method.
>>
>> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)):
>>
>>     Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k
>>     Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>     V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607)
>>     V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250)
>>     V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291)
>>     V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966)
>>     V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59)
>>     V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297)
>>     V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280)
>>     V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358)
>>     V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705)
>>
>> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4.
>>
>> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf
>> I added references to the corresponding sections throughout the code. However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec.
>>
>> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability.
>>
>> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing.
>>
>> **Testing:**
>> Apart from manual testing, I've added two kinds of tests:
>> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers.
>> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename.
>>
>> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation.
>>
>> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number.
>>
>> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user).
>>
>> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`!
>>
>> Thanks,
>> Christian
>
> Christian Hagedorn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 54 commits:
>
> - Updating some comments
> - Cleanup loading dwarf file and add summary
> - Review comments of first pass by Thomas except dwarf file loading
> - Merge branch 'master' into JDK-8242181
> - Make dwarf tag NOT_PRODUCT
> - Change log_* to log_develop_* and log_warning to log_develop_info
> - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java
>
>   Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com>
> - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java
>
>   Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com>
> - Better formatting of trace output
> - some code move and more cleanups
> - ... and 44 more: https://git.openjdk.java.net/jdk/compare/efd3967b...5bea4841

Seems good to me.

-------------

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7126

From aph at openjdk.java.net Fri Mar 11 09:36:45 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Fri, 11 Mar 2022 09:36:45 GMT
Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6]
In-Reply-To:
References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com>
Message-ID:

On Thu, 10 Mar 2022 14:09:24 GMT, Anton Kozlov wrote:

> > Depending on what the pthread library call does, and if it's a real function call into a library, it would be more expensive than that.
>
> Yes, unfortunately we need something like this.

But we don't need to speculate. If thread-local variables are cheap on MacOS, and there is no reason why they should be expensive, then we can stop worrying and just use a thread-local variable for WX state. We can measure how long it takes, and we only have to care about one platform, MacOS/AArch64.

We could also redefine SafeFetch on MacOS/AArch64 to not need WX. We could do this by statically generating SafeFetch on that platform, and it wouldn't be in the JIT region at all. Why not just do that?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7727

From jiefu at openjdk.java.net Fri Mar 11 09:46:37 2022
From: jiefu at openjdk.java.net (Jie Fu)
Date: Fri, 11 Mar 2022 09:46:37 GMT
Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap
In-Reply-To:
References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com>
Message-ID:

On Fri, 11 Mar 2022 09:11:20 GMT, Joshua Zhu wrote:

> I think it would be better to sync with Paul on how to design tests for "polluted" profiles of vectors mentioned before instead of a single jtreg test for this case.

Okay.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7757

From duke at openjdk.java.net Fri Mar 11 09:53:40 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Fri, 11 Mar 2022 09:53:40 GMT
Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6]
In-Reply-To:
References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com>
Message-ID:

On Fri, 11 Mar 2022 09:33:40 GMT, Andrew Haley wrote:

> But we don't need to speculate. If thread-local variables are cheap on MacOS, and there is no reason why they should be expensive, then we can stop worrying and just use a thread-local variable for WX state. We can measure how long it takes, and we only have to care about one platform, MacOS/AArch64.

According to https://forums.swift.org/t/concurrencys-use-of-thread-local-variables/48654: "these accesses are just a move from a system register plus a load/store at a constant offset."
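
For illustration only, a rough sketch of what keeping the WX state in a thread-local variable could look like. `WXState`, `tl_wx_state`, `set_wx_state` and `WXScope` are hypothetical names and not HotSpot's declarations; the actual permission switch is left as a comment. The point is only that the state is reachable without a `Thread::current()` lookup.

```cpp
// Hypothetical names throughout; this is not HotSpot code.
enum class WXState { Write, Exec };

// One independent copy per thread, reachable without Thread::current().
thread_local WXState tl_wx_state = WXState::Exec;

// Switches the calling thread's recorded state and returns the previous one.
inline WXState set_wx_state(WXState next) {
  WXState prev = tl_wx_state;
  tl_wx_state = next;
  // A real implementation would also flip the thread's W^X permission here.
  return prev;
}

// RAII helper: enter a state for the lifetime of the scope, then restore.
class WXScope {
  WXState _saved;
 public:
  explicit WXScope(WXState next) : _saved(set_wx_state(next)) {}
  ~WXScope() { set_wx_state(_saved); }
  WXScope(const WXScope&) = delete;
  WXScope& operator=(const WXScope&) = delete;
};
```

Timing a tight loop of `set_wx_state` calls on MacOS/AArch64 would be one way to do the measurement suggested earlier in this thread.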
-------------

PR: https://git.openjdk.java.net/jdk/pull/7727

From pli at openjdk.java.net Fri Mar 11 10:07:46 2022
From: pli at openjdk.java.net (Pengfei Li)
Date: Fri, 11 Mar 2022 10:07:46 GMT
Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v4]
In-Reply-To:
References:
Message-ID:

On Thu, 10 Mar 2022 09:35:43 GMT, Roland Westrelin wrote:

>> Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
>>
>> - Merge branch 'master' into postloop
>>
>>   Change-Id: I503edb75f0f626569c776416bfef09651935979c
>> - Update copyright year and rename a function
>>
>>   Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb
>> - Merge branch 'master' into postloop
>>
>>   Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4
>> - Fix issues in newly added test framework
>>
>>   Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1
>> - Merge branch 'master' into postloop
>>
>>   Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6
>> - 8183390: Fix and re-enable post loop vectorization
>>
>>   ** Background
>>
>>   Post loop vectorization is a C2 compiler optimization in an experimental
>>   VM feature called PostLoopMultiversioning. It transforms the range-check
>>   eliminated post loop to a 1-iteration vectorized loop with vector mask.
>>   This optimization was contributed by Intel in 2016 to support x86 AVX512
>>   masked vector instructions. However, it was disabled soon after an issue
>>   was found. Due to insufficient maintenance in these years, multiple bugs
>>   have been accumulated inside. But we (Arm) still think this is a useful
>>   framework for vector mask support in C2 auto-vectorized loops, for both
>>   x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable
>>   post loop vectorization.
>>
>>   ** Changes in this patch
>>
>>   This patch reworks post loop vectorization. The most significant change
>>   is removing vector mask support in C2 x86 backend and re-implementing
>>   it in the mid-end. With this, we can re-enable post loop vectorization
>>   for platforms other than x86.
>>
>>   Previous implementation hard-codes x86 k1 register as a reserved AVX512
>>   opmask register and defines two routines (setvectmask/restorevectmask)
>>   to set and restore the value of k1. But after JDK-8211251 which encodes
>>   AVX512 instructions as unmasked by default, generated vector masks are
>>   no longer used in AVX512 vector instructions. To fix incorrect codegen
>>   and add vector mask support for more platforms, we turn to add a vector
>>   mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode
>>   to generate a mask and replace all Load/Store nodes in the post loop
>>   into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This
>>   IR form is exactly the same to those which are used in VectorAPI mask
>>   support. For now, we only add mask inputs for Load/Store nodes because
>>   we don't have reduction operations supported in post loop vectorization.
>>   After this change, the x86 k1 register is no longer reserved and can be
>>   allocated when PostLoopMultiversioning is enabled.
>>
>>   Besides this change, we have fixed a compiler crash and five incorrect
>>   result issues with post loop vectorization.
>>
>>   - 1) C2 crashes with segmentation fault in strip-mined loops
>>
>>   Previous implementation was done before C2 loop strip-mining was merged
>>   into JDK master so it didn't take strip-mined loops into consideration.
>>   In C2's strip mined loops, post loop is not the sibling of the main loop
>>   in ideal loop tree. Instead, it's the sibling of the main loop's parent.
>>   This patch fixed a SIGSEGV issue caused by NULL pointer when locating
>>   post loop from strip-mined main loop.
>>
>>   - 2) Incorrect result issues with post loop vectorization
>>
>>   We have also fixed five incorrect vectorization issues. Some of them are
>>   hidden deep and can only be reproduced with corner cases. These issues
>>   have a common cause that it assumes the post loop can be vectorized if
>>   the vectorization in corresponding main loop is successful. But in many
>>   cases this assumption is wrong. Below are details.
>>
>>   [Issue-1] Incorrect vectorization for partial vectorizable loops
>>
>>   This issue can be reproduced by below loop where only some operations in
>>   the loop body are vectorizable.
>>
>>     for (int i = 0; i < 10000; i++) {
>>       res[i] = a[i] * b[i];
>>       k = 3 * k + 1;
>>     }
>>
>>   In the main loop, superword can work well if parts of the operations in
>>   loop body are not vectorizable since those parts can be unrolled only.
>>   But for post loops, we don't create vectors through combining scalar IRs
>>   generated from loop unrolling. Instead, we are doing scalars to vectors
>>   replacement for all operations in the loop body. Hence, all operations
>>   should be either vectorized together or not vectorized at all. To fix
>>   this kind of cases, we add an extra field "_slp_vector_pack_count" in
>>   CountedLoopNode to record the eventual count of vector packs in the main
>>   loop. This value is then passed to post loop and compared with post loop
>>   pack count. Vectorization will be bailed out in post loop if it creates
>>   more vector packs than in the main loop.
>>
>>   [Issue-2] Incorrect result in loops with growing-down vectors
>>
>>   This issue appears with growing-down vectors, that is, vectors that grow
>>   to smaller memory address as the loop iterates. It can be reproduced by
>>   below counting-up loop with negative scale value in array index.
>>
>>     for (int i = 0; i < 10000; i++) {
>>       a[MAX - i] = b[MAX - i];
>>     }
>>
>>   Cause of this issue is that for a growing-down vector, generated vector
>>   mask value has reversed vector-lane order so it masks incorrect vector
>>   lanes. Note that if negative scale value appears in counting-down loops,
>>   the vector will be growing up. With this rule, we fix the issue by only
>>   allowing positive array index scales in counting-up loops and negative
>>   array index scales in counting-down loops. This check is done with the
>>   help of SWPointer by comparing scale values in each memory access in the
>>   loop with loop stride value.
>>
>>   [Issue-3] Incorrect result in manually unrolled loops
>>
>>   This issue can be reproduced by below manually unrolled loop.
>>
>>     for (int i = 0; i < 10000; i += 2) {
>>       c[i] = a[i] + b[i];
>>       c[i + 1] = a[i + 1] * b[i + 1];
>>     }
>>
>>   In this loop, operations in the 2nd statement duplicate those in the 1st
>>   statement with a small memory address offset. Vectorization in the main
>>   loop works well in this case because C2 does further unrolling and pack
>>   combination. But we cannot vectorize the post loop through replacement
>>   from scalars to vectors because it creates duplicated vector operations.
>>   To fix this, we restrict post loop vectorization to loops with stride
>>   values of 1 or -1.
>>
>>   [Issue-4] Incorrect result in loops with mixed vector element sizes
>>
>>   This issue is found after we enable post loop vectorization for AArch64.
>>   It's reproducible by multiple array operations with different element
>>   sizes inside a loop. On x86, there is no issue because the values of x86
>>   AVX512 opmasks only depend on which vector lanes are active. But AArch64
>>   is different - the values of SVE predicates also depend on lane size of
>>   the vector. Hence, on AArch64 SVE, if a loop has mixed vector element
>>   sizes, we should use different vector masks. For now, we just support
>>   loops with only one vector element size, i.e., "int + float" vectors in
>>   a single loop is ok but "int + double" vectors in a single loop is not
>>   vectorizable. This fix also enables subword vectors support to make all
>>   primitive type array operations vectorizable.
>>
>>   [Issue-5] Incorrect result in loops with potential data dependence
>>
>>   This issue can be reproduced by below corner case on AArch64 only.
>>
>>     for (int i = 0; i < 10000; i++) {
>>       a[i] = x;
>>       a[i + OFFSET] = y;
>>     }
>>
>>   In this case, two stores in the loop have data dependence if the OFFSET
>>   value is smaller than the vector length. So we cannot do vectorization
>>   through replacing scalars to vectors. But the main loop vectorization
>>   in this case is successful on AArch64 because AArch64 has partial vector
>>   load/store support. It splits vector fill with different values in lanes
>>   to several smaller-sized fills. In this patch, we add additional data
>>   dependence check for this kind of cases. The check is also done with the
>>   help of SWPointer class. In this check, we require that every two memory
>>   accesses (with at least one store) of the same element type (or subword
>>   size) in the loop has the same array index expression.
>>
>>   ** Tests
>>
>>   So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with
>>   experimental VM option "PostLoopMultiversioning" turned on. We found no
>>   issue in all tests. We notice that those existing cases are not enough
>>   because some of above issues are not spotted by them. We would like to
>>   add some new cases but we found existing vectorization tests are a bit
>>   cumbersome - golden results must be pre-calculated and hard-coded in the
>>   test code for correctness verification. Thus, in this patch, we propose
>>   a new vectorization testing framework.
>>
>>   Our new framework brings a simpler way to add new cases. For a new test
>>   case, we only need to create a new method annotated with "@Test". The
>>   test runner will invoke each annotated method twice automatically. First
>>   time it runs in the interpreter and second time it's forced compiled by
>>   C2. Then the two return results are compared. So in this framework each
>>   test method should return a primitive value or an array of primitives.
>>   In this way, no extra verification code for vectorization correctness is
>>   required. This test runner is still jtreg-based and takes advantages of
>>   the jtreg WhiteBox API, which enables test methods running at specific
>>   compilation levels. Each test class inside is also jtreg-based. It just
>>   need to inherit from the test runner class and run with two additional
>>   options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI".
>>
>>   ** Summary & Future work
>>
>>   In this patch, we reworked post loop vectorization. We made it platform
>>   independent and fixed several issues inside. We also implemented a new
>>   vectorization testing framework with many test cases inside. Meanwhile,
>>   we did some code cleanups.
>>
>>   This patch only touches C2 code guarded with PostLoopMultiversioning,
>>   except a few data structure changes. So, there's no behavior change when
>>   experimental VM option PostLoopMultiversioning is off. Also, to reduce
>>   risks, we still propose to keep post loop vectorization experimental for
>>   now. But if it receives positive feedback, we would like to change it to
>>   non-experimental in the future.
>
> src/hotspot/share/opto/loopnode.cpp line 4443:
>
>> 4441: CountedLoopNode *cl = lpt->_head->as_CountedLoop();
>> 4442:
>> 4443: if (cl->is_rce_post_loop() && !cl->is_vectorized_loop()) {
>
> Maybe assert that PostLoopMultiversioning is true?

Done in updated patch

-------------

PR: https://git.openjdk.java.net/jdk/pull/6828

From stuefe at openjdk.java.net Fri Mar 11 10:14:43 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Fri, 11 Mar 2022 10:14:43 GMT
Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6]
In-Reply-To:
References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com>
Message-ID:

On Fri, 11 Mar 2022 09:33:40 GMT, Andrew Haley wrote:

> We could also redefine SafeFetch on MacOS/AArch64 to not need WX.
> We could do this by statically generating SafeFetch on that platform, and it wouldn't be in the JIT region at all. Why not just do that?

Do you mean using inline assembly?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7727

From pli at openjdk.java.net Fri Mar 11 10:15:46 2022
From: pli at openjdk.java.net (Pengfei Li)
Date: Fri, 11 Mar 2022 10:15:46 GMT
Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v4]
In-Reply-To:
References:
Message-ID:

On Thu, 10 Mar 2022 09:35:59 GMT, Roland Westrelin wrote:

> src/hotspot/share/opto/superword.hpp line 616:
>
>> 614: //------------------------------SWPointer---------------------------
>> 615: // Information about an address for dependence checking and vector alignment
>> 616: class SWPointer : public ResourceObj {
>
> Why is this required?

In function `SuperWord::create_post_loop_vmask()`, SWPointer objects are allocated in arena with `new (_arena)` so the class should inherit from `ResourceObj`.
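
To illustrate the point about arena allocation, here is a self-contained sketch. `ToyArena`, `ArenaObj` and `ToyPointer` are hypothetical stand-ins for HotSpot's `Arena`, `ResourceObj` and `SWPointer`; the real classes differ in many details. The expression `new (arena) T(...)` only compiles if `T` or one of its base classes declares an `operator new` that accepts the arena, which is the role the base class plays here.

```cpp
#include <cstddef>
#include <vector>

// Toy bump allocator standing in for an arena (hypothetical, simplified).
class ToyArena {
  std::vector<unsigned char> _buf;
  std::size_t _used = 0;
 public:
  explicit ToyArena(std::size_t capacity) : _buf(capacity) {}

  // Hands out the next suitably aligned chunk, or nullptr when full.
  void* allocate(std::size_t size) {
    const std::size_t align = alignof(std::max_align_t);
    const std::size_t start = (_used + align - 1) / align * align;
    if (start + size > _buf.size()) return nullptr;
    _used = start + size;
    return _buf.data() + start;
  }

  std::size_t used() const { return _used; }
};

// Base class supplying the arena-aware operator new.
struct ArenaObj {
  // noexcept, so a failed allocation yields nullptr instead of running
  // the constructor on a null pointer.
  static void* operator new(std::size_t size, ToyArena* arena) noexcept {
    return arena->allocate(size);
  }
  // Matching placement delete; only called if a constructor throws.
  static void operator delete(void*, ToyArena*) noexcept {}
};

// Stand-in for the class under review: inheriting from ArenaObj is what
// makes `new (arena) ToyPointer(...)` well-formed.
struct ToyPointer : public ArenaObj {
  int scale;
  int offset;
  ToyPointer(int s, int o) : scale(s), offset(o) {}
};
```

With these definitions, `ToyArena arena(256); ToyPointer* p = new (&arena) ToyPointer(4, 16);` places the object inside the arena's buffer; the object is never deleted individually and its storage goes away with the arena.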
-------------

PR: https://git.openjdk.java.net/jdk/pull/6828

From pli at openjdk.java.net Fri Mar 11 10:24:47 2022
From: pli at openjdk.java.net (Pengfei Li)
Date: Fri, 11 Mar 2022 10:24:47 GMT
Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v5]
In-Reply-To:
References:
Message-ID: <9y78fy0fyxl5VebzI-dQghjLhN7l-CnT2TwZ2Plcwxs=.70324394-4a40-4a37-a080-fcf515a75c9f@github.com>

On Thu, 10 Mar 2022 09:37:15 GMT, Roland Westrelin wrote:

>> Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits:
>>
>> - Add assertion of PostLoopMultiversioning
>> - Merge branch 'master' into postloop
>> - Merge branch 'master' into postloop
>>
>>   Change-Id: I503edb75f0f626569c776416bfef09651935979c
>> - Update copyright year and rename a function
>>
>>   Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb
>> - Merge branch 'master' into postloop
>>
>>   Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4
>> - Fix issues in newly added test framework
>>
>>   Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1
>> - Merge branch 'master' into postloop
>>
>>   Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6
>> - 8183390: Fix and re-enable post loop vectorization
>>
>>   ** Background
>>
>>   Post loop vectorization is a C2 compiler optimization in an experimental
>>   VM feature called PostLoopMultiversioning. It transforms the range-check
>>   eliminated post loop to a 1-iteration vectorized loop with vector mask.
>>   This optimization was contributed by Intel in 2016 to support x86 AVX512
>>   masked vector instructions. However, it was disabled soon after an issue
>>   was found. Due to insufficient maintenance in these years, multiple bugs
>>   have been accumulated inside. But we (Arm) still think this is a useful
>>   framework for vector mask support in C2 auto-vectorized loops, for both
>>   x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable
>>   post loop vectorization.
>>
>>   ** Changes in this patch
>>
>>   This patch reworks post loop vectorization. The most significant change
>>   is removing vector mask support in C2 x86 backend and re-implementing
>>   it in the mid-end. With this, we can re-enable post loop vectorization
>>   for platforms other than x86.
>>
>>   Previous implementation hard-codes x86 k1 register as a reserved AVX512
>>   opmask register and defines two routines (setvectmask/restorevectmask)
>>   to set and restore the value of k1. But after JDK-8211251 which encodes
>>   AVX512 instructions as unmasked by default, generated vector masks are
>>   no longer used in AVX512 vector instructions. To fix incorrect codegen
>>   and add vector mask support for more platforms, we turn to add a vector
>>   mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode
>>   to generate a mask and replace all Load/Store nodes in the post loop
>>   into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This
>>   IR form is exactly the same to those which are used in VectorAPI mask
>>   support. For now, we only add mask inputs for Load/Store nodes because
>>   we don't have reduction operations supported in post loop vectorization.
>>   After this change, the x86 k1 register is no longer reserved and can be
>>   allocated when PostLoopMultiversioning is enabled.
>>
>>   Besides this change, we have fixed a compiler crash and five incorrect
>>   result issues with post loop vectorization.
>>
>>   - 1) C2 crashes with segmentation fault in strip-mined loops
>>
>>   Previous implementation was done before C2 loop strip-mining was merged
>>   into JDK master so it didn't take strip-mined loops into consideration.
>>   In C2's strip mined loops, post loop is not the sibling of the main loop
>>   in ideal loop tree. Instead, it's the sibling of the main loop's parent.
>>   This patch fixed a SIGSEGV issue caused by NULL pointer when locating
>>   post loop from strip-mined main loop.
>>
>>   - 2) Incorrect result issues with post loop vectorization
>>
>>   We have also fixed five incorrect vectorization issues. Some of them are
>>   hidden deep and can only be reproduced with corner cases. These issues
>>   have a common cause that it assumes the post loop can be vectorized if
>>   the vectorization in corresponding main loop is successful. But in many
>>   cases this assumption is wrong. Below are details.
>>
>>   [Issue-1] Incorrect vectorization for partial vectorizable loops
>>
>>   This issue can be reproduced by below loop where only some operations in
>>   the loop body are vectorizable.
>>
>>     for (int i = 0; i < 10000; i++) {
>>       res[i] = a[i] * b[i];
>>       k = 3 * k + 1;
>>     }
>>
>>   In the main loop, superword can work well if parts of the operations in
>>   loop body are not vectorizable since those parts can be unrolled only.
>>   But for post loops, we don't create vectors through combining scalar IRs
>>   generated from loop unrolling. Instead, we are doing scalars to vectors
>>   replacement for all operations in the loop body. Hence, all operations
>>   should be either vectorized together or not vectorized at all. To fix
>>   this kind of cases, we add an extra field "_slp_vector_pack_count" in
>>   CountedLoopNode to record the eventual count of vector packs in the main
>>   loop. This value is then passed to post loop and compared with post loop
>>   pack count. Vectorization will be bailed out in post loop if it creates
>>   more vector packs than in the main loop.
>>
>>   [Issue-2] Incorrect result in loops with growing-down vectors
>>
>>   This issue appears with growing-down vectors, that is, vectors that grow
>>   to smaller memory address as the loop iterates. It can be reproduced by
>>   below counting-up loop with negative scale value in array index.
>> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> [Issue-3] Incorrect result in manually unrolled loops >> >> This issue can be reproduced by below manually unrolled loop. >> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> [Issue-4] Incorrect result in loops with mixed vector element sizes >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. 
For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. >> >> [Issue-5] Incorrect result in loops with potential data dependence >> >> This issue can be reproduced by below corner case on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop has the same array index expression. >> >> ** Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". 
The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's forced compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantages of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> need to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". >> >> ** Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > src/hotspot/share/opto/superword.cpp line 114: > >> 112: if (post_loop_allowed) { >> 113: if (cl->is_reduction_loop()) return; // no predication mapping >> 114: Node *limit = cl->limit(); > > Why was this required but no longer is? All the checks still exist. In this patch I unified the checks and put some of them together to reduce duplicated code. The multiversioned post loop checks everywhere are replaced by `is_rce_post_loop()`. 
The predicated vector check is moved to `IdealLoopTree::iteration_split_impl()` when inserting the multiversioned post loop, as I think there is not much value in multiversioning if the current architecture doesn't support vector masks. The reduction loop check is still there in `SuperWord::transform_loop()`. ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From aph at openjdk.java.net Fri Mar 11 10:30:42 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 11 Mar 2022 10:30:42 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Fri, 11 Mar 2022 07:52:16 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Remove two unnecessary lines On 3/11/22 10:12, Thomas Stuefe wrote: > We could also redefine SafeFetch on MacOS/AArch64 to not need WX. We could do this by statically generating SafeFetch on that platform, and it wouldn't be in the JIT region at all. Why not just do that? > > Do you mean using inline assembly?
I'd use out-of-line assembly, as I do for atomic compare-and-swap on linux: https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S But I guess inline would work. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From stuefe at openjdk.java.net Fri Mar 11 10:48:49 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 11 Mar 2022 10:48:49 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Fri, 11 Mar 2022 10:27:25 GMT, Andrew Haley wrote: > On 3/11/22 10:12, Thomas Stuefe wrote: We could also redefine SafeFetch on MacOS/AArch64 to not need WX. We could do this by statically generating SafeFetch on that platform, and it wouldn't be in the JIT region at all. Why not just do that? Do you mean using inline assembly? > I'd use out-of-line assembly, as I do for atomic compare-and-swap on linux: https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S But I guess inline would work. Oh, this is neat. It would work on all platforms too, or on all we care to implement it for. And it would nicely solve the initialization window problem since it would work before stub routines are generated. We could throw `CanUseSafeFetch` away. It seems we already do static assembly on bsd aarch. So there is already a path to follow. But this could also be done as a follow up enhancement. I still like the OS TLS variable idea. 
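The "OS TLS variable" idea from this thread can be sketched roughly as follows. This is only bookkeeping under stated assumptions: `WXMode` is redefined here as a stand-in, the `sketch` namespace, `exchange_wx_mode`, `ScopedWXMode` and `fetch_in_exec_mode` are hypothetical names, the initial mode is an assumption, and no page protection is actually changed. It is not the code in the pull request.

```cpp
#include <cassert>

// Hypothetical stand-in for the W^X mode discussed in this thread.
enum class WXMode { Write, Exec };

namespace sketch {

// One copy per OS thread, with no Thread object involved.
// The initial value is an assumption of this sketch.
thread_local WXMode current_wx_mode = WXMode::Write;

// Records a new mode for the calling thread and returns the previous one.
// A real port would also switch the page protection here; this sketch does not.
inline WXMode exchange_wx_mode(WXMode new_mode) {
  WXMode old_mode = current_wx_mode;
  current_wx_mode = new_mode;
  return old_mode;
}

// RAII helper: switch mode for a scope, restore the old mode on exit.
class ScopedWXMode {
  WXMode _saved;
 public:
  explicit ScopedWXMode(WXMode mode) : _saved(exchange_wx_mode(mode)) {}
  ~ScopedWXMode() { exchange_wx_mode(_saved); }
  ScopedWXMode(const ScopedWXMode&) = delete;
  ScopedWXMode& operator=(const ScopedWXMode&) = delete;
};

// SafeFetch-like helper, assumed here to need Exec mode. It is NOT a real
// SafeFetch: there is no fault recovery, only a null check.
inline int fetch_in_exec_mode(const int* addr, int fallback) {
  ScopedWXMode scope(WXMode::Exec);
  return addr != nullptr ? *addr : fallback;
}

}  // namespace sketch

// Returns true if the fetch works and the thread's mode is restored afterwards.
bool demo_mode_restored() {
  int value = 42;
  int got = sketch::fetch_in_exec_mode(&value, -1);
  return got == 42 && sketch::current_wx_mode == WXMode::Write;
}
```

Because the mode lives in a `thread_local` variable, the helper can be called from any OS thread, including one that has no VM thread object attached, which is the dependency the pull request aims to remove.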
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From stuefe at openjdk.java.net Fri Mar 11 11:38:38 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 11 Mar 2022 11:38:38 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v2] In-Reply-To: References: <1E_zgqBQtByT4cXyk_dlbXGRVAQpCI6jlKXFIYovvVU=.3597bbf1-2ae7-4d41-9bef-79f77c90e8d3@github.com> Message-ID: On Fri, 11 Mar 2022 08:28:32 GMT, Ioi Lam wrote: >> Is reproducibility also a topic for users calling -Xdump with custom JNI coding? Or maybe having the VM instrumented somehow? Since it seems such an easy fix, I would prevent attaching too. At least the user would get a clear error message. > >> Is reproducibility also a topic for users calling -Xdump with custom JNI coding? Or maybe having the VM instrumented somehow? Since it seems such an easy fix, I would prevent attaching too. At least the user would get a clear error message. > > It's impossible to execute arbitrary Java code when running "java -Xshare:dump", so this means there's no way to load a JNI library when creating a *static* CDS archive. The loading of JVMTI agents is also not supported. So this is not a case we need to handle. > > During *dynamic* CDS dumps, arbitrary Java code can execute, but we don't have a requirement for the *dynamic* CDS archive to be deterministic (at least not for now). Thanks for that clarification. 
Never mind then :) ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From chagedorn at openjdk.java.net Fri Mar 11 12:00:54 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Fri, 11 Mar 2022 12:00:54 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: On Mon, 28 Feb 2022 16:22:25 GMT, Christian Hagedorn wrote: >> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: >> >> Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f >> >> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. 
>> >> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): >> >> Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) >> >> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. >> >> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf >> I added references to the corresponding sections throughout the code. 
However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This makes it possible to follow the code without having to dive deep into the spec. >> >> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. >> >> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame, rather than risk aborting the writing of the hs_err file because an assertion failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. >> >> **Testing:** >> Apart from manual testing, I've added two kinds of tests: >> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (line numbers in particular can change quickly). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers.
>> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. >> >> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. >> >> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. >> >> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). >> >> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 54 commits: > > - Updating some comments > - Cleanup loading dwarf file and add summary > - Review comments of first pass by Thomas except dwarf file loading > - Merge branch 'master' into JDK-8242181 > - Make dwarf tag NOT_PRODUCT > - Change log_* to log_develop_* and log_warning to log_develop_info > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Better formatting of trace output > - some code move and more cleanups > - ... and 44 more: https://git.openjdk.java.net/jdk/compare/efd3967b...5bea4841 Thanks a lot Thomas for your careful review! ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From roland at openjdk.java.net Fri Mar 11 12:08:49 2022 From: roland at openjdk.java.net (Roland Westrelin) Date: Fri, 11 Mar 2022 12:08:49 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v5] In-Reply-To: References: Message-ID: <1SoIxCUdIYtHkAElb87zsSPULJsm56Wgij2ikm9ZjB4=.581dad35-41de-4327-9a79-4ea02c9f5121@github.com> On Fri, 11 Mar 2022 07:48:06 GMT, Pengfei Li wrote: >> ### Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop to a 1-iteration vectorized loop with vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. However, it was disabled soon after an issue >> was found. Due to insufficient maintenance in these years, multiple bugs >> have been accumulated inside. But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. 
Hence, we propose this to fix and re-enable >> post loop vectorization. >> >> ### Changes in this patch >> >> This patch reworks post loop vectorization. The most significant change >> is removing vector mask support in C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. >> >> Previous implementation hard-codes x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions. To fix incorrect codegen >> and add vector mask support for more platforms, we turn to add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This >> IR form is exactly the same to those which are used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. >> >> **I) C2 crashes with segmentation fault in strip-mined loops** >> >> Previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. >> In C2's strip mined loops, post loop is not the sibling of the main loop >> in ideal loop tree. Instead, it's the sibling of the main loop's parent. 
>> This patch fixed a SIGSEGV issue caused by NULL pointer when locating >> post loop from strip-mined main loop. >> >> **II) Incorrect result issues with post loop vectorization** >> >> We have also fixed five incorrect vectorization issues. Some of them are >> hidden deep and can only be reproduced with corner cases. These issues >> have a common cause that it assumes the post loop can be vectorized if >> the vectorization in corresponding main loop is successful. But in many >> cases this assumption is wrong. Below are details. >> >> - **[Issue-1] Incorrect vectorization for partial vectorizable loops** >> >> This issue can be reproduced by below loop where only some operations in >> the loop body are vectorizable. >> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword can work well if parts of the operations in >> loop body are not vectorizable since those parts can be unrolled only. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we are doing scalars to vectors >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of cases, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to post loop and compared with post loop >> pack count. Vectorization will be bailed out in post loop if it creates >> more vector packs than in the main loop. >> >> - **[Issue-2] Incorrect result in loops with growing-down vectors** >> >> This issue appears with growing-down vectors, that is, vectors that grow >> to smaller memory address as the loop iterates. It can be reproduced by >> below counting-up loop with negative scale value in array index. 
>> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> - **[Issue-3] Incorrect result in manually unrolled loops** >> >> This issue can be reproduced by below manually unrolled loop. >> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> - **[Issue-4] Incorrect result in loops with mixed vector element sizes** >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. 
For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. >> >> - **[Issue-5] Incorrect result in loops with potential data dependence** >> >> This issue can be reproduced by below corner case on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop has the same array index expression. >> >> ### Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". 
The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's forced compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantages of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> need to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". >> >> ### Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains eight commits: > > - Add assertion of PostLoopMultiversioning > - Merge branch 'master' into postloop > - Merge branch 'master' into postloop > > Change-Id: I503edb75f0f626569c776416bfef09651935979c > - Update copyright year and rename a function > > Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb > - Merge branch 'master' into postloop > > Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4 > - Fix issues in newly added test framework > > Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1 > - Merge branch 'master' into postloop > > Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6 > - 8183390: Fix and re-enable post loop vectorization > > ** Background > > Post loop vectorization is a C2 compiler optimization in an experimental > VM feature called PostLoopMultiversioning. It transforms the range-check > eliminated post loop to a 1-iteration vectorized loop with vector mask. > This optimization was contributed by Intel in 2016 to support x86 AVX512 > masked vector instructions. However, it was disabled soon after an issue > was found. Due to insufficient maintenance in these years, multiple bugs > have been accumulated inside. But we (Arm) still think this is a useful > framework for vector mask support in C2 auto-vectorized loops, for both > x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable > post loop vectorization. > > ** Changes in this patch > > This patch reworks post loop vectorization. The most significant change > is removing vector mask support in C2 x86 backend and re-implementing > it in the mid-end. With this, we can re-enable post loop vectorization > for platforms other than x86. > > Previous implementation hard-codes x86 k1 register as a reserved AVX512 > opmask register and defines two routines (setvectmask/restorevectmask) > to set and restore the value of k1. 
But after JDK-8211251 which encodes > AVX512 instructions as unmasked by default, generated vector masks are > no longer used in AVX512 vector instructions. To fix incorrect codegen > and add vector mask support for more platforms, we turn to add a vector > mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode > to generate a mask and replace all Load/Store nodes in the post loop > into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This > IR form is exactly the same to those which are used in VectorAPI mask > support. For now, we only add mask inputs for Load/Store nodes because > we don't have reduction operations supported in post loop vectorization. > After this change, the x86 k1 register is no longer reserved and can be > allocated when PostLoopMultiversioning is enabled. > > Besides this change, we have fixed a compiler crash and five incorrect > result issues with post loop vectorization. > > - 1) C2 crashes with segmentation fault in strip-mined loops > > Previous implementation was done before C2 loop strip-mining was merged > into JDK master so it didn't take strip-mined loops into consideration. > In C2's strip mined loops, post loop is not the sibling of the main loop > in ideal loop tree. Instead, it's the sibling of the main loop's parent. > This patch fixed a SIGSEGV issue caused by NULL pointer when locating > post loop from strip-mined main loop. > > - 2) Incorrect result issues with post loop vectorization > > We have also fixed five incorrect vectorization issues. Some of them are > hidden deep and can only be reproduced with corner cases. These issues > have a common cause that it assumes the post loop can be vectorized if > the vectorization in corresponding main loop is successful. But in many > cases this assumption is wrong. Below are details. 
> > [Issue-1] Incorrect vectorization for partial vectorizable loops > > This issue can be reproduced by below loop where only some operations in > the loop body are vectorizable. > > for (int i = 0; i < 10000; i++) { > res[i] = a[i] * b[i]; > k = 3 * k + 1; > } > > In the main loop, superword can work well if parts of the operations in > loop body are not vectorizable since those parts can be unrolled only. > But for post loops, we don't create vectors through combining scalar IRs > generated from loop unrolling. Instead, we are doing scalars to vectors > replacement for all operations in the loop body. Hence, all operations > should be either vectorized together or not vectorized at all. To fix > this kind of cases, we add an extra field "_slp_vector_pack_count" in > CountedLoopNode to record the eventual count of vector packs in the main > loop. This value is then passed to post loop and compared with post loop > pack count. Vectorization will be bailed out in post loop if it creates > more vector packs than in the main loop. > > [Issue-2] Incorrect result in loops with growing-down vectors > > This issue appears with growing-down vectors, that is, vectors that grow > to smaller memory address as the loop iterates. It can be reproduced by > below counting-up loop with negative scale value in array index. > > for (int i = 0; i < 10000; i++) { > a[MAX - i] = b[MAX - i]; > } > > Cause of this issue is that for a growing-down vector, generated vector > mask value has reversed vector-lane order so it masks incorrect vector > lanes. Note that if negative scale value appears in counting-down loops, > the vector will be growing up. With this rule, we fix the issue by only > allowing positive array index scales in counting-up loops and negative > array index scales in counting-down loops. This check is done with the > help of SWPointer by comparing scale values in each memory access in the > loop with loop stride value. 
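The lane-order problem described under Issue-2 can be illustrated with a small scalar simulation. This is only a simplified model under stated assumptions, not C2 code: `VLEN`, `scalarTail` and `lowLaneMaskedTail` are hypothetical names, the vector is modelled as `VLEN` consecutive array slots in increasing address order, and the mask is assumed to activate the lowest `rem` lanes.

```java
import java.util.Arrays;

public class GrowingDownMaskDemo {
    // Hypothetical vector length in lanes; real hardware values differ.
    static final int VLEN = 8;

    // Scalar reference: the post loop exactly as written in the source,
    // walking from higher to lower addresses as i grows.
    static int[] scalarTail(int[] a, int[] b, int max, int start, int limit) {
        int[] out = a.clone();
        for (int i = start; i < limit; i++) {
            out[max - i] = b[max - i];
        }
        return out;
    }

    // Simplified model of a single masked vector copy. Lane j maps to
    // address base + j, and the mask enables the lowest `rem` lanes, which
    // is the natural choice for vectors that grow towards higher addresses.
    static int[] lowLaneMaskedTail(int[] a, int[] b, int max, int start, int limit) {
        int[] out = a.clone();
        int rem = limit - start;             // remaining iterations
        int base = max - start - (VLEN - 1); // lowest address covered by the vector
        for (int lane = 0; lane < VLEN; lane++) {
            if (lane < rem) {
                out[base + lane] = b[base + lane];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int max = 15;
        int[] a = new int[16];
        int[] b = new int[16];
        for (int i = 0; i < b.length; i++) {
            b[i] = 100 + i;
        }
        int[] expected = scalarTail(a, b, max, 0, 3);
        int[] masked = lowLaneMaskedTail(a, b, max, 0, 3);
        System.out.println("scalar: " + Arrays.toString(expected));
        System.out.println("masked: " + Arrays.toString(masked));
        System.out.println("match:  " + Arrays.equals(expected, masked));
    }
}
```

In this model the scalar loop writes the highest addresses while the low-lane mask writes the lowest ones, so the two only agree when all lanes are active. That is the kind of mismatch the scale-versus-stride check is meant to rule out.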
> > [Issue-3] Incorrect result in manually unrolled loops > > This issue can be reproduced by below manually unrolled loop. > > for (int i = 0; i < 10000; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] * b[i + 1]; > } > > In this loop, operations in the 2nd statement duplicate those in the 1st > statement with a small memory address offset. Vectorization in the main > loop works well in this case because C2 does further unrolling and pack > combination. But we cannot vectorize the post loop through replacement > from scalars to vectors because it creates duplicated vector operations. > To fix this, we restrict post loop vectorization to loops with stride > values of 1 or -1. > > [Issue-4] Incorrect result in loops with mixed vector element sizes > > This issue is found after we enable post loop vectorization for AArch64. > It's reproducible by multiple array operations with different element > sizes inside a loop. On x86, there is no issue because the values of x86 > AVX512 opmasks only depend on which vector lanes are active. But AArch64 > is different - the values of SVE predicates also depend on lane size of > the vector. Hence, on AArch64 SVE, if a loop has mixed vector element > sizes, we should use different vector masks. For now, we just support > loops with only one vector element size, i.e., "int + float" vectors in > a single loop is ok but "int + double" vectors in a single loop is not > vectorizable. This fix also enables subword vectors support to make all > primitive type array operations vectorizable. > > [Issue-5] Incorrect result in loops with potential data dependence > > This issue can be reproduced by below corner case on AArch64 only. > > for (int i = 0; i < 10000; i++) { > a[i] = x; > a[i + OFFSET] = y; > } > > In this case, two stores in the loop have data dependence if the OFFSET > value is smaller than the vector length. So we cannot do vectorization > through replacing scalars to vectors. 
But the main loop vectorization > in this case is successful on AArch64 because AArch64 has partial vector > load/store support. It splits vector fill with different values in lanes > to several smaller-sized fills. In this patch, we add additional data > dependence check for this kind of cases. The check is also done with the > help of SWPointer class. In this check, we require that every two memory > accesses (with at least one store) of the same element type (or subword > size) in the loop has the same array index expression. > > ** Tests > > So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with > experimental VM option "PostLoopMultiversioning" turned on. We found no > issue in all tests. We notice that those existing cases are not enough > because some of above issues are not spotted by them. We would like to > add some new cases but we found existing vectorization tests are a bit > cumbersome - golden results must be pre-calculated and hard-coded in the > test code for correctness verification. Thus, in this patch, we propose > a new vectorization testing framework. > > Our new framework brings a simpler way to add new cases. For a new test > case, we only need to create a new method annotated with "@Test". The > test runner will invoke each annotated method twice automatically. First > time it runs in the interpreter and second time it's forced compiled by > C2. Then the two return results are compared. So in this framework each > test method should return a primitive value or an array of primitives. > In this way, no extra verification code for vectorization correctness is > required. This test runner is still jtreg-based and takes advantages of > the jtreg WhiteBox API, which enables test methods running at specific > compilation levels. Each test class inside is also jtreg-based. It just > need to inherit from the test runner class and run with two additional > options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". 
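The "run twice and compare" idea behind the proposed framework can be sketched in plain Java. This is a minimal sketch and not the framework from the patch: the `Test` annotation, `runAll`, `sameResult` and `ExampleTests` below are hypothetical names, and the step that forces C2 compilation through the jtreg WhiteBox API is deliberately left out, so both invocations here run under whatever mode the JVM chooses.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;

public class TwoRunCompareSketch {
    // Hypothetical marker annotation, standing in for the framework's "@Test".
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Test {}

    // Compares two results that may be boxed primitives or primitive arrays.
    static boolean sameResult(Object first, Object second) {
        return Arrays.deepEquals(new Object[] {first}, new Object[] {second});
    }

    // Invokes every annotated method twice and compares the two results.
    // The real runner would force a C2 compilation between the two calls;
    // that step is omitted here (assumption: it needs the WhiteBox API).
    static int runAll(Object tests) {
        int compared = 0;
        for (Method m : tests.getClass().getDeclaredMethods()) {
            if (!m.isAnnotationPresent(Test.class)) {
                continue;
            }
            try {
                m.setAccessible(true);
                Object first = m.invoke(tests);
                Object second = m.invoke(tests);
                if (!sameResult(first, second)) {
                    throw new AssertionError("Results differ for " + m.getName());
                }
                compared++;
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Could not run " + m.getName(), e);
            }
        }
        return compared;
    }

    // Example test class: each method returns a primitive or a primitive array.
    static class ExampleTests {
        private final int[] a = new int[1000];
        private final int[] b = new int[1000];

        ExampleTests() {
            for (int i = 0; i < a.length; i++) {
                a[i] = i;
                b[i] = 2 * i + 1;
            }
        }

        @Test
        int[] vectorAdd() {
            int[] res = new int[a.length];
            for (int i = 0; i < a.length; i++) {
                res[i] = a[i] + b[i];
            }
            return res;
        }

        @Test
        int sum() {
            int s = 0;
            for (int i = 0; i < a.length; i++) {
                s += a[i];
            }
            return s;
        }
    }

    public static void main(String[] args) {
        System.out.println("compared " + runAll(new ExampleTests()) + " test methods");
    }
}
```

Because the methods only return values, no golden results need to be hard-coded: the first run serves as the reference for the second.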
> > ** Summary & Future work > > In this patch, we reworked post loop vectorization. We made it platform > independent and fixed several issues inside. We also implemented a new > vectorization testing framework with many test cases inside. Meanwhile, > we did some code cleanups. > > This patch only touches C2 code guarded with PostLoopMultiversioning, > except a few data structure changes. So, there's no behavior change when > experimental VM option PostLoopMultiversioning is off. Also, to reduce > risks, we still propose to keep post loop vectorization experimental for > now. But if it receives positive feedback, we would like to change it to > non-experimental in the future. Looks reasonable to me. ------------- Marked as reviewed by roland (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6828 From roland at openjdk.java.net Fri Mar 11 12:08:49 2022 From: roland at openjdk.java.net (Roland Westrelin) Date: Fri, 11 Mar 2022 12:08:49 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v4] In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 02:31:08 GMT, Pengfei Li wrote: > > fairly similar. Doesn't/couldn't the logic from issue 5 protect from issue 3? > > True, the SWPointer logic can also protect from issue 3. But I believe keeping loop stride check for issue 3 has no harm and post loop vectorization can bail out earlier with this additional check. Maybe it's worth mentioning that in a comment next to that code in case someone finds the test too strict and wonder if it can be removed. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From fweimer at openjdk.java.net Fri Mar 11 12:21:45 2022 From: fweimer at openjdk.java.net (Florian Weimer) Date: Fri, 11 Mar 2022 12:21:45 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: <6IzXDSNVNqbNgsVAjHd_u_j61q4T5XzxbNBXp0ECjW0=.cc4cd576-90aa-4567-92c2-a40949720889@github.com> On Fri, 11 Mar 2022 09:50:22 GMT, Johannes Bechberger wrote: > According to https://forums.swift.org/t/concurrencys-use-of-thread-local-variables/48654: "these accesses are just a move from a system register plus a load/store at a constant offset." Ideally you'd still benchmark that. Some AArch64 implementations have really, really slow moves from the system register used as the thread pointer. Hopefully Apple's implementation isn't in that category. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From akozlov at openjdk.java.net Fri Mar 11 16:41:47 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 11 Mar 2022 16:41:47 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: <9abtTyWumeHahJhxZnL_GX3s9_TdDAZ_e8b7OffYfoI=.c3b0b765-d9f4-4db3-bbd0-48c3598c7aa5@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <9abtTyWumeHahJhxZnL_GX3s9_TdDAZ_e8b7OffYfoI=.c3b0b765-d9f4-4db3-bbd0-48c3598c7aa5@github.com> Message-ID: <0-bZMVusZ-zdvrRi4INrZlVWrTvR31RvZgijiX2Ymv4=.f0a9ebd2-fbd0-46c1-aa55-998e641b8e78@github.com> On Thu, 10 Mar 2022 18:07:37 GMT, Thomas Stuefe wrote: > > > > Is it possible to change SafeFetch only? Switch to WXExec before calling the stub and switch WXWrite back unconditionally? We won't need to provide assert in ThreadWXEnable. But SafeFetch can check the assumption with assert via Thread, if it exists. 
> > > > > > > > > But SafeFetch could be used from outside code as well as VM code. In case of the latter, prior state can either be WXWrite or WXExec. It needs to restore the prior state after the call. > > > > > > I'm not sure I understand what is the "outside code". The SafeFetch is the private hotspot function, it cannot be linked with non-JVM code, isn't it? > > Sorry for being imprecise. I meant SafeFetch is triggered from within a signal handler that runs on a foreign thread. E.g. AGCT or error handling. Then the OS TLS way is not better since when the signal handler and SafeFetch start, the state is unknown and is only assumed to be Write (in initialization of TLS variable). ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From akozlov at openjdk.java.net Fri Mar 11 16:37:46 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 11 Mar 2022 16:37:46 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: <2VVnQ4RiNCtAuWXQ_d-vgj-8uejqKTdAWXwxKJUNix4=.6d88041c-2332-452d-9e70-b9429940d1f0@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2VVnQ4RiNCtAuWXQ_d-vgj-8uejqKTdAWXwxKJUNix4=.6d88041c-2332-452d-9e70-b9429940d1f0@github.com> Message-ID: <9Pfhr7V3j4Op4px61CEhpa4jVwueR1wQmjLaS8l8x2g=.6404217f-0556-4f9a-b81b-d8642bb73a13@github.com> On Thu, 10 Mar 2022 18:04:50 GMT, Thomas Stuefe wrote: > blocking SIGSEGV and SIGBUS - or other synchronous error signals like SIGFPE - and then triggering said signal is UB. What happens is OS-dependent. I saw processes vanishing, or hang, or core. It makes sense, since what is the kernel supposed to do. It cannot deliver the signal, and deferring it would require returning to the faulting instruction, that would just re-fault. > For some more details see e.g. https://bugs.openjdk.java.net/browse/JDK-8252533 This UB looks reasonable. 
My point is that a native thread would run fine with SIGSEGV blocked. But then JVM decides it can do SafeFetch, and things gets nasty. > > Is there a crash that is fixed by the change? I just spotted it is an enhancement, not a bug. Just trying to understand the problem. > > Yes, this issue is a breakout from https://bugs.openjdk.java.net/browse/JDK-8282306, where we'd like to use SafeFetch to make stack walking in AsyncGetCallTrace more robust. AGCT is called from the signal handler, and it may run in any number of situations (e.g. in foreign threads, or threads which are in the process of getting dismantled, etc). I mean, some way to verify the issue is fixed, e.g. a test that does not fail anymore. I see AsyncGetCallTrace to assume the JavaThread very soon, or do I look at the wrong place? https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/forte.cpp#L569 > Another situation is error handling itself. When writing an hs-err file, we use SafeFetch to do carefully tiptoe around the possibly corrupt VM state. If the original crash happened in a foreign thread, we still want some of these reports to work (e.g. dumping register content or printing stacks). So SafeFetch should be as robust as possible. OK, thanks. I think we also handle recursive segfaults recover after interpretation of the corrupted VM state. Otherwise, implementing the printing functions would be too tedious and hard with SafeFetch alone. But I see it's used in printing register content, at least. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From dcubed at openjdk.java.net Fri Mar 11 17:03:45 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Mar 2022 17:03:45 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v4] In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 06:35:31 GMT, David Holmes wrote: >> Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). 
We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. >> >> This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Additional tweaks requested by @kbarrett Still thumbs up. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7720 From stuefe at openjdk.java.net Fri Mar 11 17:12:45 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 11 Mar 2022 17:12:45 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v4] In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 06:35:31 GMT, David Holmes wrote: >> Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. >> >> This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. 
>> >> Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Additional tweaks requested by @kbarrett Looks good. ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7720 From mdoerr at openjdk.java.net Fri Mar 11 17:23:45 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 11 Mar 2022 17:23:45 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v4] In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 06:35:31 GMT, David Holmes wrote: >> Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. >> >> This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Additional tweaks requested by @kbarrett Marked as reviewed by mdoerr (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From akozlov at openjdk.java.net Fri Mar 11 18:02:47 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 11 Mar 2022 18:02:47 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Fri, 11 Mar 2022 07:52:16 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Remove two unnecessary lines I looked on the patch again from the perspective of a pure refactoring. It looks fine except we lost one of the asserts. src/hotspot/share/runtime/thread.cpp line 278: > 276: } > 277: > 278: MACOS_AARCH64_ONLY(DEBUG_ONLY(os::ThreadWX::init();)) This line meant the WX state is not initialized at this point (as a part of Thread constructor). Since there are a several places where the state is initialized and it was easy to miss one, I would like to preserve some assert that the state is initialized. ------------- Changes requested by akozlov (Committer). 
PR: https://git.openjdk.java.net/jdk/pull/7727 From jbhateja at openjdk.java.net Fri Mar 11 19:10:09 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 11 Mar 2022 19:10:09 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v14] In-Reply-To: References: Message-ID:

> Summary of changes:
> - Intrinsify Math.round(float) and Math.round(double) APIs.
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
> - Test creation using new IR testing framework.
>
> Following are the performance number of a JMH micro included with the patch
>
> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>
> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
> -- | -- | -- | -- | -- | -- | -- | --
> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07
> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93
> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8279508: Reducing the invocation count and compile thresholds for RoundTests.java.
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7094/files - new: https://git.openjdk.java.net/jdk/pull/7094/files/fcb73212..2519a58c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=13 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=12-13 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7094.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094 PR: https://git.openjdk.java.net/jdk/pull/7094 From jbhateja at openjdk.java.net Fri Mar 11 19:10:11 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 11 Mar 2022 19:10:11 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v9] In-Reply-To: <7lwsCvdSjkvDYJNwuA7fVPrWFUbzchuwx0Z3IID5VZw=.0c00c3d7-2106-40df-88bc-38bf7e2655f9@github.com> References: <2jFjnftd7VluGsxgp8BK0vgHA68VrgGREj0fk7F6Dhk=.e40ddcaa-5a31-4115-976d-5f43e94b8ccf@github.com> <7lwsCvdSjkvDYJNwuA7fVPrWFUbzchuwx0Z3IID5VZw=.0c00c3d7-2106-40df-88bc-38bf7e2655f9@github.com> Message-ID: On Thu, 10 Mar 2022 14:29:36 GMT, Joe Darcy wrote: >> Hi @jddarcy , >> >> The test has been modified along the same lines, using generic options that manipulate compilation thresholds and are agnostic to target platforms. >> >> * @run main/othervm -XX:Tier3CompileThreshold=100 -XX:CompileThresholdScaling=0.01 -XX:+TieredCompilation RoundTests >> >> Verified that the RoundTests::test* methods get compiled by C2. >> Test execution time with and without the change is almost the same, ~7.80 sec on Skylake-server. >> >> Regards > > To be more explicit, the existing RoundTests.java test runs in a fraction of a second. The updated test runs many times slower, even if it is now under 10 seconds, at least on some platforms. > > Can something closer to the original performance be restored? > > As a tier 1 library test, these tests are run quite frequently.
Hi @jddarcy , Earlier, none of the test methods in RoundTests.java were compiled because of their low invocation count. A loop with 2000 iterations under the controlled compilation thresholds now triggers tier 4 compilation of the test points. I did several runs on a Skylake machine with and without the patch and could see no perceptible difference in runtime due to the modification. I have further reduced the invocation count and compile threshold. Thanks ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From sviswanathan at openjdk.java.net Fri Mar 11 21:49:56 2022 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Fri, 11 Mar 2022 21:49:56 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v2] In-Reply-To: <3JoM4khNMz85gwfyxZeBNxJCZ_B7826cc-iO4pHtTJM=.5b21a96e-b2f5-4093-a763-eec2b6d77a2e@github.com> References: <3JoM4khNMz85gwfyxZeBNxJCZ_B7826cc-iO4pHtTJM=.5b21a96e-b2f5-4093-a763-eec2b6d77a2e@github.com> Message-ID: On Thu, 3 Mar 2022 05:42:23 GMT, Jatin Bhateja wrote: >> The testing for this PR doesn't look adequate to me. I don't see any testing for the values where the behavior of round has been redefined at points in the last decade. See JDK-8010430 and JDK-6430675, both of which have regression tests in the core libs area. Thanks. > > Hi @jddarcy , could you kindly validate that your feedback has been incorporated? @jatin-bhateja There is a failure reported in the Pre-submit tests on Windows x64 for compiler/vectorization/TestRoundVect.java. Could you please take a look?
------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From stuefe at openjdk.java.net Fri Mar 11 23:37:41 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 11 Mar 2022 23:37:41 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: <9Pfhr7V3j4Op4px61CEhpa4jVwueR1wQmjLaS8l8x2g=.6404217f-0556-4f9a-b81b-d8642bb73a13@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2VVnQ4RiNCtAuWXQ_d-vgj-8uejqKTdAWXwxKJUNix4=.6d88041c-2332-452d-9e70-b9429940d1f0@github.com> <9Pfhr7V3j4Op4px61CEhpa4jVwueR1wQmjLaS8l8x2g=.6404217f-0556-4f9a-b81b-d8642bb73a13@github.com> Message-ID: On Fri, 11 Mar 2022 16:34:29 GMT, Anton Kozlov wrote: > > blocking SIGSEGV and SIGBUS - or other synchronous error signals like SIGFPE - and then triggering said signal is UB. What happens is OS-dependent. I saw processes vanishing, or hang, or core. It makes sense, since what is the kernel supposed to do. It cannot deliver the signal, and deferring it would require returning to the faulting instruction, that would just re-fault. > > For some more details see e.g. https://bugs.openjdk.java.net/browse/JDK-8252533 > > This UB looks reasonable. My point is that a native thread would run fine with SIGSEGV blocked. But then JVM decides it can do SafeFetch, and things gets nasty. Blocking synchronous error signals makes zero sense even for normal programs, since you lose the ability to get cores. For the JVM in particular, it also blocks facilities like polling pages, or dynamically querying CPU abilities. So a JVM would not even start with synchronous error signals blocked. > > > > Is there a crash that is fixed by the change? I just spotted it is an enhancement, not a bug. Just trying to understand the problem. 
> > > > > > Yes, this issue is a breakout from https://bugs.openjdk.java.net/browse/JDK-8282306, where we'd like to use SafeFetch to make stack walking in AsyncGetCallTrace more robust. AGCT is called from the signal handler, and it may run in any number of situations (e.g. in foreign threads, or threads that are in the process of getting dismantled, etc). > > I mean, some way to verify the issue is fixed, e.g. a test that does not fail anymore. No, tests do not exist. Unfortunately, otherwise this regression would have been detected right away and we would not need this PR. We have a test though that tests SafeFetch during error handling. That test can be tweaked for this purpose. So, test does not exist yet, but can be easily written. > > I see AsyncGetCallTrace to assume the JavaThread very soon, or do I look at the wrong place? https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/forte.cpp#L569 > > > Another situation is error handling itself. When writing an hs-err file, we use SafeFetch to do carefully tiptoe around the possibly corrupt VM state. If the original crash happened in a foreign thread, we still want some of these reports to work (e.g. dumping register content or printing stacks). So SafeFetch should be as robust as possible. > > OK, thanks. I think we also handle recursive segfaults recover after interpretation of the corrupted VM state. Otherwise, implementing the printing functions would be too tedious and hard with SafeFetch alone. But I see it's used in printing register content, at least. Secondary error handling is a very coarse-grained tool. If an error reporting step crashes out, we continue with the next step. Has disadvantages though. The total number of retries is very limited. And a faulting error reporting step still hurts, because its report is compromised. E.g. if the call stack printing crashes out, we have no call stack. This is not an abstract problem. Its a very concrete and typical problem. 
I spend a large part of my work with hs-err reports. They are of very high importance to us. We (SAP) have invested a lot of time and effort in hardening out OpenJDK error reporting, and SafeFetch is an important part of that. For example, we provided the facility that made SafeFetch usable in signal handling. It would be nice if our work was not compromised. Please let us find a way forward here. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From stuefe at openjdk.java.net Fri Mar 11 23:43:47 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 11 Mar 2022 23:43:47 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Fri, 11 Mar 2022 10:44:25 GMT, Thomas Stuefe wrote: > > On 3/11/22 10:12, Thomas Stuefe wrote: We could also redefine SafeFetch on MacOS/AArch64 to not need WX. We could do this by statically generating SafeFetch on that platform, and it wouldn't be in the JIT region at all. Why not just do that? Do you mean using inline assembly? > > I'd use out-of-line assembly, as I do for atomic compare-and-swap on linux: https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S But I guess inline would work. > > Oh, this is neat. It would work on all platforms too, or on all we care to implement it for. And it would nicely solve the initialization window problem since it would work before stub routines are generated. We could throw `CanUseSafeFetch` away. > > It seems we already do static assembly on bsd aarch. So there is already a path to follow. > > But this could also be done as a follow up enhancement. I still like the OS TLS variable idea. 
I spent some time doing a static implementation of SafeFetch on Linux x64, and it's not trivial. The problem is that we need to know the addresses of instructions inside that function. I can set labels in assembly, and I can export them, but so far I have been unable to use them as addresses in C++ code. I will research some more. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From jiefu at openjdk.java.net Sat Mar 12 04:07:40 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Sat, 12 Mar 2022 04:07:40 GMT Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap In-Reply-To: References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> Message-ID: On Fri, 11 Mar 2022 09:11:20 GMT, Joshua Zhu wrote: > Check a simple reproducer at http://cr.openjdk.java.net/~jzhu/8282874/CheckAssembly.java. We can verify its performance via execution time. Nice performance improvement. Thanks for fixing it. ------------- PR: https://git.openjdk.java.net/jdk/pull/7757 From jzhu at openjdk.java.net Sat Mar 12 04:07:40 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Sat, 12 Mar 2022 04:07:40 GMT Subject: Integrated: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap In-Reply-To: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> Message-ID: On Wed, 9 Mar 2022 12:33:49 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > As discussed at https://github.com/openjdk/jdk/pull/7721 , I changed the code in VectorAPI. > Please help review.
This pull request has now been integrated. Changeset: 5c408c14 Author: Joshua Zhu Committer: Jie Fu URL: https://git.openjdk.java.net/jdk/commit/5c408c1410e15087f735a815b7edc716d514b1b3 Stats: 42 lines in 7 files changed: 0 ins; 0 del; 42 mod 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap Reviewed-by: psandoz ------------- PR: https://git.openjdk.java.net/jdk/pull/7757 From fweimer at openjdk.java.net Sat Mar 12 07:48:41 2022 From: fweimer at openjdk.java.net (Florian Weimer) Date: Sat, 12 Mar 2022 07:48:41 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Fri, 11 Mar 2022 23:40:36 GMT, Thomas Stuefe wrote: > I spent some time doing a static implementation of SafeFetch on Linux x64, and its not super trivial. The problem is that we need to know addresses of instructions inside that function. I can set labels in assembly, and I can export them, but so far I have been unable to use them as addresses in C++ code. I will research some more. There are basically two ways (easy) to do it. Put global symbols like .globl address_of_label address_of_label: into the assembler sources and use ```c++ extern char address_of_label[] __attribute__ ((visibility ("hidden"))); from the C++ side. Or use a local label, and export the difference to the function start to a local label in a global data symbol from the assembler side: .globl SafeFetch // Real function name goes here. SafeFetch: // ? .Llabel: // ? 
.section .rodata .globl SafeFetch_label_offset .p2align 3 SafeFetch_label_offset: .quad .Llabel - SafeFetch .type SafeFetch_label_offset, @object .size SafeFetch_label_offset, 8 And use ```c++ extern uintptr_t SafeFetch_label_offset __attribute__ ((visibility ("hidden"))); ``` and the expression `(uintptr_t) &SafeFetch + SafeFetch_label_offset` to compute the final address. The second approach is friendlier to tools (which may get confused by symbols in the middle of functions). If you have a PR, please Cc: me on it, I will have a look. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From jzhu at openjdk.java.net Sat Mar 12 10:34:45 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Sat, 12 Mar 2022 10:34:45 GMT Subject: RFR: 8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap In-Reply-To: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> References: <-HqocX4zJW2bQrkW_7mkitRbzXk5euq6uQQ6T-EJ5dA=.9ca581ad-e1ef-46b5-ac7e-f2fb4d9dde6e@github.com> Message-ID: On Wed, 9 Mar 2022 12:33:49 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > As discussed at https://github.com/openjdk/jdk/pull/7721 , I changed the code in VectorAPI. > Please help review. Thanks Paul and FuJie.
------------- PR: https://git.openjdk.java.net/jdk/pull/7757 From jzhu at openjdk.java.net Sat Mar 12 10:37:42 2022 From: jzhu at openjdk.java.net (Joshua Zhu) Date: Sat, 12 Mar 2022 10:37:42 GMT Subject: Withdrawn: 8282722: Regard mapping array in enum switches as stable for constant folding In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 07:13:20 GMT, Joshua Zhu wrote: > I came across a performance issue when using scatter store VectorAPI for Integer and Long simultaneously in the same application. The poor performance was caused by vector intrinsic inlining failure because of non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario. > > For ScatterStore operation of LongVector.SPECIES_512/IntVector.SPECIES_512, VectorShape.S_256_BIT/S_512_BIT is the actual length of indexMap vector respectively. > > IntSpecies species(VectorShape s) > > returns the corresponding IntSpecies by Switch on Enum type "VectorShape". [1] > > With this change introduced, elements in the SwitchMap array (initialized in clinit) can be constant-folded so that determined IntSpecies can be acquired for a constant VectorShape. > > jtreg test passed without new failure. > Please help review this change and let me know if any comments. > > [1] https://github.com/openjdk/jdk/blob/894ffb098c80bfeb4209038c017d01dbf53fac0f/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/IntVector.java#L4043 This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7721 From aph at openjdk.java.net Sat Mar 12 12:35:43 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Sat, 12 Mar 2022 12:35:43 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: <6IzXDSNVNqbNgsVAjHd_u_j61q4T5XzxbNBXp0ECjW0=.cc4cd576-90aa-4567-92c2-a40949720889@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <6IzXDSNVNqbNgsVAjHd_u_j61q4T5XzxbNBXp0ECjW0=.cc4cd576-90aa-4567-92c2-a40949720889@github.com> Message-ID: On Fri, 11 Mar 2022 12:18:36 GMT, Florian Weimer wrote: > > According to https://forums.swift.org/t/concurrencys-use-of-thread-local-variables/48654: "these accesses are just a move from a system register plus a load/store at a constant offset." > > Ideally you'd still benchmark that. Some AArch64 implementations have really, really slow moves from the system register used as the thread pointer. Hopefully Apple's implementation isn't in that category. In a tight loop, loads from __thread variables take 1ns. It's this: 0x18ea1c530 <+0>: ldr x16, [x0, #0x8] 0x18ea1c534 <+4>: mrs x17, TPIDRRO_EL0 0x18ea1c538 <+8>: and x17, x17, #0xfffffffffffffff8 0x18ea1c53c <+12>: ldr x17, [x17, x16, lsl #3] 0x18ea1c540 <+16>: cbz x17, 0x18ea1c550 ; only executed first time 0x18ea1c544 <+20>: ldr x16, [x0, #0x10] 0x18ea1c548 <+24>: add x0, x17, x16 0x18ea1c54c <+28>: ret ... which looks the same as what glibc does. Not bad, but quite a lot more to do than a simple load. I'd still use a static SafeFetch, with no W^X fiddling. It just seems to me much more reasonable. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From aph at openjdk.java.net Sat Mar 12 12:35:43 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Sat, 12 Mar 2022 12:35:43 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Sat, 12 Mar 2022 07:45:57 GMT, Florian Weimer wrote: > into the assembler sources and use > > ```c++ > extern char address_of_label[] __attribute__ ((visibility ("hidden"))); ITYM extern "C" char address_of_label[] __attribute__ ((visibility ("hidden"))); ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From fweimer at openjdk.java.net Sat Mar 12 14:24:45 2022 From: fweimer at openjdk.java.net (Florian Weimer) Date: Sat, 12 Mar 2022 14:24:45 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: <9isICa3xUpK6uY93LAI7asBXeXJSYeqimjnlITqFLPg=.27482165-9f82-4781-93f7-1940ff53c9a5@github.com> On Sat, 12 Mar 2022 12:32:38 GMT, Andrew Haley wrote: > > into the assembler sources and use > > ```c++ > > extern char address_of_label[] __attribute__ ((visibility ("hidden"))); > > ``` > > ITYM > > ``` > extern "C" char address_of_label[] __attribute__ ((visibility ("hidden"))); It doesn't hurt, but the Itanium ABI does not mangle such global data symbols, so it's not strictly needed. 
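The two ways of getting at an instruction address inside an assembly routine that are discussed in this thread can be tried out in a single file. The following is only a minimal sketch, not the SafeFetch implementation: it assumes GCC or Clang on an ELF x86-64 or AArch64 target, embeds the assembly with a file-scope `asm` statement instead of a separate `.S` file, and uses hypothetical `demo_*` names throughout. It installs no signal handler; it only shows that both approaches yield the same address.

```cpp
#include <cassert>
#include <cstdint>

#if !defined(__ELF__) || !(defined(__x86_64__) || defined(__aarch64__))
#error "This sketch assumes GCC or Clang on an ELF x86-64 or AArch64 target"
#endif

#if defined(__x86_64__)
#define DEMO_LOAD "    movl (%rdi), %eax\n"   // result = *addr
#else
#define DEMO_LOAD "    ldr w0, [x0]\n"        // result = *addr
#endif

// Hypothetical routine: int demo_fetch(const int* addr) { return *addr; }
__asm__(
    ".pushsection .text\n"
    ".p2align 4\n"
    ".globl demo_fetch\n"
    ".hidden demo_fetch\n"
    ".type demo_fetch, %function\n"
    "demo_fetch:\n"
    "    nop\n"                          // padding, so the label is not at offset 0
    // Approach 1: a global symbol in the middle of the function.
    ".globl demo_fetch_load_pc\n"
    ".hidden demo_fetch_load_pc\n"
    "demo_fetch_load_pc:\n"
    // Approach 2: a local label; only its offset is exported, as data.
    ".Ldemo_load:\n"
    DEMO_LOAD
    "    ret\n"
    ".size demo_fetch, .-demo_fetch\n"
    ".popsection\n"
    ".pushsection .rodata\n"
    ".p2align 3\n"
    ".globl demo_fetch_load_offset\n"
    ".hidden demo_fetch_load_offset\n"
    ".type demo_fetch_load_offset, %object\n"
    ".size demo_fetch_load_offset, 8\n"
    "demo_fetch_load_offset:\n"
    "    .quad .Ldemo_load - demo_fetch\n"
    ".popsection\n");

// C++ view of the symbols defined above. extern "C" is used for all of them,
// although it only affects the mangling of the function name.
extern "C" {
int demo_fetch(const int* addr) __attribute__((visibility("hidden")));
extern char demo_fetch_load_pc[] __attribute__((visibility("hidden")));
extern const std::uint64_t demo_fetch_load_offset
    __attribute__((visibility("hidden")));
}

// Approach 1: the mid-function symbol is itself the address.
std::uintptr_t load_pc_via_symbol() {
  return reinterpret_cast<std::uintptr_t>(demo_fetch_load_pc);
}

// Approach 2: function start plus the exported offset.
std::uintptr_t load_pc_via_offset() {
  return reinterpret_cast<std::uintptr_t>(&demo_fetch) + demo_fetch_load_offset;
}
```

A signal handler could compare the faulting pc against either value. The second form avoids placing an extra symbol inside the function, which is the tool-friendliness argument made above.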
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From dnsimon at openjdk.java.net Sat Mar 12 14:43:11 2022 From: dnsimon at openjdk.java.net (Doug Simon) Date: Sat, 12 Mar 2022 14:43:11 GMT Subject: RFR: 8283056: show abstract machine code for all VM crashes Message-ID: [JDK-8272586](https://bugs.openjdk.java.net/browse/JDK-8272586) added abstract assembly to hs-err for methods on the stack of the crashing thread. However, it only does this if the crash is due to an unhandled signal. It can also be useful to see assembly for crashes due to failing VM assertions or guarantees. This PR implements this improvement. ------------- Commit messages: - show abstract machine code for all VM crashes Changes: https://git.openjdk.java.net/jdk/pull/7791/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7791&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283056 Stats: 4 lines in 1 file changed: 2 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7791.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7791/head:pull/7791 PR: https://git.openjdk.java.net/jdk/pull/7791 From dnsimon at openjdk.java.net Sat Mar 12 16:30:41 2022 From: dnsimon at openjdk.java.net (Doug Simon) Date: Sat, 12 Mar 2022 16:30:41 GMT Subject: RFR: 8283056: show abstract machine code for all VM crashes In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 20:48:06 GMT, Doug Simon wrote: > [JDK-8272586](https://bugs.openjdk.java.net/browse/JDK-8272586) added abstract assembly to hs-err for methods on the stack of the crashing thread. However, it only does this if the crash is due to an unhandled signal. It can also be useful to see assembly for crashes due to failing VM assertions or guarantees. This PR implements this improvement. I've tested this manually by hacking a guarantee failure into a code path I know is called from Graal compiled code. Suggestions on how to write an automatic test that can be part of this PR are welcome. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7791 From aph at openjdk.java.net Sat Mar 12 17:42:43 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Sat, 12 Mar 2022 17:42:43 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: <9isICa3xUpK6uY93LAI7asBXeXJSYeqimjnlITqFLPg=.27482165-9f82-4781-93f7-1940ff53c9a5@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> <9isICa3xUpK6uY93LAI7asBXeXJSYeqimjnlITqFLPg=.27482165-9f82-4781-93f7-1940ff53c9a5@github.com> Message-ID: On Sat, 12 Mar 2022 14:21:13 GMT, Florian Weimer wrote: > > ``` > > extern "C" char address_of_label[] __attribute__ ((visibility ("hidden"))); > > ``` > > It doesn't hurt, but the Itanium ABI does not mangle such global data symbols, so it's not strictly needed. That's an interesting point of view. I guess I never thought about it, but I'd always put symbols for an asm file in an `extern "C"` section anyway. But yeah, OK. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From aph at openjdk.java.net Sat Mar 12 17:48:46 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Sat, 12 Mar 2022 17:48:46 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <6IzXDSNVNqbNgsVAjHd_u_j61q4T5XzxbNBXp0ECjW0=.cc4cd576-90aa-4567-92c2-a40949720889@github.com> Message-ID: On Sat, 12 Mar 2022 12:30:39 GMT, Andrew Haley wrote: > 1ns Incidentally, there must be a lot of speculation and bypassing going on there. I can see 15 cycles of latency, probably 20, so that'd be more like 5ns start to finish. M1 is a remarkable thing. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From jbhateja at openjdk.java.net Sat Mar 12 19:58:37 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Sat, 12 Mar 2022 19:58:37 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v15] In-Reply-To: References: Message-ID: > Summary of changes: > - Intrinsify Math.round(float) and Math.round(double) APIs. > - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. > - Test creation using new IR testing framework. > > Following are the performance number of a JMH micro included with the patch > > Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) > > > Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio > -- | -- | -- | -- | -- | -- | -- | -- > FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 > FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 > FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8279508: Creating separate test for round double under feature check. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7094/files - new: https://git.openjdk.java.net/jdk/pull/7094/files/2519a58c..e4d4e29b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=14 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=13-14 Stats: 239 lines in 3 files changed: 143 ins; 96 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7094.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094 PR: https://git.openjdk.java.net/jdk/pull/7094 From dlong at openjdk.java.net Sat Mar 12 23:35:02 2022 From: dlong at openjdk.java.net (Dean Long) Date: Sat, 12 Mar 2022 23:35:02 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" Message-ID: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. 
------------- Commit messages: - test large values of CodeEntryAlignment - Add extra stub space for large CodeEntryAlignment Changes: https://git.openjdk.java.net/jdk/pull/7800/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7800&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282355 Stats: 10 lines in 2 files changed: 7 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/7800.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7800/head:pull/7800 PR: https://git.openjdk.java.net/jdk/pull/7800 From jiefu at openjdk.java.net Sat Mar 12 23:44:43 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Sat, 12 Mar 2022 23:44:43 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" In-Reply-To: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: On Sat, 12 Mar 2022 23:28:43 GMT, Dean Long wrote: > This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java line 69: > 67: > 68: public static void driver() throws IOException { > 69: for (int align = 32; align <= 1024; align *= 2) { This wouldn't test for `-XX:CodeEntryAlignment=16`. However, we used to find a bug with `-XX:CodeEntryAlignment=16`. https://github.com/openjdk/jdk/pull/7485 Why not testing with `-XX:CodeEntryAlignment=16`? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From duke at openjdk.java.net Sun Mar 13 00:10:46 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Sun, 13 Mar 2022 00:10:46 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v15] In-Reply-To: References: Message-ID: On Sat, 12 Mar 2022 19:58:37 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. >> - Test creation using new IR testing framework. >> >> Following are the performance number of a JMH micro included with the patch >> >> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) >> >> >> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio >> -- | -- | -- | -- | -- | -- | -- | -- >> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 >> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 >> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 >> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 >> >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > 8279508: Creating separate test for round double under feature check. src/hotspot/cpu/x86/assembler_x86.hpp line 1159: > 1157: void cvttsd2siq(Register dst, Address src); > 1158: void cvttsd2siq(Register dst, XMMRegister src); > 1159: void cvtsd2siq(Register dst, XMMRegister src); Hi, just some small suggestions: the instructions are sorted alphabetically, so the `cvtsd2si` should come before `cvttsd2si`; the same applies to the instructions below.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4024: > 4022: * the result is equal to the value of Integer.MAX_VALUE. > 4023: */ > 4024: void C2_MacroAssembler::vector_cast_float_special_cases_avx(XMMRegister dst, XMMRegister src, XMMRegister xtmp1, This special handling is really large, could we use a stub routine for it? src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4161: > 4159: movl(scratch, 1056964608); > 4160: movq(xtmp1, scratch); > 4161: vbroadcastss(xtmp1, xtmp1, vec_enc); An `evpbroadcastd` would reduce this by one instruction I guess? src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4178: > 4176: movl(scratch, 1056964608); > 4177: movq(xtmp1, scratch); > 4178: vbroadcastss(xtmp1, xtmp1, vec_enc); You could put the constant in the constant table and use `vbroadcastss` here also. Thank you very much. src/hotspot/cpu/x86/x86.ad line 7297: > 7295: ins_encode %{ > 7296: int vlen_enc = vector_length_encoding(this); > 7297: InternalAddress new_mxcsr = $constantaddress(0x3F80L); `ldmxcsr` takes a `m32` argument so this constant can be an `int` instead. Also, I would suggest putting the `mxcst_std` in the constant table also. src/hotspot/cpu/x86/x86_64.ad line 10699: > 10697: match(Set dst (ConvD2L src)); > 10698: effect(KILL cr); > 10699: format %{ "round_or_convert_d2l $dst,$src"%} You could revert the changes for `ConvD2L` and `ConvF2I` here ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From duke at openjdk.java.net Sun Mar 13 00:10:46 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Sun, 13 Mar 2022 00:10:46 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v15] In-Reply-To: References: Message-ID: On Sat, 12 Mar 2022 23:22:16 GMT, Quan Anh Mai wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> 8279508: Creating separate test for round double under feature check. 
> > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4161: > >> 4159: movl(scratch, 1056964608); >> 4160: movq(xtmp1, scratch); >> 4161: vbroadcastss(xtmp1, xtmp1, vec_enc); > > An `evpbroadcastd` would reduce this by one instruction I guess? Anyway an `evpbroadcastd xmm, r` has around 5 latency on the gpr so I think you could just put the constant in the constant table and use `vbroadcastsd` ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From dlong at openjdk.java.net Sun Mar 13 03:19:45 2022 From: dlong at openjdk.java.net (Dean Long) Date: Sun, 13 Mar 2022 03:19:45 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" In-Reply-To: References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: On Sat, 12 Mar 2022 23:41:42 GMT, Jie Fu wrote: >> This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. > > test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java line 69: > >> 67: >> 68: public static void driver() throws IOException { >> 69: for (int align = 32; align <= 1024; align *= 2) { > > This wouldn't test for `-XX:CodeEntryAlignment=16`. > However, we used to find a bug with `-XX:CodeEntryAlignment=16`. > https://github.com/openjdk/jdk/pull/7485 > > Why not testing with `-XX:CodeEntryAlignment=16`? The only reason is because -XX:CodeCacheSegmentSize=16 gives an error. If it's important to test for -XX:CodeEntryAlignment=16 then I'll have to rework the logic to be more clever. I'll probably have to use WhiteBox APIs to read the default CodeCacheSegmentSize and only set it to values that are >= the default. It's unfortunate that -XX:CodeEntryAlignment doesn't adjust the default value of CodeCacheSegmentSize automatically. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From jbhateja at openjdk.java.net Sun Mar 13 04:33:46 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Sun, 13 Mar 2022 04:33:46 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v15] In-Reply-To: References: Message-ID: On Sat, 12 Mar 2022 23:20:58 GMT, Quan Anh Mai wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> 8279508: Creating separate test for round double under feature check. > > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4024: > >> 4022: * the result is equal to the value of Integer.MAX_VALUE. >> 4023: */ >> 4024: void C2_MacroAssembler::vector_cast_float_special_cases_avx(XMMRegister dst, XMMRegister src, XMMRegister xtmp1, > > This special handling is really large, could we use a stub routine for it? Good suggestion, but as of now we are not using vector calling conventions for stubs. > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4178: > >> 4176: movl(scratch, 1056964608); >> 4177: movq(xtmp1, scratch); >> 4178: vbroadcastss(xtmp1, xtmp1, vec_enc); > > You could put the constant in the constant table and use `vbroadcastss` here also. > > Thank you very much. constant and register to register moves are never issued to execution ports, rematerializing value rather than reading from memory will give better performance. > src/hotspot/cpu/x86/x86.ad line 7297: > >> 7295: ins_encode %{ >> 7296: int vlen_enc = vector_length_encoding(this); >> 7297: InternalAddress new_mxcsr = $constantaddress(0x3F80L); > > `ldmxcsr` takes a `m32` argument so this constant can be an `int` instead. Also, I would suggest putting the `mxcst_std` in the constant table also. Correct, if we do so constant emitted will occupy 4 bytes. FTR we can also improve on the alignment padding for constants such that start address of next emitted constant aligned based on the constant size. 
This will be beneficial for large sized vector constants (32/64 byte) as we can save cache line split penalty during vector load. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From jbhateja at openjdk.java.net Sun Mar 13 04:33:47 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Sun, 13 Mar 2022 04:33:47 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v15] In-Reply-To: References: Message-ID: On Sun, 13 Mar 2022 00:06:07 GMT, Quan Anh Mai wrote: >> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4161: >> >>> 4159: movl(scratch, 1056964608); >>> 4160: movq(xtmp1, scratch); >>> 4161: vbroadcastss(xtmp1, xtmp1, vec_enc); >> >> An `evpbroadcastd` would reduce this by one instruction I guess? > > Anyway an `evpbroadcastd xmm, r` has around 5 latency on the gpr so I think you could just put the constant in the constant table and use `vbroadcastsd` It was done to save redundant floating point to integer domain switch over penalties. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From jbhateja at openjdk.java.net Sun Mar 13 04:46:21 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Sun, 13 Mar 2022 04:46:21 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v16] In-Reply-To: References: Message-ID: > Summary of changes: > - Intrinsify Math.round(float) and Math.round(double) APIs. > - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. > - Test creation using new IR testing framework. 
> > Following are the performance number of a JMH micro included with the patch > > Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) > > > Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio > -- | -- | -- | -- | -- | -- | -- | -- > FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 > FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 > FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8279508: Styling comments resolved. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7094/files - new: https://git.openjdk.java.net/jdk/pull/7094/files/e4d4e29b..c881d11c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=15 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=14-15 Stats: 11 lines in 3 files changed: 3 ins; 3 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/7094.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094 PR: https://git.openjdk.java.net/jdk/pull/7094 From jbhateja at openjdk.java.net Sun Mar 13 06:36:15 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Sun, 13 Mar 2022 06:36:15 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v17] In-Reply-To: References: Message-ID: > Summary of changes: > - Intrinsify Math.round(float) and Math.round(double) APIs. > - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. > - Test creation using new IR testing framework. 
> > Following are the performance number of a JMH micro included with the patch > > Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) > > > Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio > -- | -- | -- | -- | -- | -- | -- | -- > FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 > FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 > FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8279508: Windows build failure fix. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7094/files - new: https://git.openjdk.java.net/jdk/pull/7094/files/c881d11c..b1323a82 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=16 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=15-16 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/7094.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094 PR: https://git.openjdk.java.net/jdk/pull/7094 From jiefu at openjdk.java.net Sun Mar 13 07:15:42 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Sun, 13 Mar 2022 07:15:42 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" In-Reply-To: References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: <2cg-lvZcZd3ZEYaXtVuCA2bF2w6WXIdIDYGnPaZwEyA=.529b568b-e8c8-4271-ae4f-f761d16421eb@github.com> On Sun, 13 
Mar 2022 03:16:14 GMT, Dean Long wrote: >> test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java line 69: >> >>> 67: >>> 68: public static void driver() throws IOException { >>> 69: for (int align = 32; align <= 1024; align *= 2) { >> >> This wouldn't test for `-XX:CodeEntryAlignment=16`. >> However, we used to find a bug with `-XX:CodeEntryAlignment=16`. >> https://github.com/openjdk/jdk/pull/7485 >> >> Why not testing with `-XX:CodeEntryAlignment=16`? > > The only reason is because -XX:CodeCacheSegmentSize=16 gives an error. If it's important to test for -XX:CodeEntryAlignment=16 then I'll have to rework the logic to be more clever. I'll probably have to use WhiteBox APIs to read the default CodeCacheSegmentSize and only set it to values that are >= the default. It's unfortunate that -XX:CodeEntryAlignment doesn't adjust the default value of CodeCacheSegmentSize automatically. Maybe we can add testing for `-XX:CodeEntryAlignment={512, 1024}` like this. diff --git a/test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java b/test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java index fd6c8ca..f07030e 100644 --- a/test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java +++ b/test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java @@ -72,6 +72,14 @@ public class TestCodeEntryAlignment { "-XX:CodeEntryAlignment=" + align ); } + + for (int align = 512; align < 1024; align *= 2) { + shouldPass( + "-XX:+UnlockExperimentalVMOptions", + "-XX:CodeCacheSegmentSize=" + align, + "-XX:CodeEntryAlignment=" + align + ); + } } } What do you think? 
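Which alignments a doubling loop actually visits depends on whether its bound is inclusive. As a quick check, here is a small sketch that mirrors the Java loop arithmetic in C++; the helper name `doubling_range` is hypothetical and not part of the test.

```cpp
#include <cassert>
#include <vector>

// Collects the values visited by
//   for (int align = first; align <= last; align *= 2)   (inclusive == true)
//   for (int align = first; align <  last; align *= 2)   (inclusive == false)
std::vector<int> doubling_range(int first, int last, bool inclusive) {
  std::vector<int> values;
  for (int align = first; inclusive ? align <= last : align < last; align *= 2) {
    values.push_back(align);
  }
  return values;
}
```

With these bounds, `doubling_range(32, 1024, true)` covers 32 through 1024, while `doubling_range(512, 1024, false)` contains only 512. A loop written with `align < 1024` therefore never runs with 1024; covering `{512, 1024}` would need `align <= 1024`.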
------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From duke at openjdk.java.net Sun Mar 13 16:59:43 2022 From: duke at openjdk.java.net (hakib1) Date: Sun, 13 Mar 2022 16:59:43 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" In-Reply-To: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: <-H86iYx9-94GUkEwMoVkznxVz1MzMUux_7EE5m9S4uE=.b57eb99d-0762-4f4b-9c84-77319e4ba06b@github.com> On Sat, 12 Mar 2022 23:28:43 GMT, Dean Long wrote: > This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. Marked as reviewed by hakib1 at github.com (no known OpenJDK username). ------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From kbarrett at openjdk.java.net Sun Mar 13 18:40:42 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 13 Mar 2022 18:40:42 GMT Subject: RFR: 8282668: HotSpot Style Guide should permit unrestricted unions In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 18:39:33 GMT, Kim Barrett wrote: > Please review this change to permit the use of "unrestricted unions" > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf) in HotSpot > code. > > This permits any non-reference type to be used as a union data member, as well > as permitting static data members in named unions. There are various classes > in HotSpot that might be able to take advantage of this new feature. > > An example is the aarch64-specific Address class. It presently contains a > collection of data members. For any given instance, only some of these data > members are initialized and used. The `_mode` member indicates which. 
So it's
> effectively a kind of discriminated union with the data unpacked and not
> overlapping, with `_mode` being the discriminant. A consequence of the current
> implementation is that some compilers may generate warnings under some
> circumstances because of uninitialized data members. (I ran into this problem
> with gcc when making an otherwise unrelated change to one of the member
> types.) This Address class could be made smaller (so cheaper to copy, which
> happens often as Address objects are frequently passed by value) and usage
> made clearer, by making it an actual union. But that isn't possible with the
> C++03 restrictions.
> 
> Another example is the RelocationHolder class, which is effectively a union
> over the various concrete Relocation types, but implemented in a way that
> has some issues (JDK-8160404).
> 
> Testing:
> I've tried some examples without running into any problems. This included
> some experiments with RelocationHolder for JDK-8160404.
> 
> This is a modification of the Style Guide, so rough consensus among the
> HotSpot Group members is required to make this change. Only Group members
> should vote for approval (via the github PR), though reasoned objections or
> comments from anyone will be considered. A decision on this proposal will not
> be made before Friday 18-Mar-2022 at 12h00 UTC.
> 
> Since we're piggybacking on github PRs here, please use the PR review process
> to approve (click on Review Changes > Approve), rather than sending a "vote:
> yes" email reply that would be normal for a CFV.

Response to this proposal has been minimal so far.

I probably should have mentioned that using unrestricted unions involves more than just wrapping several members of non-trivial type in a `union` form. There are important effects on some of the special member functions for the union/class containing the variant members. See the proposal or the Standard for details.
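
To make the point about special member functions concrete, here is a minimal sketch of a discriminated union built on a C++11 unrestricted union. It is not HotSpot code: `Tracked` and `Value` are hypothetical names chosen for illustration. Because `Tracked` has a non-trivial copy constructor and destructor, the enclosing class has to supply its own copy constructor and destructor, and each constructor has to name the variant member it wants initialized.

```cpp
#include <new>

// Hypothetical member type with non-trivial constructors and destructor.
struct Tracked {
    static int live;  // number of Tracked objects currently alive
    int id;
    explicit Tracked(int i) : id(i) { ++live; }
    Tracked(const Tracked& other) : id(other.id) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Hypothetical discriminated union: _kind says which variant member is active.
class Value {
public:
    enum class Kind { Int, Obj };

    // Each constructor initializes exactly one variant member explicitly;
    // a variant member not named in the mem-initializer-list stays uninitialized.
    explicit Value(int i) : _kind(Kind::Int), _int(i) {}
    explicit Value(const Tracked& t) : _kind(Kind::Obj), _obj(t) {}

    // User-provided copy constructor: the implicit one would be deleted
    // because _obj has a non-trivial copy constructor.
    Value(const Value& other) : _kind(other._kind) {
        if (_kind == Kind::Obj) {
            new (&_obj) Tracked(other._obj);  // start the lifetime of _obj
        } else {
            _int = other._int;
        }
    }

    // User-provided destructor: the implicit one would be deleted as well.
    ~Value() {
        if (_kind == Kind::Obj) {
            _obj.~Tracked();
        }
    }

    // Assignment is left out of this sketch.
    Value& operator=(const Value&) = delete;

    Kind kind() const { return _kind; }
    int as_int() const { return _int; }
    int obj_id() const { return _obj.id; }

private:
    Kind _kind;
    union {
        int _int;
        Tracked _obj;  // non-trivial member, only permitted since C++11
    };
};
```

The `Tracked::live` counter is only there so that one can check that construction and destruction of the active member stay balanced.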
The short form is that if a variant-member has a non-trivial ctor/dtor/assign then the corresponding implicit definition of the enclosing class/union is deleted; there must be a user-provided definition to use that special function. Also, if a variant member is not mentioned in a constructor's mem-initializer-list then it is not initialized, rather than implicitly default initialized. The idea is that the compiler can't know how to write the correct "default" version of the function, so it must be provided explicitly. ------------- PR: https://git.openjdk.java.net/jdk/pull/7704 From dlong at openjdk.java.net Sun Mar 13 20:28:18 2022 From: dlong at openjdk.java.net (Dean Long) Date: Sun, 13 Mar 2022 20:28:18 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v2] In-Reply-To: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: > This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. 
Dean Long has updated the pull request incrementally with one additional commit since the last revision: improvement based on review ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7800/files - new: https://git.openjdk.java.net/jdk/pull/7800/files/65fa6fda..e72382ef Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7800&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7800&range=00-01 Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7800.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7800/head:pull/7800 PR: https://git.openjdk.java.net/jdk/pull/7800 From dlong at openjdk.java.net Sun Mar 13 20:28:20 2022 From: dlong at openjdk.java.net (Dean Long) Date: Sun, 13 Mar 2022 20:28:20 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v2] In-Reply-To: <2cg-lvZcZd3ZEYaXtVuCA2bF2w6WXIdIDYGnPaZwEyA=.529b568b-e8c8-4271-ae4f-f761d16421eb@github.com> References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> <2cg-lvZcZd3ZEYaXtVuCA2bF2w6WXIdIDYGnPaZwEyA=.529b568b-e8c8-4271-ae4f-f761d16421eb@github.com> Message-ID: On Sun, 13 Mar 2022 07:12:57 GMT, Jie Fu wrote: >> The only reason is because -XX:CodeCacheSegmentSize=16 gives an error. If it's important to test for -XX:CodeEntryAlignment=16 then I'll have to rework the logic to be more clever. I'll probably have to use WhiteBox APIs to read the default CodeCacheSegmentSize and only set it to values that are >= the default. It's unfortunate that -XX:CodeEntryAlignment doesn't adjust the default value of CodeCacheSegmentSize automatically. > > Maybe we can add testing for `-XX:CodeEntryAlignment={512, 1024}` like this. 
> > diff --git a/test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java b/test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java > index fd6c8ca..f07030e 100644 > --- a/test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java > +++ b/test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java > @@ -72,6 +72,14 @@ public class TestCodeEntryAlignment { > "-XX:CodeEntryAlignment=" + align > ); > } > + > + for (int align = 512; align < 1024; align *= 2) { > + shouldPass( > + "-XX:+UnlockExperimentalVMOptions", > + "-XX:CodeCacheSegmentSize=" + align, > + "-XX:CodeEntryAlignment=" + align > + ); > + } > } > > } > > What do you think? That sounds good. Done. ------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From jiefu at openjdk.java.net Mon Mar 14 00:18:36 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 14 Mar 2022 00:18:36 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v2] In-Reply-To: References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: On Sun, 13 Mar 2022 20:28:18 GMT, Dean Long wrote: >> This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > improvement based on review src/hotspot/share/runtime/stubRoutines.cpp line 219: > 217: TraceTime timer("StubRoutines generation 1", TRACETIME_LOG(Info, startuptime)); > 218: // Add extra space for large CodeEntryAlignment > 219: int max_stubs = 10; Thanks for your update. There are about 30 stubs (max number of stubs) to be generated in `generate_initial()` for x86_64. Why not `max_stubs = 30` ? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From thartmann at openjdk.java.net Mon Mar 14 06:06:38 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Mon, 14 Mar 2022 06:06:38 GMT Subject: RFR: 8283056: show abstract machine code for all VM crashes In-Reply-To: References: Message-ID: On Sat, 12 Mar 2022 16:27:26 GMT, Doug Simon wrote: > Suggestions on how to write an automatic test that can be part of this PR are welcome. What about using `-XX:CICrashAt` and other flags that trigger asserts? You can have a look at `TestDwarf.java` in Christian's PR https://github.com/openjdk/jdk/pull/7126. ------------- PR: https://git.openjdk.java.net/jdk/pull/7791 From pli at openjdk.java.net Mon Mar 14 06:13:30 2022 From: pli at openjdk.java.net (Pengfei Li) Date: Mon, 14 Mar 2022 06:13:30 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v6] In-Reply-To: References: Message-ID: <4f4q-PLj6psH50mRQCVLAX8bMjzF4XWzAornt_t4PNE=.4331b19d-469b-4e0d-8a1f-d1eeb5aaf9ed@github.com> > ### Background > > Post loop vectorization is a C2 compiler optimization in an experimental > VM feature called PostLoopMultiversioning. It transforms the range-check > eliminated post loop to a 1-iteration vectorized loop with vector mask. > This optimization was contributed by Intel in 2016 to support x86 AVX512 > masked vector instructions. However, it was disabled soon after an issue > was found. Due to insufficient maintenance in these years, multiple bugs > have been accumulated inside. But we (Arm) still think this is a useful > framework for vector mask support in C2 auto-vectorized loops, for both > x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable > post loop vectorization. > > ### Changes in this patch > > This patch reworks post loop vectorization. The most significant change > is removing vector mask support in C2 x86 backend and re-implementing > it in the mid-end. 
With this, we can re-enable post loop vectorization > for platforms other than x86. > > Previous implementation hard-codes x86 k1 register as a reserved AVX512 > opmask register and defines two routines (setvectmask/restorevectmask) > to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes > AVX512 instructions as unmasked by default, generated vector masks are > no longer used in AVX512 vector instructions. To fix incorrect codegen > and add vector mask support for more platforms, we turn to add a vector > mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode > to generate a mask and replace all Load/Store nodes in the post loop > into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This > IR form is exactly the same to those which are used in VectorAPI mask > support. For now, we only add mask inputs for Load/Store nodes because > we don't have reduction operations supported in post loop vectorization. > After this change, the x86 k1 register is no longer reserved and can be > allocated when PostLoopMultiversioning is enabled. > > Besides this change, we have fixed a compiler crash and five incorrect > result issues with post loop vectorization. > > **I) C2 crashes with segmentation fault in strip-mined loops** > > Previous implementation was done before C2 loop strip-mining was merged > into JDK master so it didn't take strip-mined loops into consideration. > In C2's strip mined loops, post loop is not the sibling of the main loop > in ideal loop tree. Instead, it's the sibling of the main loop's parent. > This patch fixed a SIGSEGV issue caused by NULL pointer when locating > post loop from strip-mined main loop. > > **II) Incorrect result issues with post loop vectorization** > > We have also fixed five incorrect vectorization issues. Some of them are > hidden deep and can only be reproduced with corner cases. 
These issues > have a common cause that it assumes the post loop can be vectorized if > the vectorization in corresponding main loop is successful. But in many > cases this assumption is wrong. Below are details. > > - **[Issue-1] Incorrect vectorization for partial vectorizable loops** > > This issue can be reproduced by below loop where only some operations in > the loop body are vectorizable. > > for (int i = 0; i < 10000; i++) { > res[i] = a[i] * b[i]; > k = 3 * k + 1; > } > > In the main loop, superword can work well if parts of the operations in > loop body are not vectorizable since those parts can be unrolled only. > But for post loops, we don't create vectors through combining scalar IRs > generated from loop unrolling. Instead, we are doing scalars to vectors > replacement for all operations in the loop body. Hence, all operations > should be either vectorized together or not vectorized at all. To fix > this kind of cases, we add an extra field "_slp_vector_pack_count" in > CountedLoopNode to record the eventual count of vector packs in the main > loop. This value is then passed to post loop and compared with post loop > pack count. Vectorization will be bailed out in post loop if it creates > more vector packs than in the main loop. > > - **[Issue-2] Incorrect result in loops with growing-down vectors** > > This issue appears with growing-down vectors, that is, vectors that grow > to smaller memory address as the loop iterates. It can be reproduced by > below counting-up loop with negative scale value in array index. > > for (int i = 0; i < 10000; i++) { > a[MAX - i] = b[MAX - i]; > } > > Cause of this issue is that for a growing-down vector, generated vector > mask value has reversed vector-lane order so it masks incorrect vector > lanes. Note that if negative scale value appears in counting-down loops, > the vector will be growing up. 
With this rule, we fix the issue by only > allowing positive array index scales in counting-up loops and negative > array index scales in counting-down loops. This check is done with the > help of SWPointer by comparing scale values in each memory access in the > loop with loop stride value. > > - **[Issue-3] Incorrect result in manually unrolled loops** > > This issue can be reproduced by below manually unrolled loop. > > for (int i = 0; i < 10000; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] * b[i + 1]; > } > > In this loop, operations in the 2nd statement duplicate those in the 1st > statement with a small memory address offset. Vectorization in the main > loop works well in this case because C2 does further unrolling and pack > combination. But we cannot vectorize the post loop through replacement > from scalars to vectors because it creates duplicated vector operations. > To fix this, we restrict post loop vectorization to loops with stride > values of 1 or -1. > > - **[Issue-4] Incorrect result in loops with mixed vector element sizes** > > This issue is found after we enable post loop vectorization for AArch64. > It's reproducible by multiple array operations with different element > sizes inside a loop. On x86, there is no issue because the values of x86 > AVX512 opmasks only depend on which vector lanes are active. But AArch64 > is different - the values of SVE predicates also depend on lane size of > the vector. Hence, on AArch64 SVE, if a loop has mixed vector element > sizes, we should use different vector masks. For now, we just support > loops with only one vector element size, i.e., "int + float" vectors in > a single loop is ok but "int + double" vectors in a single loop is not > vectorizable. This fix also enables subword vectors support to make all > primitive type array operations vectorizable. 
> > - **[Issue-5] Incorrect result in loops with potential data dependence** > > This issue can be reproduced by below corner case on AArch64 only. > > for (int i = 0; i < 10000; i++) { > a[i] = x; > a[i + OFFSET] = y; > } > > In this case, two stores in the loop have data dependence if the OFFSET > value is smaller than the vector length. So we cannot do vectorization > through replacing scalars to vectors. But the main loop vectorization > in this case is successful on AArch64 because AArch64 has partial vector > load/store support. It splits vector fill with different values in lanes > to several smaller-sized fills. In this patch, we add additional data > dependence check for this kind of cases. The check is also done with the > help of SWPointer class. In this check, we require that every two memory > accesses (with at least one store) of the same element type (or subword > size) in the loop has the same array index expression. > > ### Tests > > So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with > experimental VM option "PostLoopMultiversioning" turned on. We found no > issue in all tests. We notice that those existing cases are not enough > because some of above issues are not spotted by them. We would like to > add some new cases but we found existing vectorization tests are a bit > cumbersome - golden results must be pre-calculated and hard-coded in the > test code for correctness verification. Thus, in this patch, we propose > a new vectorization testing framework. > > Our new framework brings a simpler way to add new cases. For a new test > case, we only need to create a new method annotated with "@Test". The > test runner will invoke each annotated method twice automatically. First > time it runs in the interpreter and second time it's forced compiled by > C2. Then the two return results are compared. So in this framework each > test method should return a primitive value or an array of primitives. 
> In this way, no extra verification code for vectorization correctness is
> required. This test runner is still jtreg-based and takes advantage of
> the jtreg WhiteBox API, which enables test methods to run at specific
> compilation levels. Each test class inside is also jtreg-based. It just
> needs to inherit from the test runner class and run with two additional
> options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI".
> 
> ### Summary & Future work
> 
> In this patch, we reworked post loop vectorization. We made it platform
> independent and fixed several issues inside. We also implemented a new
> vectorization testing framework with many test cases inside. Meanwhile,
> we did some code cleanups.
> 
> This patch only touches C2 code guarded with PostLoopMultiversioning,
> except a few data structure changes. So, there's no behavior change when
> experimental VM option PostLoopMultiversioning is off. Also, to reduce
> risks, we still propose to keep post loop vectorization experimental for
> now. But if it receives positive feedback, we would like to change it to
> non-experimental in the future.
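
For readers unfamiliar with the idea, the following is a conceptual sketch of a masked, single-iteration post loop, written as scalar C++ that emulates vector lanes. It is not C2 IR and not code from the patch; `kLanes`, `make_mask` and `multiply_with_masked_tail` are hypothetical names, and the lane count is an arbitrary assumption. The main loop processes full vectors, and the leftover elements are handled by one more "vector" iteration in which only the lanes enabled by the mask are loaded and stored.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical vector length in elements (an assumption for this sketch).
constexpr std::size_t kLanes = 8;

// Emulates a mask generator: lane i is active iff i < remaining.
std::array<bool, kLanes> make_mask(std::size_t remaining) {
    std::array<bool, kLanes> mask{};
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        mask[lane] = lane < remaining;
    }
    return mask;
}

// Computes res[i] = a[i] * b[i]; assumes a and b have the same size.
std::vector<int> multiply_with_masked_tail(const std::vector<int>& a,
                                           const std::vector<int>& b) {
    std::vector<int> res(a.size());
    std::size_t i = 0;

    // Main loop: full vectors only, every lane is active.
    for (; i + kLanes <= a.size(); i += kLanes) {
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            res[i + lane] = a[i + lane] * b[i + lane];
        }
    }

    // Post loop: a single masked iteration covers the remaining elements.
    const std::array<bool, kLanes> mask = make_mask(a.size() - i);
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        if (mask[lane]) {  // inactive lanes are neither loaded nor stored
            res[i + lane] = a[i + lane] * b[i + lane];
        }
    }
    return res;
}
```

With an input length that is not a multiple of `kLanes`, the result should match a plain scalar loop element by element, which is the same property the interpreter-versus-C2 comparison in the proposed test framework checks.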
Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: Update a few comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6828/files - new: https://git.openjdk.java.net/jdk/pull/6828/files/e150e056..4442e002 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6828&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6828&range=04-05 Stats: 4 lines in 1 file changed: 0 ins; 1 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/6828.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6828/head:pull/6828 PR: https://git.openjdk.java.net/jdk/pull/6828 From pli at openjdk.java.net Mon Mar 14 07:08:48 2022 From: pli at openjdk.java.net (Pengfei Li) Date: Mon, 14 Mar 2022 07:08:48 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v4] In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 12:06:45 GMT, Roland Westrelin wrote: > Maybe it's worth mentioning that in a comment next to that code in case someone finds the test too strict and wonder if it can be removed. The comment there is updated. Thanks for suggestion! ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From pli at openjdk.java.net Mon Mar 14 07:12:51 2022 From: pli at openjdk.java.net (Pengfei Li) Date: Mon, 14 Mar 2022 07:12:51 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v4] In-Reply-To: References: Message-ID: On Thu, 10 Mar 2022 12:27:10 GMT, Roland Westrelin wrote: > Another review (ideally by someone familiar with the superword code) is required. Can I have another review from Oracle? 
Perhaps @chhagedorn or @TobiHartmann ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From stuefe at openjdk.java.net Mon Mar 14 08:06:46 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 14 Mar 2022 08:06:46 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Fri, 11 Mar 2022 07:52:16 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Remove two unnecessary lines Hi Florian, > If you have a PR, please Cc: me on it, I will have a look. Thanks a lot, Florian! I got it to work under Linux x64. My error was that I had declared the label in C++ as `extern void* SafeFetch_continuation`. Declaring it as `extern char _SafeFetch32_continuation[] __attribute__ ((visibility ("hidden")));` as you suggested does the trick. I'm not sure I understand the difference. >> extern "C" > It doesn't hurt, but the Itanium ABI does not mangle such global data symbols, so it's not strictly needed. I don't understand this remark, what does Itanium have to do with this? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From fweimer at openjdk.java.net Mon Mar 14 08:22:53 2022 From: fweimer at openjdk.java.net (Florian Weimer) Date: Mon, 14 Mar 2022 08:22:53 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Mon, 14 Mar 2022 08:03:39 GMT, Thomas Stuefe wrote: > Thanks a lot, Florian! I got it to work under Linux x64. Great! > My error was that I had declared the label in C++ as `extern void* SafeFetch_continuation`. Declaring it as `extern char _SafeFetch32_continuation[] __attribute__ ((visibility ("hidden")));` as you suggested does the trick. I'm not sure I understand the difference. Your approach might have worked as well, but you would have to use `&SafeFetch_continuation` on the C++ side. Arrays work directly because of pointer decay. The actual type does not matter because you just want to create a code address from that, so there's no corresponding object (in the C++ standard sense) at the address anyway. Anyway, from what I've seen, the array is more idiomatic. > > It doesn't hurt, but the Itanium ABI does not mangle such global data symbols, so it's not strictly needed. > > I don't understand this remark, what does Itanium have to do with this? The [C++ ABI definition](https://github.com/itanium-cxx-abi/cxx-abi) is probably Itanium's most lasting contribution to computing. I think it's used on most non-Windows systems these days, not just on Linux, and of course on all kinds of CPUs. 
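
The difference between the two declarations can be emulated in portable C++ without any assembly. This is only an illustration under stated assumptions: `label_bytes` is a made-up stand-in for the bytes found at an assembly label, and the three helper functions are hypothetical names. A symbol declared as an array yields its own address when used, while a symbol declared as a pointer variable yields whatever pointer-sized value is stored at that address, unless its address is taken explicitly.

```cpp
#include <cstring>

// Hypothetical stand-in for the bytes located at an assembly label.
alignas(void*) static const unsigned char label_bytes[16] = {
    0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90,
    0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90};

// Like using `label` after `extern char label[];`:
// the array name decays to the address of the label.
const void* use_as_array() {
    return label_bytes;
}

// Like using `label` after `extern void* label;`:
// the expression reads the pointer-sized value stored AT the label,
// i.e. the first bytes found there, not the label's address.
const void* use_as_pointer_variable() {
    const void* stored;
    std::memcpy(&stored, label_bytes, sizeof stored);
    return stored;
}

// Like using `&label` after `extern void* label;`:
// taking the address gives the label's address again.
const void* use_address_of_pointer_variable() {
    return &label_bytes;
}
```

Only the first and the third function return the address one would want to compare against a program counter.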
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From dlong at openjdk.java.net Mon Mar 14 08:34:40 2022 From: dlong at openjdk.java.net (Dean Long) Date: Mon, 14 Mar 2022 08:34:40 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v2] In-Reply-To: References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: <3RrzOdqAGvNLzvV0BlpC_Fbv_bquznXD99rgai_sgjk=.17228dee-226c-4684-be69-af927c6aa9ad@github.com> On Mon, 14 Mar 2022 00:15:20 GMT, Jie Fu wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> improvement based on review > > src/hotspot/share/runtime/stubRoutines.cpp line 219: > >> 217: TraceTime timer("StubRoutines generation 1", TRACETIME_LOG(Info, startuptime)); >> 218: // Add extra space for large CodeEntryAlignment >> 219: int max_stubs = 10; > > Thanks for your update. > > There are about 30 stubs (max number of stubs) to be generated in `generate_initial()` for x86_64. > Why not `max_stubs = 30` ? I instrumented the align() call in my private build to count how many used align(CodeEntryAlignment), and I counted only 7, then I rounded up to 10. Maybe I should change the name to max_aligned_stubs to make it more clear? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From jiefu at openjdk.java.net Mon Mar 14 09:01:49 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 14 Mar 2022 09:01:49 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v2] In-Reply-To: <3RrzOdqAGvNLzvV0BlpC_Fbv_bquznXD99rgai_sgjk=.17228dee-226c-4684-be69-af927c6aa9ad@github.com> References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> <3RrzOdqAGvNLzvV0BlpC_Fbv_bquznXD99rgai_sgjk=.17228dee-226c-4684-be69-af927c6aa9ad@github.com> Message-ID: On Mon, 14 Mar 2022 08:31:46 GMT, Dean Long wrote: >> src/hotspot/share/runtime/stubRoutines.cpp line 219: >> >>> 217: TraceTime timer("StubRoutines generation 1", TRACETIME_LOG(Info, startuptime)); >>> 218: // Add extra space for large CodeEntryAlignment >>> 219: int max_stubs = 10; >> >> Thanks for your update. >> >> There are about 30 stubs (max number of stubs) to be generated in `generate_initial()` for x86_64. >> Why not `max_stubs = 30` ? > > I instrumented the align() call in my private build to count how many used align(CodeEntryAlignment), and I counted only 7, then I rounded up to 10. Maybe I should change the name to max_aligned_stubs to make it more clear? `max_aligned_stubs` is fine to me. Please also update the copyright year. Thanks. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From stuefe at openjdk.java.net Mon Mar 14 09:03:51 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 14 Mar 2022 09:03:51 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: <4dPZaM0fgE5DAE58yLG9Zbicg-QWgN5Ikalkf4vFUE8=.6c45e3cb-c4f5-4e6e-abef-4c950e2cb41f@github.com> On Mon, 14 Mar 2022 08:19:41 GMT, Florian Weimer wrote: > > Thanks a lot, Florian! I got it to work under Linux x64. > > Great! > > > My error was that I had declared the label in C++ as `extern void* SafeFetch_continuation`. Declaring it as `extern char _SafeFetch32_continuation[] __attribute__ ((visibility ("hidden")));` as you suggested does the trick. I'm not sure I understand the difference. > > Your approach might have worked as well, but you would have to use `&SafeFetch_continuation` on the C++ side. Arrays work directly because of pointer decay. Ah, that makes sense. I wondered why the address did not look like a code pointer in C++. Anyway, got Linux x86_32 working too. Now I am working on aarch64. > > Anyway, from what I've seen, the array is more idiomatic. > > > > It doesn't hurt, but the Itanium ABI does not mangle such global data symbols, so it's not strictly needed. > > > > > > I don't understand this remark, what does Itanium have to do with this? > > The [C++ ABI definition](https://github.com/itanium-cxx-abi/cxx-abi) is probably Itanium's most lasting contribution to computing. I think it's used on most non-Windows systems these days, not just on Linux, and of course on all kinds of CPUs. Interesting to know. Thanks! 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From aph at openjdk.java.net Mon Mar 14 09:32:50 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Mon, 14 Mar 2022 09:32:50 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v15] In-Reply-To: References: Message-ID: On Sun, 13 Mar 2022 04:27:25 GMT, Jatin Bhateja wrote: >> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4024: >> >>> 4022: * the result is equal to the value of Integer.MAX_VALUE. >>> 4023: */ >>> 4024: void C2_MacroAssembler::vector_cast_float_special_cases_avx(XMMRegister dst, XMMRegister src, XMMRegister xtmp1, >> >> This special handling is really large, could we use a stub routine for it? > > Good suggestion, but as of now we are not using vector calling conventions for stubs. I don't understand this comment. If the stub is only to be used by you, then you can determine your own calling convention. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From tschatzl at openjdk.java.net Mon Mar 14 10:14:38 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 14 Mar 2022 10:14:38 GMT Subject: RFR: 8282668: HotSpot Style Guide should permit unrestricted unions In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 18:39:33 GMT, Kim Barrett wrote: > Please review this change to permit the use of "unrestricted unions" > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf) in HotSpot > code. > > This permits any non-reference type to be used as a union data member, as well > as permitting static data members in named unions. There are various classes > in HotSpot that might be able to take advantage of this new feature. > > An example is the aarch64-specific Address class. It presently contains a > collection of data members. For any given instance, only some of these data > members are initialized and used. The `_mode` member indicates which. 
So it's
> effectively a kind of discriminated union with the data unpacked and not
> overlapping, with `_mode` being the discriminant. A consequence of the current
> implementation is that some compilers may generate warnings under some
> circumstances because of uninitialized data members. (I ran into this problem
> with gcc when making an otherwise unrelated change to one of the member
> types.) This Address class could be made smaller (so cheaper to copy, which
> happens often as Address objects are frequently passed by value) and usage
> made clearer, by making it an actual union. But that isn't possible with the
> C++03 restrictions.
> 
> Another example is the RelocationHolder class, which is effectively a union
> over the various concrete Relocation types, but implemented in a way that
> has some issues (JDK-8160404).
> 
> Testing:
> I've tried some examples without running into any problems. This included
> some experiments with RelocationHolder for JDK-8160404.
> 
> This is a modification of the Style Guide, so rough consensus among the
> HotSpot Group members is required to make this change. Only Group members
> should vote for approval (via the github PR), though reasoned objections or
> comments from anyone will be considered. A decision on this proposal will not
> be made before Friday 18-Mar-2022 at 12h00 UTC.
> 
> Since we're piggybacking on github PRs here, please use the PR review process
> to approve (click on Review Changes > Approve), rather than sending a "vote:
> yes" email reply that would be normal for a CFV.

Marked as reviewed by tschatzl (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/7704 From duke at openjdk.java.net Mon Mar 14 10:20:46 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 14 Mar 2022 10:20:46 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Fri, 11 Mar 2022 07:52:16 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Remove two unnecessary lines We're looking into solutions and create a new PR if necessary. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Mon Mar 14 10:20:46 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 14 Mar 2022 10:20:46 GMT Subject: Withdrawn: 8282475: SafeFetch should not rely on existence of Thread::current In-Reply-To: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Mon, 7 Mar 2022 11:29:08 GMT, Johannes Bechberger wrote: > The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. > This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From thartmann at openjdk.java.net Mon Mar 14 10:38:57 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Mon, 14 Mar 2022 10:38:57 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v17] In-Reply-To: References: Message-ID: <1J7RFTiEF7VfaEg4EF29Hwd9UUU0D1MM1xh6waG3ulY=.251d7fd9-0d1d-4288-9a55-6feca4b0ec6a@github.com> On Sun, 13 Mar 2022 06:36:15 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. >> - Test creation using new IR testing framework. 
>> >> Following are the performance number of a JMH micro included with the patch >> >> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) >> >> >> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio >> -- | -- | -- | -- | -- | -- | -- | -- >> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 >> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 >> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 >> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 >> >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > 8279508: Windows build failure fix. `compiler/c2/cr6340864/TestFloatVect.java` and `TestDoubleVect.java` fail on Windows: # A fatal error has been detected by the Java Runtime Environment: # # EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x000001971b940123, pid=56524, tid=57368 # # JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build 19-internal-2022-03-14-0834080.tobias.hartmann.jdk2) # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 19-internal-2022-03-14-0834080.tobias.hartmann.jdk2, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64) # Problematic frame: # J 205 c2 compiler.c2.cr6340864.TestFloatVect.test_round([I[F)V (24 bytes) @ 0x000001971b940123 [0x000001971b93ffe0+0x0000000000000143] ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From duke at openjdk.java.net Mon Mar 14 11:28:33 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 14 Mar 2022 11:28:33 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid 
link access [v13] In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> Message-ID: > This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method > and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes. Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Add check for Thread to CanUseSafefetch ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7591/files - new: https://git.openjdk.java.net/jdk/pull/7591/files/219837e3..555df5ae Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=11-12 Stats: 10 lines in 1 file changed: 10 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7591.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591 PR: https://git.openjdk.java.net/jdk/pull/7591 From jbhateja at openjdk.java.net Mon Mar 14 12:11:49 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Mon, 14 Mar 2022 12:11:49 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v15] In-Reply-To: References: Message-ID: <8IP7JMUqOWwCVaV3-hf42ovowPNRAEDQRrkzLf-z2yg=.5f4f5350-711b-411d-bcb5-45911fd901d7@github.com> On Mon, 14 Mar 2022 09:29:28 GMT, Andrew Haley wrote: >> Good suggestion, but as of now we are not using vector calling conventions for stubs. > > I don't understand this comment. If the stub is only to be used by you, then you can determine your own calling convention. 
We are passing a mixture of scalar, vector and opmask registers to the special handling function; the only way we can pass them reliably to a callee stub without having an elaborate mixed calling convention is by binding the machine operands. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From duke at openjdk.java.net Mon Mar 14 11:28:34 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 14 Mar 2022 11:28:34 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v12] In-Reply-To: References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> Message-ID: On Mon, 28 Feb 2022 16:28:27 GMT, Johannes Bechberger wrote: >> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method >> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix trailing whitespace The SafeFetch PR is more work. I modified the CanUseSafeFetch methods. This should fix the tests. ------------- PR: https://git.openjdk.java.net/jdk/pull/7591 From dnsimon at openjdk.java.net Mon Mar 14 14:05:50 2022 From: dnsimon at openjdk.java.net (Doug Simon) Date: Mon, 14 Mar 2022 14:05:50 GMT Subject: RFR: 8283056: show abstract machine code for all VM crashes In-Reply-To: References: Message-ID: On Mon, 14 Mar 2022 06:03:43 GMT, Tobias Hartmann wrote: > What about using -XX:CICrashAt and other flags that trigger asserts `CICrashAt` triggers an assertion during compilation, whereas I need an assertion or guarantee in VM code called from JIT compiled code.
To be completely reliable, this needs a runtime entry point to `assert`, `guarantee` or `fatal` that can be called from compiled code. JVMCI has such an entry point for `fatal` (`JVMCIRuntime::vm_message`) but without Graal in the JDK, it's hard to write a test that uses it. I've done such testing manually so am confident that it works. ------------- PR: https://git.openjdk.java.net/jdk/pull/7791 From hseigel at openjdk.java.net Mon Mar 14 15:24:45 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 14 Mar 2022 15:24:45 GMT Subject: RFR: 8282881: Print exception message in VM crash with -XX:AbortVMOnException In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 16:04:47 GMT, Emanuel Peter wrote: > In `Exceptions::debug_check_abort`, we crash the VM if the exception matches with `-XX:AbortVMOnException`. For example `-XX:AbortVMOnException=java.lang.RuntimeEx`. > > Currently, in the VM crash description, we only print the exception name (`value_string`), and not its message (`message`). For completeness and consistency, we should also print the exception message. > > I tested it with these two exceptions, the first results in `message` being `NULL`: > `throw new RuntimeException();` > `throw new RuntimeException("some message");` > > Running tests to make sure nothing else broke. Looks good! Thanks, Harold ------------- Marked as reviewed by hseigel (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7762 From duke at openjdk.java.net Mon Mar 14 16:25:45 2022 From: duke at openjdk.java.net (Emanuel Peter) Date: Mon, 14 Mar 2022 16:25:45 GMT Subject: RFR: 8282881: Print exception message in VM crash with -XX:AbortVMOnException In-Reply-To: References: Message-ID: <2E-51CjqaeyuUR4xPFIRJGmS1gveA4wJyaZBskrD1Tg=.b7e9bdcb-1747-4b08-97a9-8d9c9d64a2e4@github.com> On Mon, 14 Mar 2022 15:21:31 GMT, Harold Seigel wrote: >> In `Exceptions::debug_check_abort`, we crash the VM if the exception matches with `-XX:AbortVMOnException`. 
For example `-XX:AbortVMOnException=java.lang.RuntimeEx`. >> >> Currently, in the VM crash description, we only print the exception name (`value_string`), and not its message (`message`). For completeness and consistency, we should also print the exception message. >> >> I tested it with these two exceptions, the first results in `message` being `NULL`: >> `throw new RuntimeException();` >> `throw new RuntimeException("some message");` >> >> Running tests to make sure nothing else broke. > > Looks good! > Thanks, Harold Thanks @hseigel and @dholmes-ora for the reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/7762 From duke at openjdk.java.net Mon Mar 14 17:38:49 2022 From: duke at openjdk.java.net (Emanuel Peter) Date: Mon, 14 Mar 2022 17:38:49 GMT Subject: Integrated: 8282881: Print exception message in VM crash with -XX:AbortVMOnException In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 16:04:47 GMT, Emanuel Peter wrote: > In `Exceptions::debug_check_abort`, we crash the VM if the exception matches with `-XX:AbortVMOnException`. For example `-XX:AbortVMOnException=java.lang.RuntimeEx`. > > Currently, in the VM crash description, we only print the exception name (`value_string`), and not its message (`message`). For completeness and consistency, we should also print the exception message. > > I tested it with these two exceptions, the first results in `message` being `NULL`: > `throw new RuntimeException();` > `throw new RuntimeException("some message");` > > Running tests to make sure nothing else broke. This pull request has now been integrated. 
Changeset: 7833667f Author: Emanuel Peter Committer: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/7833667f0e2151fc56c7c1533015f004f02f7ab2 Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod 8282881: Print exception message in VM crash with -XX:AbortVMOnException Reviewed-by: dholmes, hseigel ------------- PR: https://git.openjdk.java.net/jdk/pull/7762 From redestad at openjdk.java.net Mon Mar 14 19:44:47 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Mon, 14 Mar 2022 19:44:47 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v15] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 23:59:32 GMT, Claes Redestad wrote: >> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. >> >> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 >> >> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. >> >> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. 
This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. >> >> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). > > Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright year in new test Gentle reminder that I need a review of the aarch64 changes. ------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From rriggs at openjdk.java.net Mon Mar 14 20:33:41 2022 From: rriggs at openjdk.java.net (Roger Riggs) Date: Mon, 14 Mar 2022 20:33:41 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v15] In-Reply-To: References: Message-ID: On Wed, 9 Mar 2022 23:59:32 GMT, Claes Redestad wrote: >> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. 
>> >> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 >> >> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. >> >> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. >> >> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). > > Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright year in new test core libs String.java changes look fine. ------------- Marked as reviewed by rriggs (Reviewer). 
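To make the semantics described in the quoted PR text concrete, the following is a minimal scalar sketch of a "count positives" pass and of how a "has negatives" check can be derived from it. The function names and signatures are hypothetical and written in C++ for illustration; they are assumptions about the described behaviour, not the `StringCoding` Java code or any platform intrinsic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the number of leading bytes in [off, off + len) that are
// non-negative, i.e. the length of the ASCII prefix of that range.
// Hypothetical helper, not the JDK implementation.
std::size_t count_positives(const std::int8_t* bytes, std::size_t off, std::size_t len) {
  assert(bytes != nullptr || len == 0);
  for (std::size_t i = 0; i < len; i++) {
    if (bytes[off + i] < 0) {
      return i;  // index of the first negative byte == length of the prefix
    }
  }
  return len;    // the whole range is non-negative
}

// The older boolean query falls out of the count: the range contains a
// negative byte exactly when the non-negative prefix is shorter than len.
bool has_negatives(const std::int8_t* bytes, std::size_t off, std::size_t len) {
  return count_positives(bytes, off, len) != len;
}
```

Returning the prefix length rather than a boolean is what lets a caller handle the ASCII prefix on a fast path and start slower processing only at the first non-ASCII byte.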
PR: https://git.openjdk.java.net/jdk/pull/7231 From ccheung at openjdk.java.net Mon Mar 14 22:11:45 2022 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Mon, 14 Mar 2022 22:11:45 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v6] In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 06:55:23 GMT, Ioi Lam wrote: >> This patch makes the result of "java -Xshare:dump" deterministic: >> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp >> - Fixed a problem in hashtable ordering in heapShared.cpp >> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. >> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh >> >> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). >> >> Testing under way: >> - tier1~tier5 >> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Added helper function CollectedHeap::zap_filler_array_with CDS changes look good. One minor comment on a test. test/hotspot/jtreg/runtime/cds/appcds/javaldr/LockDuringDumpAgent.java line 65: > 63: if (elapsed >= timeout) { > 64: System.out.println("This JVM may decide to not launch any Java threads during -Xshare:dump."); > 65: System.out.println("This is OK because no string objects be in a locked state during heap dump."); Should `no string objects be` be `no string objects could be`? ------------- Marked as reviewed by ccheung (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7748 From dlong at openjdk.java.net Tue Mar 15 00:26:14 2022 From: dlong at openjdk.java.net (Dean Long) Date: Tue, 15 Mar 2022 00:26:14 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v3] In-Reply-To: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: > This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. Dean Long has updated the pull request incrementally with one additional commit since the last revision: rename max_stubs --> max_aligned_stubs, update copyright year ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7800/files - new: https://git.openjdk.java.net/jdk/pull/7800/files/e72382ef..89f9c6c4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7800&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7800&range=01-02 Stats: 6 lines in 2 files changed: 1 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/7800.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7800/head:pull/7800 PR: https://git.openjdk.java.net/jdk/pull/7800 From jiefu at openjdk.java.net Tue Mar 15 01:15:45 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Tue, 15 Mar 2022 01:15:45 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v3] In-Reply-To: References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: On Tue, 15 Mar 2022 00:26:14 GMT, Dean Long wrote: >> This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. 
> > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > rename max_stubs --> max_aligned_stubs, update copyright year Marked as reviewed by jiefu (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From xgong at openjdk.java.net Tue Mar 15 01:18:46 2022 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 15 Mar 2022 01:18:46 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 06:29:22 GMT, Xiaohong Gong wrote: > The current vector `"NEG"` is implemented by subtracting the vector from zero in case the architecture does not support the negation instruction. To fit the predicate feature on architectures that support it, the masked vector `"NEG"` is implemented with the pattern `"v.not(m).add(1, m)"`. Both can be optimized to a single negation instruction on ARM SVE, > and so can the non-masked "NEG" on NEON. Besides, implementing the masked "NEG" with subtraction on architectures that support neither a negation instruction nor the predicate feature also saves several instructions compared with the current pattern. > > To optimize the Vector API negation, this patch moves the implementation from the Java side to HotSpot. The compiler will generate different nodes according to the architecture: > - Generate the (predicated) negation node if the architecture supports it; otherwise, generate the "`zero.sub(v)`" pattern for the non-masked operation. > - Generate `"zero.sub(v, m)"` for the masked operation if the architecture does not have the predicate feature; otherwise generate the original pattern `"v.xor(-1, m).add(1, m)"`.
> > So with this patch, the following transformations are applied: > > For non-masked negation with NEON: > > movi v16.4s, #0x0 > sub v17.4s, v16.4s, v17.4s ==> neg v17.4s, v17.4s > > and with SVE: > > mov z16.s, #0 > sub z18.s, z16.s, z17.s ==> neg z16.s, p7/m, z16.s > > For masked negation with NEON: > > movi v17.4s, #0x1 > mvn v19.16b, v18.16b > mov v20.16b, v16.16b ==> neg v18.4s, v17.4s > bsl v20.16b, v19.16b, v18.16b bsl v19.16b, v18.16b, v17.16b > add v19.4s, v20.4s, v17.4s > mov v18.16b, v16.16b > bsl v18.16b, v19.16b, v20.16b > > and with SVE: > > mov z16.s, #-1 > mov z17.s, #1 ==> neg z16.s, p0/m, z16.s > eor z18.s, p0/m, z18.s, z16.s > add z18.s, p0/m, z18.s, z17.s > > Here are the performance gains for benchmarks (see [1][2]) on ARM and x86 machines(note that the non-masked negation benchmarks do not have any improvement on X86 since no instructions are changed): > > NEON: > Benchmark Gain > Byte128Vector.NEG 1.029 > Byte128Vector.NEGMasked 1.757 > Short128Vector.NEG 1.041 > Short128Vector.NEGMasked 1.659 > Int128Vector.NEG 1.005 > Int128Vector.NEGMasked 1.513 > Long128Vector.NEG 1.003 > Long128Vector.NEGMasked 1.878 > > SVE with 512-bits: > Benchmark Gain > ByteMaxVector.NEG 1.10 > ByteMaxVector.NEGMasked 1.165 > ShortMaxVector.NEG 1.056 > ShortMaxVector.NEGMasked 1.195 > IntMaxVector.NEG 1.002 > IntMaxVector.NEGMasked 1.239 > LongMaxVector.NEG 1.031 > LongMaxVector.NEGMasked 1.191 > > X86 (non AVX-512): > Benchmark Gain > ByteMaxVector.NEGMasked 1.254 > ShortMaxVector.NEGMasked 1.359 > IntMaxVector.NEGMasked 1.431 > LongMaxVector.NEGMasked 1.989 > > [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1881 > [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1896 Hi, could anyone please take a look at this PR? Thanks so much! 
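The integer patterns named in the PR description are all the same operation in two's complement arithmetic. The sketch below checks that per lane on a single 32-bit value; the function names are hypothetical, and unsigned arithmetic is used only so that wrap-around is well defined in C++. It models one lane, not the Vector API or the generated NEON/SVE code.

```cpp
#include <cassert>
#include <cstdint>

// "zero.sub(v)": subtract the lane from zero.
std::uint32_t neg_by_sub(std::uint32_t v) { return 0u - v; }

// "v.not().add(1)": flip all bits, then add one.
std::uint32_t neg_by_not_add(std::uint32_t v) { return ~v + 1u; }

// "v.xor(-1, m).add(1, m)": both steps apply only where the mask bit m is
// set; an unset lane keeps its original value.
std::uint32_t masked_neg_xor_add(std::uint32_t v, bool m) {
  std::uint32_t t = m ? (v ^ 0xFFFFFFFFu) : v;
  return m ? (t + 1u) : t;
}

// "zero.sub(v, m)": masked subtraction from zero, same result in one step.
std::uint32_t masked_neg_sub(std::uint32_t v, bool m) {
  return m ? neg_by_sub(v) : v;
}
```

Because the patterns agree on every input, the compiler is free to pick whichever form maps to the fewest instructions on a given architecture, which is what the transformations listed in the PR description do.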
------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From darcy at openjdk.java.net Tue Mar 15 02:42:40 2022 From: darcy at openjdk.java.net (Joe Darcy) Date: Tue, 15 Mar 2022 02:42:40 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API In-Reply-To: References: Message-ID: <-E5E_NBci6gsGyOV5nWuTUNKLVnjiw2IiWjjgv2vFz0=.ebe7c447-ede9-4437-815c-a2004f9d6ce1@github.com> On Fri, 11 Mar 2022 06:29:22 GMT, Xiaohong Gong wrote: > The current vector `"NEG"` is implemented by subtracting the vector from zero in case the architecture does not support the negation instruction. To fit the predicate feature on architectures that support it, the masked vector `"NEG"` is implemented with the pattern `"v.not(m).add(1, m)"`. Both can be optimized to a single negation instruction on ARM SVE, > and so can the non-masked "NEG" on NEON. Besides, implementing the masked "NEG" with subtraction on architectures that support neither a negation instruction nor the predicate feature also saves several instructions compared with the current pattern. > > To optimize the Vector API negation, this patch moves the implementation from the Java side to HotSpot. The compiler will generate different nodes according to the architecture: > - Generate the (predicated) negation node if the architecture supports it; otherwise, generate the "`zero.sub(v)`" pattern for the non-masked operation. > - Generate `"zero.sub(v, m)"` for the masked operation if the architecture does not have the predicate feature; otherwise generate the original pattern `"v.xor(-1, m).add(1, m)"`.
> > So with this patch, the following transformations are applied: > > For non-masked negation with NEON: > > movi v16.4s, #0x0 > sub v17.4s, v16.4s, v17.4s ==> neg v17.4s, v17.4s > > and with SVE: > > mov z16.s, #0 > sub z18.s, z16.s, z17.s ==> neg z16.s, p7/m, z16.s > > For masked negation with NEON: > > movi v17.4s, #0x1 > mvn v19.16b, v18.16b > mov v20.16b, v16.16b ==> neg v18.4s, v17.4s > bsl v20.16b, v19.16b, v18.16b bsl v19.16b, v18.16b, v17.16b > add v19.4s, v20.4s, v17.4s > mov v18.16b, v16.16b > bsl v18.16b, v19.16b, v20.16b > > and with SVE: > > mov z16.s, #-1 > mov z17.s, #1 ==> neg z16.s, p0/m, z16.s > eor z18.s, p0/m, z18.s, z16.s > add z18.s, p0/m, z18.s, z17.s > > Here are the performance gains for benchmarks (see [1][2]) on ARM and x86 machines(note that the non-masked negation benchmarks do not have any improvement on X86 since no instructions are changed): > > NEON: > Benchmark Gain > Byte128Vector.NEG 1.029 > Byte128Vector.NEGMasked 1.757 > Short128Vector.NEG 1.041 > Short128Vector.NEGMasked 1.659 > Int128Vector.NEG 1.005 > Int128Vector.NEGMasked 1.513 > Long128Vector.NEG 1.003 > Long128Vector.NEGMasked 1.878 > > SVE with 512-bits: > Benchmark Gain > ByteMaxVector.NEG 1.10 > ByteMaxVector.NEGMasked 1.165 > ShortMaxVector.NEG 1.056 > ShortMaxVector.NEGMasked 1.195 > IntMaxVector.NEG 1.002 > IntMaxVector.NEGMasked 1.239 > LongMaxVector.NEG 1.031 > LongMaxVector.NEGMasked 1.191 > > X86 (non AVX-512): > Benchmark Gain > ByteMaxVector.NEGMasked 1.254 > ShortMaxVector.NEGMasked 1.359 > IntMaxVector.NEGMasked 1.431 > LongMaxVector.NEGMasked 1.989 > > [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1881 > [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1896 Note that in terms of Java semantics, negation of floating point values needs to be implemented as 
subtraction from negative zero rather than positive zero: double negate(double arg) {return -0.0 - arg; } This is to handle signed zeros correctly. ------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From xgong at openjdk.java.net Tue Mar 15 02:50:43 2022 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 15 Mar 2022 02:50:43 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API In-Reply-To: <-E5E_NBci6gsGyOV5nWuTUNKLVnjiw2IiWjjgv2vFz0=.ebe7c447-ede9-4437-815c-a2004f9d6ce1@github.com> References: <-E5E_NBci6gsGyOV5nWuTUNKLVnjiw2IiWjjgv2vFz0=.ebe7c447-ede9-4437-815c-a2004f9d6ce1@github.com> Message-ID: On Tue, 15 Mar 2022 02:39:42 GMT, Joe Darcy wrote: > Note that in terms of Java semantics, negation of floating point values needs to be implemented as subtraction from negative zero rather than positive zero: > > double negate(double arg) {return -0.0 - arg; } > > This is to handle signed zeros correctly. Hi @jddarcy, thanks for looking at this PR and thanks for the notes on floating point negation! Yeah, this really makes sense to me. Kindly note that this patch doesn't touch the negation of floating point values. For the Vector API, vector floating point negation is already intrinsified to a `NegVF/D` node by the compiler, so we directly generate the negation instructions for it. Thanks!
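The signed-zero point raised above can be checked directly. This is a small sketch assuming IEEE 754 doubles, the default round-to-nearest mode, and a compiler that is not running in a fast-math mode; the two function names are hypothetical and exist only to contrast the two subtraction forms.

```cpp
#include <cassert>
#include <cmath>

// Subtraction from negative zero: matches true negation for every input,
// including both zeros.
double negate_from_negative_zero(double arg) { return -0.0 - arg; }

// Subtraction from positive zero: differs from negation for arg == +0.0,
// because (+0.0) - (+0.0) is +0.0 rather than -0.0.
double negate_from_positive_zero(double arg) { return 0.0 - arg; }
```

For example, `negate_from_negative_zero(0.0)` has its sign bit set while `negate_from_positive_zero(0.0)` does not, even though the two results compare equal with `==`. This is why the `zero.sub(v)` pattern is only a valid replacement for negation on integer lanes.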
------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From dlong at openjdk.java.net Tue Mar 15 02:57:40 2022 From: dlong at openjdk.java.net (Dean Long) Date: Tue, 15 Mar 2022 02:57:40 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v3] In-Reply-To: References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: On Tue, 15 Mar 2022 01:12:01 GMT, Jie Fu wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> rename max_stubs --> max_aligned_stubs, update copyright year > > Marked as reviewed by jiefu (Reviewer). Thanks @DamonFool and @hakib1. ------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From dlong at openjdk.java.net Tue Mar 15 02:57:40 2022 From: dlong at openjdk.java.net (Dean Long) Date: Tue, 15 Mar 2022 02:57:40 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v3] In-Reply-To: <-H86iYx9-94GUkEwMoVkznxVz1MzMUux_7EE5m9S4uE=.b57eb99d-0762-4f4b-9c84-77319e4ba06b@github.com> References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> <-H86iYx9-94GUkEwMoVkznxVz1MzMUux_7EE5m9S4uE=.b57eb99d-0762-4f4b-9c84-77319e4ba06b@github.com> Message-ID: <-NYkHzgbUhl771GY8hBEp8XQvBxvxrNkiOeinPE2Pl8=.12cb8a7e-82de-44bd-abce-9d7432c90e78@github.com> On Sun, 13 Mar 2022 16:56:41 GMT, hakib1 wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> rename max_stubs --> max_aligned_stubs, update copyright year > > Marked as reviewed by hakib1 at github.com (no known OpenJDK username). @hakib1, do you have an openjdk username? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From dlong at openjdk.java.net Tue Mar 15 03:01:43 2022 From: dlong at openjdk.java.net (Dean Long) Date: Tue, 15 Mar 2022 03:01:43 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v3] In-Reply-To: References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: <2PQIVkju1Afg84mJHIUwmaSFIFDiK9RjDxdwev6UZww=.907b556f-a4eb-42ff-8d12-af54453ede3b@github.com> On Tue, 15 Mar 2022 00:26:14 GMT, Dean Long wrote: >> This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > rename max_stubs --> max_aligned_stubs, update copyright year @shipilev, can I get a review from you? ------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From stuefe at openjdk.java.net Tue Mar 15 06:09:42 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 15 Mar 2022 06:09:42 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v13] In-Reply-To: References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> Message-ID: <7wFKmyOltY-k4JZz1bqNXUsTJlciBq7bHMuSsddIheI=.d905592e-7f34-47d2-8699-da8ca674d5e6@github.com> On Mon, 14 Mar 2022 11:28:33 GMT, Johannes Bechberger wrote: >> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method >> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes. 
> > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Add check for Thread to CanUseSafefetch Looks good to me. If you add the comment about JDK-8282475 (see inline remark) this is fine. Cheers, Thomas src/hotspot/share/runtime/safefetch.inline.hpp line 57: > 55: // returns true if SafeFetch32 and SafeFetchN can be used safely (stubroutines are already generated) > 56: inline bool CanUseSafeFetch32() { > 57: #if defined (__APPLE__) && defined(AARCH64) Pls in front of this __APPLE__ section and the one below add a comment like `// workaround for JDK-8282475`. That will remind us to remove this again once that bug is fixed. src/hotspot/share/runtime/safefetch.inline.hpp line 61: > 59: return false; > 60: } > 61: #endif // __APPLE__ && AARCH64 Totally valid way to work around JDK-8282475, but note that `os::is_readable_pointer()` defaults to true for `CanSafeFetch=false` (see https://github.com/openjdk/jdk/blob/6013d09e82693a1c07cf0bf6daffd95114b3cbfa/src/hotspot/share/runtime/os.cpp#L1042-L1046) The story behind that is that there are more use cases of `is_readable_memory()` where - if one cannot use SafeFetch - it makes sense to optimistically assume the location is readable. E.g. in error handling, where secondary crashes are annoying but not deadly. The contract is that if you really care about this, use CanUseSafeFetch beforehand (usually it was fine to ignore it since CanUseSafeFetch only affected a small time window during VM initialization). With your patch, it means that now os::is_readable_memory always returns true if Thread::current is NULL. So in that case, we bypass your protection. I think that is acceptable for now. Once JDK-8282475 is solved we can remove this again. I wondered whether this affects your gtest test case, but in that test case Thread::current is not NULL because its a TEST_VM case, so we are in the VM. ------------- Marked as reviewed by stuefe (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7591 From duke at openjdk.java.net Tue Mar 15 07:15:45 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Tue, 15 Mar 2022 07:15:45 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v13] In-Reply-To: <7wFKmyOltY-k4JZz1bqNXUsTJlciBq7bHMuSsddIheI=.d905592e-7f34-47d2-8699-da8ca674d5e6@github.com> References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> <7wFKmyOltY-k4JZz1bqNXUsTJlciBq7bHMuSsddIheI=.d905592e-7f34-47d2-8699-da8ca674d5e6@github.com> Message-ID: On Tue, 15 Mar 2022 06:01:27 GMT, Thomas Stuefe wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Add check for Thread to CanUseSafefetch > > src/hotspot/share/runtime/safefetch.inline.hpp line 57: > >> 55: // returns true if SafeFetch32 and SafeFetchN can be used safely (stubroutines are already generated) >> 56: inline bool CanUseSafeFetch32() { >> 57: #if defined (__APPLE__) && defined(AARCH64) > > Pls in front of this __APPLE__ section and the one below add a comment like `// workaround for JDK-8282475`. That will remind us to remove this again once that bug is fixed. You're right, I forgot to add this comment. ------------- PR: https://git.openjdk.java.net/jdk/pull/7591 From duke at openjdk.java.net Tue Mar 15 07:54:23 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Tue, 15 Mar 2022 07:54:23 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v14] In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> Message-ID: > This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. 
It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method > and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes. Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Add workaround comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7591/files - new: https://git.openjdk.java.net/jdk/pull/7591/files/555df5ae..7e8aff34 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=13 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=12-13 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7591.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591 PR: https://git.openjdk.java.net/jdk/pull/7591 From duke at openjdk.java.net Tue Mar 15 07:54:26 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Tue, 15 Mar 2022 07:54:26 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v13] In-Reply-To: <7wFKmyOltY-k4JZz1bqNXUsTJlciBq7bHMuSsddIheI=.d905592e-7f34-47d2-8699-da8ca674d5e6@github.com> References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> <7wFKmyOltY-k4JZz1bqNXUsTJlciBq7bHMuSsddIheI=.d905592e-7f34-47d2-8699-da8ca674d5e6@github.com> Message-ID: On Tue, 15 Mar 2022 06:00:13 GMT, Thomas Stuefe wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Add check for Thread to CanUseSafefetch > > src/hotspot/share/runtime/safefetch.inline.hpp line 61: > >> 59: return false; >> 60: } >> 61: #endif // __APPLE__ && AARCH64 > > Totally valid way to work around JDK-8282475, but note that `os::is_readable_pointer()` defaults to true for `CanSafeFetch=false` 
(see https://github.com/openjdk/jdk/blob/6013d09e82693a1c07cf0bf6daffd95114b3cbfa/src/hotspot/share/runtime/os.cpp#L1042-L1046) > > The story behind that is that there are more use cases of `is_readable_memory()` where - if one cannot use SafeFetch - it makes sense to optimistically assume the location is readable. E.g. in error handling, where secondary crashes are annoying but not deadly. The contract is that if you really care about this, use CanUseSafeFetch beforehand (usually it was fine to ignore it since CanUseSafeFetch only affected a small time window during VM initialization). > > With your patch, it means that now os::is_readable_memory always returns true if Thread::current is NULL. So in that case, we bypass your protection. I think that is acceptable for now. Once JDK-8282475 is solved we can remove this again. > > I wondered whether this affects your gtest test case, but in that test case Thread::current is not NULL because it's a TEST_VM case, so we are in the VM. I could add a non VM test case? Or is it too difficult? ------------- PR: https://git.openjdk.java.net/jdk/pull/7591 From iklam at openjdk.java.net Tue Mar 15 08:17:24 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 15 Mar 2022 08:17:24 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v7] In-Reply-To: References: Message-ID: > This patch makes the result of "java -Xshare:dump" deterministic: > - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp > - Fixed a problem in hashtable ordering in heapShared.cpp > - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. > - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh > > Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).
> > Testing under way: > - tier1~tier5 > - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: - fixed copyright - Merge branch 'master' into 8253495-cds-generateds-non-deterministic-output-2 - @calvinccheung review: fixed typo - Added helper function CollectedHeap::zap_filler_array_with - @kimbarrett comments - zero GC heap filler arrays - improvement zeroing of alignment gaps - Fixed zero build - Merge branch 'master' into 8253495-cds-generateds-non-deterministic-output-2 - fixed test - ... and 2 more: https://git.openjdk.java.net/jdk/compare/a6344ded...cd934f3c ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7748/files - new: https://git.openjdk.java.net/jdk/pull/7748/files/47e0238a..cd934f3c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=05-06 Stats: 6804 lines in 210 files changed: 3347 ins; 1802 del; 1655 mod Patch: https://git.openjdk.java.net/jdk/pull/7748.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748 PR: https://git.openjdk.java.net/jdk/pull/7748 From iklam at openjdk.java.net Tue Mar 15 08:18:56 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 15 Mar 2022 08:18:56 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v6] In-Reply-To: References: Message-ID: On Mon, 14 Mar 2022 22:07:24 GMT, Calvin Cheung wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Added helper function CollectedHeap::zap_filler_array_with > > test/hotspot/jtreg/runtime/cds/appcds/javaldr/LockDuringDumpAgent.java line 65: > >> 63: if (elapsed >= 
timeout) { >> 64: System.out.println("This JVM may decide to not launch any Java threads during -Xshare:dump."); >> 65: System.out.println("This is OK because no string objects be in a locked state during heap dump."); > > Should `no string objects be` be `no string objects could be`? Thanks for the review. I've fixed the comment as you suggested. ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From stuefe at openjdk.java.net Tue Mar 15 08:42:44 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 15 Mar 2022 08:42:44 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v13] In-Reply-To: References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> <7wFKmyOltY-k4JZz1bqNXUsTJlciBq7bHMuSsddIheI=.d905592e-7f34-47d2-8699-da8ca674d5e6@github.com> Message-ID: On Tue, 15 Mar 2022 07:50:10 GMT, Johannes Bechberger wrote: >> src/hotspot/share/runtime/safefetch.inline.hpp line 61: >> >>> 59: return false; >>> 60: } >>> 61: #endif // __APPLE__ && AARCH64 >> >> Totally valid way to work around JDK-8282475, but note that `os::is_readable_pointer()` defaults to true for `CanSafeFetch=false` (see https://github.com/openjdk/jdk/blob/6013d09e82693a1c07cf0bf6daffd95114b3cbfa/src/hotspot/share/runtime/os.cpp#L1042-L1046) >> >> The story behind that is that there are more use cases of `is_readable_memory()` where - if one cannot use SafeFetch - it makes sense to optimistically assume the location is readable. E.g. in error handling, where secondary crashes are annoying but not deadly. The contract is that if you really care about this, use CanUseSafeFetch beforehand (usually it was fine to ignore it since CanUseSafeFetch only affected a small time window during VM initialization). >> >> With your patch, it means that now os::is_readable_memory always returns true if Thread::current is NULL. So in that case, we bypass your protection. I think that is acceptable for now. 
Once JDK-8282475 is solved we can remove this again. >> >> I wondered whether this affects your gtest test case, but in that test case Thread::current is not NULL because its a TEST_VM case, so we are in the VM. > > I could add a non VM test case? Or is it to difficult? Would not work, since signal handling is not present. I think not worth the effort. Lets ship this, has been cooking long enough. In theory, you could temporarily reset Thread::current to null, but that's hackish and possibly exposes you to a whole other range of follow up problems. Also I am currently extending SafeFetch tests in the course of JDK-8282475 and will test Thread::current=null there, so it will be covered eventually. ------------- PR: https://git.openjdk.java.net/jdk/pull/7591 From thartmann at openjdk.java.net Tue Mar 15 09:28:51 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 15 Mar 2022 09:28:51 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v6] In-Reply-To: <4f4q-PLj6psH50mRQCVLAX8bMjzF4XWzAornt_t4PNE=.4331b19d-469b-4e0d-8a1f-d1eeb5aaf9ed@github.com> References: <4f4q-PLj6psH50mRQCVLAX8bMjzF4XWzAornt_t4PNE=.4331b19d-469b-4e0d-8a1f-d1eeb5aaf9ed@github.com> Message-ID: <2LdpZiNnH5XrPqD_O3yHxVKNz-f4BsxFoswECEL09uo=.e11cdc0e-5fba-4e00-851e-c8ada9182c24@github.com> On Mon, 14 Mar 2022 06:13:30 GMT, Pengfei Li wrote: >> ### Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop to a 1-iteration vectorized loop with vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. However, it was disabled soon after an issue >> was found. Due to insufficient maintenance in these years, multiple bugs >> have been accumulated inside. 
But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable >> post loop vectorization. >> >> ### Changes in this patch >> >> This patch reworks post loop vectorization. The most significant change >> is removing vector mask support in C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. >> >> Previous implementation hard-codes x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions. To fix incorrect codegen >> and add vector mask support for more platforms, we turn to add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This >> IR form is exactly the same to those which are used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. >> >> **I) C2 crashes with segmentation fault in strip-mined loops** >> >> Previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. 
>> In C2's strip mined loops, post loop is not the sibling of the main loop >> in ideal loop tree. Instead, it's the sibling of the main loop's parent. >> This patch fixed a SIGSEGV issue caused by NULL pointer when locating >> post loop from strip-mined main loop. >> >> **II) Incorrect result issues with post loop vectorization** >> >> We have also fixed five incorrect vectorization issues. Some of them are >> hidden deep and can only be reproduced with corner cases. These issues >> have a common cause that it assumes the post loop can be vectorized if >> the vectorization in corresponding main loop is successful. But in many >> cases this assumption is wrong. Below are details. >> >> - **[Issue-1] Incorrect vectorization for partial vectorizable loops** >> >> This issue can be reproduced by below loop where only some operations in >> the loop body are vectorizable. >> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword can work well if parts of the operations in >> loop body are not vectorizable since those parts can be unrolled only. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we are doing scalars to vectors >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of cases, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to post loop and compared with post loop >> pack count. Vectorization will be bailed out in post loop if it creates >> more vector packs than in the main loop. >> >> - **[Issue-2] Incorrect result in loops with growing-down vectors** >> >> This issue appears with growing-down vectors, that is, vectors that grow >> to smaller memory address as the loop iterates. 
It can be reproduced by >> below counting-up loop with negative scale value in array index. >> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> - **[Issue-3] Incorrect result in manually unrolled loops** >> >> This issue can be reproduced by below manually unrolled loop. >> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> - **[Issue-4] Incorrect result in loops with mixed vector element sizes** >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. 
Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. >> >> - **[Issue-5] Incorrect result in loops with potential data dependence** >> >> This issue can be reproduced by below corner case on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop has the same array index expression. >> >> ### Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. 
>> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's forced compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantages of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> need to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". >> >> ### Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: > > Update a few comments I'll run some testing and take a look at the changes later this week. 
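For illustration only: the eligibility rules quoted above for Issue-2 and Issue-3 (loop stride must be 1 or -1, and the sign of every array index scale must match the direction of the loop) can be sketched as a small standalone Java predicate. This is not C2 code and does not use SWPointer; the class name, the method name, and the plain `int[]` of scales are assumptions made for the example. Rejecting a zero scale is also an assumption here, since the quoted description does not cover that case.

```java
public class PostLoopCandidateSketch {
    /**
     * Hypothetical check mirroring two rules from the quoted description:
     *  - the loop stride must be 1 or -1 (rules out manually unrolled loops);
     *  - every memory access must have a scale whose sign matches the stride,
     *    i.e. positive scales in counting-up loops and negative scales in
     *    counting-down loops (rules out growing-down vectors).
     */
    public static boolean isPostLoopCandidate(int stride, int[] scales) {
        if (stride != 1 && stride != -1) {
            return false;
        }
        for (int scale : scales) {
            // Assumption: a zero scale (loop-invariant address) is rejected.
            if (scale == 0 || Integer.signum(scale) != stride) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // for (i = 0; i < N; i++) a[i] = b[i];
        System.out.println(isPostLoopCandidate(1, new int[] {4, 4}));   // true
        // for (i = 0; i < N; i++) a[MAX - i] = b[MAX - i];
        System.out.println(isPostLoopCandidate(1, new int[] {-4, -4})); // false
        // for (i = 0; i < N; i += 2) with a manually unrolled body
        System.out.println(isPostLoopCandidate(2, new int[] {4, 4}));   // false
    }
}
```

The other checks described in the quoted text (pack count comparison with the main loop, a single vector element size, and the data dependence check) are left out of this sketch.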
------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From thartmann at openjdk.java.net Tue Mar 15 09:31:38 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 15 Mar 2022 09:31:38 GMT Subject: RFR: 8283056: show abstract machine code for all VM crashes In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 20:48:06 GMT, Doug Simon wrote: > [JDK-8272586](https://bugs.openjdk.java.net/browse/JDK-8272586) added abstract assembly to hs-err for methods on the stack of the crashing thread. However, it only does this if the crash is due to an unhandled signal. It can also be useful to see assembly for crashes due to failing VM assertions or guarantees. This PR implements this improvement. Okay, makes sense. The change looks reasonable to me but someone from the runtime folks should have a look as well. ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7791 From thartmann at openjdk.java.net Tue Mar 15 09:38:49 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 15 Mar 2022 09:38:49 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v3] In-Reply-To: References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: On Tue, 15 Mar 2022 00:26:14 GMT, Dean Long wrote: >> This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > rename max_stubs --> max_aligned_stubs, update copyright year Looks good to me. ------------- Marked as reviewed by thartmann (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7800 From shade at openjdk.java.net Tue Mar 15 09:52:38 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 15 Mar 2022 09:52:38 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v3] In-Reply-To: References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: <0BC3ZlAWw6Lw3--ikTpTru1NuCCG8b5jZPTc5ME8yzs=.bcc609bd-6f29-46c2-8c1a-f751ac5d126e@github.com> On Tue, 15 Mar 2022 00:26:14 GMT, Dean Long wrote: >> This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > rename max_stubs --> max_aligned_stubs, update copyright year Looks good, thanks for taking care of this. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7800 From ihse at openjdk.java.net Tue Mar 15 10:00:45 2022 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Tue, 15 Mar 2022 10:00:45 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v7] In-Reply-To: References: Message-ID: <2V0vZlddOZ64jgcAUBOOYbOs4BR7IBpNdtRgiy-kCnE=.0a6072a2-1b7e-40a3-b3f8-4f1dc02dcc37@github.com> On Tue, 15 Mar 2022 08:17:24 GMT, Ioi Lam wrote: >> This patch makes the result of "java -Xshare:dump" deterministic: >> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp >> - Fixed a problem in hashtable ordering in heapShared.cpp >> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. >> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh >> >> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. 
This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). >> >> Testing under way: >> - tier1~tier5 >> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: > > - fixed copyright > - Merge branch 'master' into 8253495-cds-generateds-non-deterministic-output-2 > - @calvinccheung review: fixed typo > - Added helper function CollectedHeap::zap_filler_array_with > - @kimbarrett comments > - zero GC heap filler arrays > - improvement zeroing of alignment gaps > - Fixed zero build > - Merge branch 'master' into 8253495-cds-generateds-non-deterministic-output-2 > - fixed test > - ... and 2 more: https://git.openjdk.java.net/jdk/compare/1ab17a39...cd934f3c Build changes look good. ------------- Marked as reviewed by ihse (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7748 From redestad at openjdk.java.net Tue Mar 15 11:02:47 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Tue, 15 Mar 2022 11:02:47 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v15] In-Reply-To: References: Message-ID: <-T5CXmH7O8EAO-UeX913WJJTZQBdVAms8RNLQx1NJOU=.2befca9f-3850-4f33-93f3-73b89e13b10d@github.com> On Mon, 14 Mar 2022 20:30:51 GMT, Roger Riggs wrote: >> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix copyright year in new test > > core libs String.java changes look fine. Thanks @RogerRiggs I intend to push this soon regardless, but would appreciate an explicit review of the aarch64 changes from an aarch64 maintainer. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From tschatzl at openjdk.java.net Tue Mar 15 15:26:10 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 15 Mar 2022 15:26:10 GMT Subject: RFR: 8283186: Explicitly pass a third temp register to MacroAssembler::store_heap_oop Message-ID: Hi all, can I have reviews for this change that explicitly passes a third temp parameter to `MacroAssembler::store_heap_oop` so that `G1BarrierSetAssembler::oop_store_at` (and the equivalent Shenandoah code) does not need to invent some out of thin air? This makes the code much less surprising. The interesting part of this change is probably the first hunk in `src/hotspot/cpu/x86/templateTable_x86.cpp`, the rest is just passing on that additional parameter. Testing: gha Thanks, Thomas ------------- Commit messages: - initial version Changes: https://git.openjdk.java.net/jdk/pull/7820/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7820&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283186 Stats: 58 lines in 16 files changed: 2 ins; 2 del; 54 mod Patch: https://git.openjdk.java.net/jdk/pull/7820.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7820/head:pull/7820 PR: https://git.openjdk.java.net/jdk/pull/7820 From eosterlund at openjdk.java.net Tue Mar 15 15:35:47 2022 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 15 Mar 2022 15:35:47 GMT Subject: RFR: 8283186: Explicitly pass a third temp register to MacroAssembler::store_heap_oop In-Reply-To: References: Message-ID: On Tue, 15 Mar 2022 15:20:02 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that explicitly passes a third temp parameter to `MacroAssembler::store_heap_oop` so that `G1BarrierSetAssembler::oop_store_at` (and the equivalent Shenandoah code) does not need to invent some out of thin air? This makes the code much less surprising. 
> > The interesting part of this change is probably the first hunk in `src/hotspot/cpu/x86/templateTable_x86.cpp`, the rest is just passing on that additional parameter. > > Testing: gha > > Thanks, > Thomas Looks awesome. ------------- Marked as reviewed by eosterlund (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7820 From bulasevich at openjdk.java.net Tue Mar 15 16:26:29 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Tue, 15 Mar 2022 16:26:29 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v5] In-Reply-To: References: Message-ID: > Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH. > > In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. 
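The branch-range argument in the quoted description can be illustrated with a small calculation. This is only a sketch under stated assumptions: the segment sizes and the names `fits_single_branch` and `worst_case_distance` are hypothetical and do not come from the patch; the only fact taken from the message is the 128MB limit of a single AArch64 branch.

```cpp
#include <cassert>
#include <cstdint>

const int64_t MiB = 1024 * 1024;

// Reach of one AArch64 branch instruction as stated in the message: 128MB.
const int64_t kSingleBranchReach = 128 * MiB;

// Conservative check: a distance of exactly 128 MiB is treated as out of reach.
inline bool fits_single_branch(int64_t distance) {
  assert(distance >= 0);
  return distance < kSingleBranchReach;
}

// Largest distance between any address in the two code segments and any
// address in the stub (non-nmethod) segment for the layout
// [before | stubs | after]. All sizes are in bytes. Hypothetical helper.
inline int64_t worst_case_distance(int64_t before, int64_t stubs, int64_t after) {
  assert(before >= 0 && stubs >= 0 && after >= 0);
  return stubs + (before > after ? before : after);
}
```

For an assumed 240 MiB code cache with an 8 MiB stub segment, placing the stubs first gives `worst_case_distance(0, 8 * MiB, 232 * MiB)`, which is 240 MiB and exceeds the single-branch reach. Placing the stubs between two 116 MiB code segments gives 124 MiB, which fits.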
> > As a side effect, the performance of some tests is slightly improved: > ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` > > Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: rename, adding test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7517/files - new: https://git.openjdk.java.net/jdk/pull/7517/files/91e62888..9cb03540 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=03-04 Stats: 147 lines in 5 files changed: 137 ins; 4 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/7517.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7517/head:pull/7517 PR: https://git.openjdk.java.net/jdk/pull/7517 From iklam at openjdk.java.net Tue Mar 15 17:08:27 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 15 Mar 2022 17:08:27 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v8] In-Reply-To: References: Message-ID: > This patch makes the result of "java -Xshare:dump" deterministic: > - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp > - Fixed a problem in hashtable ordering in heapShared.cpp > - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. > - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh > > Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). > > Testing under way: > - tier1~tier5 > - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). 
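The point about alignment gaps carrying random bits can be sketched as follows. This is not the code from the patch: `Entry` is a hypothetical stand-in for a hashtable entry, and `write_deterministic` is an assumed helper that shows one way to make the dumped bytes independent of whatever the padding happened to contain.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical entry layout: on typical 64-bit ABIs the 4-byte hash is
// followed by 4 bytes of padding so that the pointer is 8-byte aligned.
struct Entry {
  uint32_t hash;
  Entry*   next;
};

// Copies `e` into `out` (which must hold sizeof(Entry) bytes) field by field,
// after clearing the whole destination so that padding bytes are always zero.
inline void write_deterministic(const Entry& e, unsigned char* out) {
  assert(out != nullptr);
  std::memset(out, 0, sizeof(Entry));
  std::memcpy(out + offsetof(Entry, hash), &e.hash, sizeof(e.hash));
  std::memcpy(out + offsetof(Entry, next), &e.next, sizeof(e.next));
}
```

Two entries with equal fields then serialize to identical bytes even if their in-memory padding differs, which is the property a byte-for-byte comparison of two dumps depends on.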
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Avoid memset twice in os::malloc(); added comments about NMTPreInit::handle_malloc vs DumpSharedSpaces ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7748/files - new: https://git.openjdk.java.net/jdk/pull/7748/files/cd934f3c..f202bcbf Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=06-07 Stats: 7 lines in 1 file changed: 5 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7748.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748 PR: https://git.openjdk.java.net/jdk/pull/7748 From dlong at openjdk.java.net Tue Mar 15 20:21:47 2022 From: dlong at openjdk.java.net (Dean Long) Date: Tue, 15 Mar 2022 20:21:47 GMT Subject: RFR: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" [v3] In-Reply-To: References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: On Tue, 15 Mar 2022 00:26:14 GMT, Dean Long wrote: >> This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > rename max_stubs --> max_aligned_stubs, update copyright year Thanks Aleksey and Tobias. 
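Why a large CodeEntryAlignment needs extra stub space can be shown with a rough upper bound. The thread does not say how the fix sizes the buffer, so the following is only an assumed model: each aligned stub may waste up to `alignment - 1` bytes, and the names `align_up` and `worst_case_padding` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

inline bool is_power_of_two(std::size_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// Rounds `value` up to the next multiple of `alignment` (a power of two).
inline std::size_t align_up(std::size_t value, std::size_t alignment) {
  assert(is_power_of_two(alignment));
  return (value + alignment - 1) & ~(alignment - 1);
}

// Upper bound on the bytes lost to padding when `aligned_stub_count` stubs
// each start on an `alignment` boundary. Assumed model, not the actual fix.
inline std::size_t worst_case_padding(std::size_t aligned_stub_count,
                                      std::size_t alignment) {
  assert(is_power_of_two(alignment));
  return aligned_stub_count * (alignment - 1);
}
```

Under this model, 10 aligned stubs with an alignment of 1024 can lose up to 10230 bytes to padding, while the same stubs with an alignment of 16 lose at most 150 bytes.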
------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From dlong at openjdk.java.net Tue Mar 15 20:21:47 2022 From: dlong at openjdk.java.net (Dean Long) Date: Tue, 15 Mar 2022 20:21:47 GMT Subject: Integrated: 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" In-Reply-To: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> References: <4O-MhQ9Ymt-FKCA64KaxCqz6T_joJp5shOSgNl0IYF8=.051d8d11-0f6a-40e7-9efa-b3b6f4803a3c@github.com> Message-ID: <1la3OcwkNNBZX2Uv5m-covHN9VLf6qxfLfua-NSSgY0=.d5df9368-ba52-447b-8dd2-bf4a1309edf9@github.com> On Sat, 12 Mar 2022 23:28:43 GMT, Dean Long wrote: > This change adds extra stub space for large values of CodeEntryAlignment, and it changes the test to try large values of CodeEntryAlignment. This pull request has now been integrated. Changeset: 1465ea98 Author: Dean Long URL: https://git.openjdk.java.net/jdk/commit/1465ea98b7736b5960a8b546ccc366c3e3260bdd Stats: 17 lines in 2 files changed: 14 ins; 0 del; 3 mod 8282355: compiler/arguments/TestCodeEntryAlignment.java failed "guarantee(sect->end() <= tend) failed: sanity" Reviewed-by: jiefu, thartmann, shade ------------- PR: https://git.openjdk.java.net/jdk/pull/7800 From ccheung at openjdk.java.net Tue Mar 15 23:43:52 2022 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Tue, 15 Mar 2022 23:43:52 GMT Subject: RFR: 8253495: CDS generates non-deterministic output [v8] In-Reply-To: References: Message-ID: On Tue, 15 Mar 2022 17:08:27 GMT, Ioi Lam wrote: >> This patch makes the result of "java -Xshare:dump" deterministic: >> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp >> - Fixed a problem in hashtable ordering in heapShared.cpp >> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it. 
>> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh >> >> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828). >> >> Testing under way: >> - tier1~tier5 >> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc). > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Avoid memset twice in os::malloc(); added comments about NMTPreInit::handle_malloc vs DumpSharedSpaces Marked as reviewed by ccheung (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7748 From duke at openjdk.java.net Wed Mar 16 01:28:04 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Wed, 16 Mar 2022 01:28:04 GMT Subject: RFR: 8283232: x86: Improve vector broadcast operations Message-ID: Hi, This patch improves code generation for broadcasting a scalar in several ways: - Avoid a potential data bypass delay, which can be observed on some platforms, by using the correct type of instruction when doing so does not require extra instructions. - As has been pointed out, dumping the whole vector into the constant table is costly in terms of code size; this patch minimises that overhead for vector replicates of constants. Also, options are available for constants to be generated with more alignment so that vector loads can be performed efficiently without crossing cache lines. - Vector broadcasting should prefer rematerialising to spilling when register pressure is high. This patch also removes some redundant code paths and renames some incorrectly named instructions. Thank you very much.
------------- Commit messages: - fix - fix - minor changes - rematerialize - improve - fix - initial commit Changes: https://git.openjdk.java.net/jdk/pull/7832/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283232 Stats: 400 lines in 12 files changed: 237 ins; 79 del; 84 mod Patch: https://git.openjdk.java.net/jdk/pull/7832.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7832/head:pull/7832 PR: https://git.openjdk.java.net/jdk/pull/7832 From yyang at openjdk.java.net Wed Mar 16 02:46:56 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 16 Mar 2022 02:46:56 GMT Subject: RFR: 8283147: Include NonJavaThread stacktrace during thread dump Message-ID: When we use jcmd Thread.dump/jstack, we can dump the stack traces of all Java threads, but unfortunately we cannot print the stack traces of NonJavaThreads such as VMThread, GCWorker, etc. For these threads, the jstack output tells us nothing about what they are doing or whether they are blocked on a pthread condition. An alternative is to use pstack, which internally attaches to the target process and runs `thread apply all bt`; this introduces more overhead and is much more dangerous. ====== JStack Output (Current) ...
"ApplicationImpl pooled thread 441" #1478 prio=4 os_prio=31 cpu=11.71ms elapsed=60.30s tid=0x00007f8d32171000 nid=0x22f23 waiting on condition [0x0000700010d5d000] java.lang.Thread.State: TIMED_WAITING (parking) at jdk.internal.misc.Unsafe.park(java.base at 11.0.11/Native Method) - parking to wait for <0x00000007af851760> (a java.util.concurrent.SynchronousQueue$TransferStack) at java.util.concurrent.locks.LockSupport.parkNanos(java.base at 11.0.11/LockSupport.java:234) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.base at 11.0.11/SynchronousQueue.java:462) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.base at 11.0.11/SynchronousQueue.java:361) at java.util.concurrent.SynchronousQueue.poll(java.base at 11.0.11/SynchronousQueue.java:937) at java.util.concurrent.ThreadPoolExecutor.getTask(java.base at 11.0.11/ThreadPoolExecutor.java:1053) at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 11.0.11/ThreadPoolExecutor.java:1114) at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 11.0.11/ThreadPoolExecutor.java:628) at java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(java.base at 11.0.11/Executors.java:668) at java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(java.base at 11.0.11/Executors.java:665) at java.security.AccessController.doPrivileged(java.base at 11.0.11/Native Method) at java.util.concurrent.Executors$PrivilegedThreadFactory$1.run(java.base at 11.0.11/Executors.java:665) at java.lang.Thread.run(java.base at 11.0.11/Thread.java:829) "VM Thread" os_prio=31 cpu=31205.83ms elapsed=154131.15s tid=0x00007f8d49046000 nid=0x4703 runnable "GC Thread#0" os_prio=31 cpu=3811.96ms elapsed=154131.18s tid=0x00007f8d49809800 nid=0x3603 runnable "GC Thread#1" os_prio=31 cpu=3749.09ms elapsed=154130.24s tid=0x00007f8d4a9b3000 nid=0x6103 runnable "GC Thread#2" os_prio=31 cpu=3745.73ms elapsed=154129.74s tid=0x00007f8d48249000 nid=0x12f27 runnable "GC Thread#3" os_prio=31 
cpu=3692.77ms elapsed=154129.74s tid=0x00007f8d48b93000 nid=0xe50b runnable "GC Thread#4" os_prio=31 cpu=3728.57ms elapsed=154129.74s tid=0x00007f8d47b0b000 nid=0xe603 runnable "GC Thread#5" os_prio=31 cpu=3726.08ms elapsed=154129.74s tid=0x00007f8d47afc800 nid=0xe803 runnable "GC Thread#6" os_prio=31 cpu=3660.35ms elapsed=154129.02s tid=0x00007f8d48de5800 nid=0x15d2f runnable "GC Thread#7" os_prio=31 cpu=3676.68ms elapsed=154129.02s tid=0x00007f8d48dc4800 nid=0x16103 runnable "GC Thread#8" os_prio=31 cpu=3676.15ms elapsed=154128.31s tid=0x00007f8d4849d800 nid=0x1f503 runnable "GC Thread#9" os_prio=31 cpu=3570.95ms elapsed=154128.31s tid=0x00007f8d494ab000 nid=0x1f303 runnable "CMS Main Thread" os_prio=31 cpu=6715.33ms elapsed=154131.18s tid=0x00007f8d4780f800 nid=0x4b03 runnable "CMS Thread#0" os_prio=31 cpu=2429.86ms elapsed=154131.18s tid=0x00007f8d4900e000 nid=0x3703 runnable "CMS Thread#1" os_prio=31 cpu=2422.35ms elapsed=154129.72s tid=0x00007f8d4d044000 nid=0x11a03 runnable "CMS Thread#2" os_prio=31 cpu=2418.81ms elapsed=154129.72s tid=0x00007f8d48b93800 nid=0xea03 runnable "VM Periodic Task Thread" os_prio=31 cpu=10658.80ms elapsed=154130.41s tid=0x00007f8d49035000 nid=0xa003 waiting on condition JNI global refs: 660, weak refs: 1217 Most of the above information about non-Java threads is of little use for further debugging. I think we can extend this functionality, e.g.
add a new flag such as DumpAllThreadStackTrace, to print non-Java thread stack traces: ====== JStack Output (Modified) 2022-03-16 10:33:49 Full thread dump OpenJDK 64-Bit Server VM (19-internal-adhoc.qingfengyy.jdktip mixed mode, sharing): Threads class SMR info: _java_thread_list=0x00007fc0900015f0, length=15, elements={ 0x00007fc1281babd0, 0x00007fc1281bc190, 0x00007fc1281c21e0, 0x00007fc1281c36b0, 0x00007fc1281c4bc0, 0x00007fc1281c6740, 0x00007fc1281c7dc0, 0x00007fc1281c9340, 0x00007fc1281f8a70, 0x00007fc128209120, 0x00007fc12823ea50, 0x00007fc12823fb80, 0x00007fc128246f20, 0x00007fc1280255c0, 0x00007fc090000be0 } "Reference Handler" #2 daemon prio=10 os_prio=0 cpu=0.11ms elapsed=12.11s tid=0x00007fc1281babd0 nid=91395 waiting on condition [0x00007fc0b6877000] java.lang.Thread.State: RUNNABLE at java.lang.ref.Reference.waitForReferencePendingList(java.base at 19-internal/Native Method) at java.lang.ref.Reference.processPendingReferences(java.base at 19-internal/Reference.java:253) at java.lang.ref.Reference$ReferenceHandler.run(java.base at 19-internal/Reference.java:215) "Finalizer" #3 daemon prio=8 os_prio=0 cpu=0.31ms elapsed=12.11s tid=0x00007fc1281bc190 nid=91396 in Object.wait() [0x00007fc0b6776000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(java.base at 19-internal/Native Method) - waiting on <0x00000000a00027c0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(java.base at 19-internal/ReferenceQueue.java:155) - locked <0x00000000a00027c0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(java.base at 19-internal/ReferenceQueue.java:176) at java.lang.ref.Finalizer$FinalizerThread.run(java.base at 19-internal/Finalizer.java:183) "Signal Dispatcher" #4 daemon prio=9 os_prio=0 cpu=0.29ms elapsed=12.10s tid=0x00007fc1281c21e0 nid=91397 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Service Thread" #5 daemon prio=9 os_prio=0 cpu=0.07ms
elapsed=12.10s tid=0x00007fc1281c36b0 nid=91398 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Monitor Deflation Thread" #6 daemon prio=9 os_prio=0 cpu=0.17ms elapsed=12.10s tid=0x00007fc1281c4bc0 nid=91399 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE "C2 CompilerThread0" #7 daemon prio=9 os_prio=0 cpu=19.05ms elapsed=12.10s tid=0x00007fc1281c6740 nid=91400 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE No compile task "C1 CompilerThread0" #19 daemon prio=9 os_prio=0 cpu=38.05ms elapsed=12.10s tid=0x00007fc1281c7dc0 nid=91401 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE No compile task "Sweeper thread" #25 daemon prio=9 os_prio=0 cpu=0.04ms elapsed=12.10s tid=0x00007fc1281c9340 nid=91402 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Notification Thread" #26 daemon prio=9 os_prio=0 cpu=0.06ms elapsed=12.09s tid=0x00007fc1281f8a70 nid=91409 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Common-Cleaner" #27 daemon prio=8 os_prio=0 cpu=0.12ms elapsed=12.08s tid=0x00007fc128209120 nid=91413 in Object.wait() [0x00007fc0b5948000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(java.base at 19-internal/Native Method) - waiting on <0x00000000a010aba8> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(java.base at 19-internal/ReferenceQueue.java:155) - locked <0x00000000a010aba8> (a java.lang.ref.ReferenceQueue$Lock) at jdk.internal.ref.CleanerImpl.run(java.base at 19-internal/CleanerImpl.java:140) at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) at jdk.internal.misc.InnocuousThread.run(java.base at 19-internal/InnocuousThread.java:162) "server-timer" #28 daemon prio=5 os_prio=0 cpu=0.23ms elapsed=12.03s tid=0x00007fc12823ea50 nid=91418 in Object.wait() [0x00007fc0b5847000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(java.base at 
19-internal/Native Method) - waiting on <0x00000000a01ff400> (a java.util.TaskQueue) at java.util.TimerThread.mainLoop(java.base at 19-internal/Timer.java:563) - locked <0x00000000a01ff400> (a java.util.TaskQueue) at java.util.TimerThread.run(java.base at 19-internal/Timer.java:516) "server-timer1" #29 daemon prio=5 os_prio=0 cpu=0.81ms elapsed=12.03s tid=0x00007fc12823fb80 nid=91419 in Object.wait() [0x00007fc0b5746000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(java.base at 19-internal/Native Method) - waiting on <0x00000000a01ffa98> (a java.util.TaskQueue) at java.util.TimerThread.mainLoop(java.base at 19-internal/Timer.java:563) - locked <0x00000000a01ffa98> (a java.util.TaskQueue) at java.util.TimerThread.run(java.base at 19-internal/Timer.java:516) "HTTP-Dispatcher" #30 prio=5 os_prio=0 cpu=0.98ms elapsed=12.02s tid=0x00007fc128246f20 nid=91421 runnable [0x00007fc0b5645000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.EPoll.wait(java.base at 19-internal/Native Method) at sun.nio.ch.EPollSelectorImpl.doSelect(java.base at 19-internal/EPollSelectorImpl.java:118) at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base at 19-internal/SelectorImpl.java:129) - locked <0x00000000a01fc928> (a sun.nio.ch.Util$2) - locked <0x00000000a01fc598> (a sun.nio.ch.EPollSelectorImpl) at sun.nio.ch.SelectorImpl.select(java.base at 19-internal/SelectorImpl.java:141) at sun.net.httpserver.ServerImpl$Dispatcher.run(jdk.httpserver at 19-internal/ServerImpl.java:373) at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) "DestroyJavaVM" #31 prio=5 os_prio=0 cpu=108.15ms elapsed=12.01s tid=0x00007fc1280255c0 nid=91386 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Attach Listener" #32 daemon prio=9 os_prio=0 cpu=0.18ms elapsed=0.10s tid=0x00007fc090000be0 nid=91781 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "VM Thread" os_prio=0 cpu=0.63ms elapsed=12.11s 
tid=0x00007fc1281b7900 nid=91394 runnable Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0xf21da5] VM_PrintThreads::doit()+0x25 V [libjvm.so+0xf2248a] VM_Operation::evaluate()+0xea V [libjvm.so+0xf23df8] VMThread::evaluate_operation(VM_Operation*)+0xb8 V [libjvm.so+0xf244a7] VMThread::inner_execute(VM_Operation*)+0x3a7 V [libjvm.so+0xf24757] VMThread::run()+0xb7 V [libjvm.so+0xe99680] Thread::call_run()+0xc0 V [libjvm.so+0xc379f8] thread_native_entry(Thread*)+0xd8 "GC Thread#0" os_prio=0 cpu=0.11ms elapsed=12.12s tid=0x00007fc128066a20 nid=91387 runnable Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0xf4a977] WorkerThreads::threads_do(ThreadClosure*) const+0x37 V [libjvm.so+0x70cf16] G1CollectedHeap::gc_threads_do(ThreadClosure*) const+0x16 V [libjvm.so+0xe98624] Threads::print_on(outputStream*, bool, bool, bool, bool)+0x4a4 V [libjvm.so+0xf21da5] VM_PrintThreads::doit()+0x25 V [libjvm.so+0xf2248a] VM_Operation::evaluate()+0xea V [libjvm.so+0xf23df8] VMThread::evaluate_operation(VM_Operation*)+0xb8 V [libjvm.so+0xf244a7] VMThread::inner_execute(VM_Operation*)+0x3a7 V [libjvm.so+0xf24757] VMThread::run()+0xb7 V [libjvm.so+0xe99680] Thread::call_run()+0xc0 V [libjvm.so+0xc379f8] thread_native_entry(Thread*)+0xd8 "G1 Main Marker" os_prio=0 cpu=0.03ms elapsed=12.12s tid=0x00007fc128076ee0 nid=91388 runnable Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x70cf26] G1CollectedHeap::gc_threads_do(ThreadClosure*) const+0x26 V [libjvm.so+0xe98624] Threads::print_on(outputStream*, bool, bool, bool, bool)+0x4a4 V [libjvm.so+0xf21da5] VM_PrintThreads::doit()+0x25 V [libjvm.so+0xf2248a] VM_Operation::evaluate()+0xea V [libjvm.so+0xf23df8] VMThread::evaluate_operation(VM_Operation*)+0xb8 V [libjvm.so+0xf244a7] VMThread::inner_execute(VM_Operation*)+0x3a7 V [libjvm.so+0xf24757] VMThread::run()+0xb7 V [libjvm.so+0xe99680] 
Thread::call_run()+0xc0 V [libjvm.so+0xc379f8] thread_native_entry(Thread*)+0xd8 "G1 Conc#0" os_prio=0 cpu=0.02ms elapsed=12.12s tid=0x00007fc128077f60 nid=91389 runnable Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0xf4a977] WorkerThreads::threads_do(ThreadClosure*) const+0x37 V [libjvm.so+0x70cf35] G1CollectedHeap::gc_threads_do(ThreadClosure*) const+0x35 V [libjvm.so+0xe98624] Threads::print_on(outputStream*, bool, bool, bool, bool)+0x4a4 V [libjvm.so+0xf21da5] VM_PrintThreads::doit()+0x25 V [libjvm.so+0xf2248a] VM_Operation::evaluate()+0xea V [libjvm.so+0xf23df8] VMThread::evaluate_operation(VM_Operation*)+0xb8 V [libjvm.so+0xf244a7] VMThread::inner_execute(VM_Operation*)+0x3a7 V [libjvm.so+0xf24757] VMThread::run()+0xb7 V [libjvm.so+0xe99680] Thread::call_run()+0xc0 V [libjvm.so+0xc379f8] thread_native_entry(Thread*)+0xd8 "G1 Refine#0" os_prio=0 cpu=0.03ms elapsed=12.13s tid=0x00007fc1281881e0 nid=91390 runnable Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x7275c8] G1ConcurrentRefine::threads_do(ThreadClosure*)+0x78 V [libjvm.so+0x70cf44] G1CollectedHeap::gc_threads_do(ThreadClosure*) const+0x44 V [libjvm.so+0xe98624] Threads::print_on(outputStream*, bool, bool, bool, bool)+0x4a4 V [libjvm.so+0xf21da5] VM_PrintThreads::doit()+0x25 V [libjvm.so+0xf2248a] VM_Operation::evaluate()+0xea V [libjvm.so+0xf23df8] VMThread::evaluate_operation(VM_Operation*)+0xb8 V [libjvm.so+0xf244a7] VMThread::inner_execute(VM_Operation*)+0x3a7 V [libjvm.so+0xf24757] VMThread::run()+0xb7 V [libjvm.so+0xe99680] Thread::call_run()+0xc0 V [libjvm.so+0xc379f8] thread_native_entry(Thread*)+0xd8 "G1 Service" os_prio=0 cpu=0.71ms elapsed=12.13s tid=0x00007fc1281892a0 nid=91391 runnable Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0xe98624] Threads::print_on(outputStream*, bool, bool, bool, bool)+0x4a4 V [libjvm.so+0xf21da5] 
VM_PrintThreads::doit()+0x25 V [libjvm.so+0xf2248a] VM_Operation::evaluate()+0xea V [libjvm.so+0xf23df8] VMThread::evaluate_operation(VM_Operation*)+0xb8 V [libjvm.so+0xf244a7] VMThread::inner_execute(VM_Operation*)+0x3a7 V [libjvm.so+0xf24757] VMThread::run()+0xb7 V [libjvm.so+0xe99680] Thread::call_run()+0xc0 V [libjvm.so+0xc379f8] thread_native_entry(Thread*)+0xd8 "VM Periodic Task Thread" os_prio=0 cpu=2.22ms elapsed=12.10s tid=0x00007fc1281fa560 nid=91410 waiting on condition Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0xf21da5] VM_PrintThreads::doit()+0x25 V [libjvm.so+0xf2248a] VM_Operation::evaluate()+0xea V [libjvm.so+0xf23df8] VMThread::evaluate_operation(VM_Operation*)+0xb8 V [libjvm.so+0xf244a7] VMThread::inner_execute(VM_Operation*)+0x3a7 V [libjvm.so+0xf24757] VMThread::run()+0xb7 V [libjvm.so+0xe99680] Thread::call_run()+0xc0 V [libjvm.so+0xc379f8] thread_native_entry(Thread*)+0xd8 JNI global refs: 28, weak refs: 0 ------------- Commit messages: - 8283147: Include NonJavaThread stacktrace during thread dump Changes: https://git.openjdk.java.net/jdk/pull/7833/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7833&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283147 Stats: 46 lines in 4 files changed: 39 ins; 6 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7833.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7833/head:pull/7833 PR: https://git.openjdk.java.net/jdk/pull/7833 From yyang at openjdk.java.net Wed Mar 16 02:52:21 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 16 Mar 2022 02:52:21 GMT Subject: RFR: 8283147: Include NonJavaThread stacktrace during thread dump [v2] In-Reply-To: References: Message-ID: <5bBWHAl5WIKEP7bmCJ3XhpnCknt63mKuZ1scZpCfjsk=.2522ecd1-9980-4fcb-b3c3-dc558aaf5b41@github.com> > When we use jcmd Thread.dump/jstack , we could dump all Java thread stack trace, but unfortunately we are not able to print NonJavaThread stack 
trace such as VMThread/GCWorker, etc. For these threads, we know nothing about what are they doing/are they blocked in pthread condition from jstack output. An alternative is to use pstack, it internally attaches destination process and uses `thread apply all bt`, which introduces more overhead and much more dangerous. > > ====== JStack Ouput(Currrent) > > ... > "ApplicationImpl pooled thread 441" #1478 prio=4 os_prio=31 cpu=11.71ms elapsed=60.30s tid=0x00007f8d32171000 nid=0x22f23 waiting on condition [0x0000700010d5d000] > java.lang.Thread.State: TIMED_WAITING (parking) > at jdk.internal.misc.Unsafe.park(java.base at 11.0.11/Native Method) > - parking to wait for <0x00000007af851760> (a java.util.concurrent.SynchronousQueue$TransferStack) > at java.util.concurrent.locks.LockSupport.parkNanos(java.base at 11.0.11/LockSupport.java:234) > at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.base at 11.0.11/SynchronousQueue.java:462) > at java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.base at 11.0.11/SynchronousQueue.java:361) > at java.util.concurrent.SynchronousQueue.poll(java.base at 11.0.11/SynchronousQueue.java:937) > at java.util.concurrent.ThreadPoolExecutor.getTask(java.base at 11.0.11/ThreadPoolExecutor.java:1053) > at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 11.0.11/ThreadPoolExecutor.java:1114) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 11.0.11/ThreadPoolExecutor.java:628) > at java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(java.base at 11.0.11/Executors.java:668) > at java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(java.base at 11.0.11/Executors.java:665) > at java.security.AccessController.doPrivileged(java.base at 11.0.11/Native Method) > at java.util.concurrent.Executors$PrivilegedThreadFactory$1.run(java.base at 11.0.11/Executors.java:665) > at java.lang.Thread.run(java.base at 11.0.11/Thread.java:829) > > "VM Thread" os_prio=31 
cpu=31205.83ms elapsed=154131.15s tid=0x00007f8d49046000 nid=0x4703 runnable > > "GC Thread#0" os_prio=31 cpu=3811.96ms elapsed=154131.18s tid=0x00007f8d49809800 nid=0x3603 runnable > > "GC Thread#1" os_prio=31 cpu=3749.09ms elapsed=154130.24s tid=0x00007f8d4a9b3000 nid=0x6103 runnable > > "GC Thread#2" os_prio=31 cpu=3745.73ms elapsed=154129.74s tid=0x00007f8d48249000 nid=0x12f27 runnable > > "GC Thread#3" os_prio=31 cpu=3692.77ms elapsed=154129.74s tid=0x00007f8d48b93000 nid=0xe50b runnable > > "GC Thread#4" os_prio=31 cpu=3728.57ms elapsed=154129.74s tid=0x00007f8d47b0b000 nid=0xe603 runnable > > "GC Thread#5" os_prio=31 cpu=3726.08ms elapsed=154129.74s tid=0x00007f8d47afc800 nid=0xe803 runnable > > "GC Thread#6" os_prio=31 cpu=3660.35ms elapsed=154129.02s tid=0x00007f8d48de5800 nid=0x15d2f runnable > > "GC Thread#7" os_prio=31 cpu=3676.68ms elapsed=154129.02s tid=0x00007f8d48dc4800 nid=0x16103 runnable > > "GC Thread#8" os_prio=31 cpu=3676.15ms elapsed=154128.31s tid=0x00007f8d4849d800 nid=0x1f503 runnable > > "GC Thread#9" os_prio=31 cpu=3570.95ms elapsed=154128.31s tid=0x00007f8d494ab000 nid=0x1f303 runnable > > "CMS Main Thread" os_prio=31 cpu=6715.33ms elapsed=154131.18s tid=0x00007f8d4780f800 nid=0x4b03 runnable > > "CMS Thread#0" os_prio=31 cpu=2429.86ms elapsed=154131.18s tid=0x00007f8d4900e000 nid=0x3703 runnable > > "CMS Thread#1" os_prio=31 cpu=2422.35ms elapsed=154129.72s tid=0x00007f8d4d044000 nid=0x11a03 runnable > > "CMS Thread#2" os_prio=31 cpu=2418.81ms elapsed=154129.72s tid=0x00007f8d48b93800 nid=0xea03 runnable > > "VM Periodic Task Thread" os_prio=31 cpu=10658.80ms elapsed=154130.41s tid=0x00007f8d49035000 nid=0xa003 waiting on condition > > JNI global refs: 660, weak refs: 1217 > > > Most of above information makes no sense for further debugging. I think we can extend this functionality, e.g. 
add a new flag such as DumpAllThreadStackTrace, to print non java thread stack trace: > > ====== JStack Ouput(Modified) > > 2022-03-16 10:46:55 > Full thread dump OpenJDK 64-Bit Server VM (19-internal-adhoc.qingfengyy.jdktip mixed mode, sharing): > > Threads class SMR info: > _java_thread_list=0x00007f15040015f0, length=22, elements={ > 0x00007f159c0255b0, 0x00007f159c1babc0, 0x00007f159c1bc180, 0x00007f159c1c21d0, > 0x00007f159c1c36a0, 0x00007f159c1c4bb0, 0x00007f159c1c6730, 0x00007f159c1c7db0, > 0x00007f159c1c9330, 0x00007f159c1fc300, 0x00007f159c211a60, 0x00007f159c213b60, > 0x00007f159c302960, 0x00007f14cc0319d0, 0x00007f14cc0375c0, 0x00007f159c307e80, > 0x00007f159c30db30, 0x00007f159c3e6db0, 0x00007f159c647300, 0x00007f159c64b600, > 0x00007f159c678910, 0x00007f1504000be0 > } > > "main" #1 prio=5 os_prio=0 cpu=766.48ms elapsed=23.73s tid=0x00007f159c0255b0 nid=115919 in Object.wait() [0x00007f15a3e58000] > java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native Method) > - waiting on > at jdk.internal.org.jline.utils.NonBlockingInputStreamImpl.read(jdk.internal.le at 19-internal/NonBlockingInputStreamImpl.java:139) > - locked <0x00000000a2000368> (a jdk.internal.jshell.tool.ConsoleIOContext$1) > at jdk.internal.org.jline.utils.NonBlockingInputStream.read(jdk.internal.le at 19-internal/NonBlockingInputStream.java:62) > at jdk.internal.org.jline.utils.NonBlocking$NonBlockingInputStreamReader.read(jdk.internal.le at 19-internal/NonBlocking.java:168) > at jdk.internal.org.jline.utils.NonBlockingReader.read(jdk.internal.le at 19-internal/NonBlockingReader.java:57) > at jdk.internal.org.jline.keymap.BindingReader.readCharacter(jdk.internal.le at 19-internal/BindingReader.java:160) > at jdk.internal.org.jline.keymap.BindingReader.readBinding(jdk.internal.le at 19-internal/BindingReader.java:110) > at jdk.internal.org.jline.keymap.BindingReader.readBinding(jdk.internal.le at 
19-internal/BindingReader.java:61) > at jdk.internal.org.jline.reader.impl.LineReaderImpl.doReadBinding(jdk.internal.le at 19-internal/LineReaderImpl.java:923) > at jdk.internal.org.jline.reader.impl.LineReaderImpl.readBinding(jdk.internal.le at 19-internal/LineReaderImpl.java:956) > at jdk.internal.jshell.tool.ConsoleIOContext$2.readBinding(jdk.jshell at 19-internal/ConsoleIOContext.java:173) > at jdk.internal.org.jline.reader.impl.LineReaderImpl.readLine(jdk.internal.le at 19-internal/LineReaderImpl.java:651) > at jdk.internal.org.jline.reader.impl.LineReaderImpl.readLine(jdk.internal.le at 19-internal/LineReaderImpl.java:468) > at jdk.internal.jshell.tool.ConsoleIOContext.readLine(jdk.jshell at 19-internal/ConsoleIOContext.java:249) > at jdk.internal.jshell.tool.JShellTool.getInput(jdk.jshell at 19-internal/JShellTool.java:1281) > at jdk.internal.jshell.tool.JShellTool.run(jdk.jshell at 19-internal/JShellTool.java:1215) > at jdk.internal.jshell.tool.JShellTool.start(jdk.jshell at 19-internal/JShellTool.java:1001) > at jdk.internal.jshell.tool.JShellToolBuilder.start(jdk.jshell at 19-internal/JShellToolBuilder.java:261) > at jdk.internal.jshell.tool.JShellToolProvider.main(jdk.jshell at 19-internal/JShellToolProvider.java:120) > > "Reference Handler" #2 daemon prio=10 os_prio=0 cpu=1.04ms elapsed=23.72s tid=0x00007f159c1babc0 nid=115926 waiting on condition [0x00007f1529302000] > java.lang.Thread.State: RUNNABLE > at java.lang.ref.Reference.waitForReferencePendingList(java.base at 19-internal/Native Method) > at java.lang.ref.Reference.processPendingReferences(java.base at 19-internal/Reference.java:253) > at java.lang.ref.Reference$ReferenceHandler.run(java.base at 19-internal/Reference.java:215) > > "Finalizer" #3 daemon prio=8 os_prio=0 cpu=0.35ms elapsed=23.72s tid=0x00007f159c1bc180 nid=115927 in Object.wait() [0x00007f1529201000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native Method) > - 
waiting on <0x00000000a00002e8> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(java.base at 19-internal/ReferenceQueue.java:155) > - locked <0x00000000a00002e8> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(java.base at 19-internal/ReferenceQueue.java:176) > at java.lang.ref.Finalizer$FinalizerThread.run(java.base at 19-internal/Finalizer.java:183) > > "Signal Dispatcher" #4 daemon prio=9 os_prio=0 cpu=0.24ms elapsed=23.71s tid=0x00007f159c1c21d0 nid=115928 waiting on condition [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > > "Service Thread" #5 daemon prio=9 os_prio=0 cpu=0.20ms elapsed=23.71s tid=0x00007f159c1c36a0 nid=115929 runnable [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > > "Monitor Deflation Thread" #6 daemon prio=9 os_prio=0 cpu=0.24ms elapsed=23.71s tid=0x00007f159c1c4bb0 nid=115930 runnable [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > > "C2 CompilerThread0" #7 daemon prio=9 os_prio=0 cpu=527.71ms elapsed=23.71s tid=0x00007f159c1c6730 nid=115931 waiting on condition [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > No compile task > > "C1 CompilerThread0" #19 daemon prio=9 os_prio=0 cpu=216.87ms elapsed=23.71s tid=0x00007f159c1c7db0 nid=115932 waiting on condition [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > No compile task > > "Sweeper thread" #25 daemon prio=9 os_prio=0 cpu=0.04ms elapsed=23.71s tid=0x00007f159c1c9330 nid=115933 runnable [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > > "Notification Thread" #26 daemon prio=9 os_prio=0 cpu=0.05ms elapsed=23.70s tid=0x00007f159c1fc300 nid=115936 runnable [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > > "Common-Cleaner" #27 daemon prio=8 os_prio=0 cpu=0.99ms elapsed=23.68s tid=0x00007f159c211a60 nid=115938 in Object.wait() [0x00007f15282d2000] > java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native 
Method) > - waiting on > at java.lang.ref.ReferenceQueue.remove(java.base at 19-internal/ReferenceQueue.java:155) > - locked <0x00000000a0000628> (a java.lang.ref.ReferenceQueue$Lock) > at jdk.internal.ref.CleanerImpl.run(java.base at 19-internal/CleanerImpl.java:140) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > at jdk.internal.misc.InnocuousThread.run(java.base at 19-internal/InnocuousThread.java:162) > > "Timer-0" #28 daemon prio=5 os_prio=0 cpu=0.10ms elapsed=23.68s tid=0x00007f159c213b60 nid=115939 in Object.wait() [0x00007f15281d1000] > java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native Method) > - waiting on <0x00000000a00007e8> (a java.util.TaskQueue) > at java.util.TimerThread.mainLoop(java.base at 19-internal/Timer.java:563) > - locked <0x00000000a00007e8> (a java.util.TaskQueue) > at java.util.TimerThread.run(java.base at 19-internal/Timer.java:516) > > "process reaper" #30 daemon prio=10 os_prio=0 cpu=0.17ms elapsed=23.53s tid=0x00007f159c302960 nid=115944 runnable [0x00007f15a3c9e000] > java.lang.Thread.State: RUNNABLE > at java.lang.ProcessHandleImpl.waitForProcessExit0(java.base at 19-internal/Native Method) > at java.lang.ProcessHandleImpl$1.run(java.base at 19-internal/ProcessHandleImpl.java:147) > at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 19-internal/ThreadPoolExecutor.java:1136) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 19-internal/ThreadPoolExecutor.java:635) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "JDI Internal Event Handler" #34 daemon prio=5 os_prio=0 cpu=14.16ms elapsed=23.49s tid=0x00007f14cc0319d0 nid=115971 in Object.wait() [0x00007f15096b7000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native Method) > - waiting on > at java.lang.Object.wait(java.base at 19-internal/Object.java:338) > at 
com.sun.tools.jdi.EventQueueImpl.removeUnfiltered(jdk.jdi at 19-internal/EventQueueImpl.java:190) > - locked <0x00000000a0000e80> (a com.sun.tools.jdi.EventQueueImpl) > at com.sun.tools.jdi.EventQueueImpl.removeInternal(jdk.jdi at 19-internal/EventQueueImpl.java:125) > at com.sun.tools.jdi.InternalEventHandler.run(jdk.jdi at 19-internal/InternalEventHandler.java:61) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "JDI Target VM Interface" #33 daemon prio=5 os_prio=0 cpu=16.02ms elapsed=23.49s tid=0x00007f14cc0375c0 nid=115972 runnable [0x00007f15095b6000] > java.lang.Thread.State: RUNNABLE > at sun.nio.ch.SocketDispatcher.read0(java.base at 19-internal/Native Method) > at sun.nio.ch.SocketDispatcher.read(java.base at 19-internal/SocketDispatcher.java:47) > at sun.nio.ch.NioSocketImpl.tryRead(java.base at 19-internal/NioSocketImpl.java:258) > at sun.nio.ch.NioSocketImpl.implRead(java.base at 19-internal/NioSocketImpl.java:309) > at sun.nio.ch.NioSocketImpl.read(java.base at 19-internal/NioSocketImpl.java:347) > at sun.nio.ch.NioSocketImpl$1.read(java.base at 19-internal/NioSocketImpl.java:800) > at java.net.Socket$SocketInputStream.read(java.base at 19-internal/Socket.java:966) > at java.net.Socket$SocketInputStream.read(java.base at 19-internal/Socket.java:961) > at com.sun.tools.jdi.SocketConnection.readPacket(jdk.jdi at 19-internal/SocketConnection.java:82) > - locked <0x00000000a00012e0> (a java.lang.Object) > at com.sun.tools.jdi.TargetVM.run(jdk.jdi at 19-internal/TargetVM.java:123) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "event-handler" #35 daemon prio=5 os_prio=0 cpu=8.29ms elapsed=23.48s tid=0x00007f159c307e80 nid=115973 in Object.wait() [0x00007f15098b9000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native Method) > - waiting on > at java.lang.Object.wait(java.base at 19-internal/Object.java:338) > at 
com.sun.tools.jdi.EventQueueImpl.removeUnfiltered(jdk.jdi at 19-internal/EventQueueImpl.java:190) > - locked <0x00000000a0001e98> (a com.sun.tools.jdi.EventQueueImpl) > at com.sun.tools.jdi.EventQueueImpl.remove(jdk.jdi at 19-internal/EventQueueImpl.java:97) > at com.sun.tools.jdi.EventQueueImpl.remove(jdk.jdi at 19-internal/EventQueueImpl.java:83) > at jdk.jshell.execution.JdiEventHandler.run(jdk.jshell at 19-internal/JdiEventHandler.java:79) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "output reader" #36 daemon prio=5 os_prio=0 cpu=0.30ms elapsed=23.44s tid=0x00007f159c30db30 nid=115990 runnable [0x00007f15097b8000] > java.lang.Thread.State: RUNNABLE > at sun.nio.ch.SocketDispatcher.read0(java.base at 19-internal/Native Method) > at sun.nio.ch.SocketDispatcher.read(java.base at 19-internal/SocketDispatcher.java:47) > at sun.nio.ch.NioSocketImpl.tryRead(java.base at 19-internal/NioSocketImpl.java:258) > at sun.nio.ch.NioSocketImpl.implRead(java.base at 19-internal/NioSocketImpl.java:309) > at sun.nio.ch.NioSocketImpl.read(java.base at 19-internal/NioSocketImpl.java:347) > at sun.nio.ch.NioSocketImpl$1.read(java.base at 19-internal/NioSocketImpl.java:800) > at java.net.Socket$SocketInputStream.read(java.base at 19-internal/Socket.java:966) > at java.net.Socket$SocketInputStream.read(java.base at 19-internal/Socket.java:961) > at java.io.FilterInputStream.read(java.base at 19-internal/FilterInputStream.java:79) > at jdk.jshell.execution.DemultiplexInput.run(jdk.jshell at 19-internal/DemultiplexInput.java:60) > > "Thread-1" #38 daemon prio=5 os_prio=0 cpu=222.19ms elapsed=23.37s tid=0x00007f159c3e6db0 nid=115995 waiting on condition [0x00007f15094b5000] > java.lang.Thread.State: WAITING (parking) > at jdk.internal.misc.Unsafe.park(java.base at 19-internal/Native Method) > - parking to wait for <0x00000000a007a880> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at 
java.util.concurrent.locks.LockSupport.park(java.base at 19-internal/LockSupport.java:341) > at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(java.base at 19-internal/AbstractQueuedSynchronizer.java:506) > at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base at 19-internal/ForkJoinPool.java:3464) > at java.util.concurrent.ForkJoinPool.managedBlock(java.base at 19-internal/ForkJoinPool.java:3435) > at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base at 19-internal/AbstractQueuedSynchronizer.java:1623) > at java.util.concurrent.LinkedBlockingQueue.take(java.base at 19-internal/LinkedBlockingQueue.java:435) > at java.util.concurrent.ThreadPoolExecutor.getTask(java.base at 19-internal/ThreadPoolExecutor.java:1062) > at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 19-internal/ThreadPoolExecutor.java:1122) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 19-internal/ThreadPoolExecutor.java:635) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "process reaper" #40 daemon prio=10 os_prio=0 cpu=1.83ms elapsed=22.94s tid=0x00007f159c647300 nid=116044 waiting on condition [0x00007f15280d0000] > java.lang.Thread.State: TIMED_WAITING (parking) > at jdk.internal.misc.Unsafe.park(java.base at 19-internal/Native Method) > - parking to wait for <0x00000000a0079fb0> (a java.util.concurrent.SynchronousQueue$TransferStack) > at java.util.concurrent.locks.LockSupport.parkNanos(java.base at 19-internal/LockSupport.java:252) > at java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.base at 19-internal/SynchronousQueue.java:401) > at java.util.concurrent.SynchronousQueue.poll(java.base at 19-internal/SynchronousQueue.java:903) > at java.util.concurrent.ThreadPoolExecutor.getTask(java.base at 19-internal/ThreadPoolExecutor.java:1061) > at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 19-internal/ThreadPoolExecutor.java:1122) > 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 19-internal/ThreadPoolExecutor.java:635) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "Thread-3" #41 daemon prio=5 os_prio=0 cpu=1.39ms elapsed=22.94s tid=0x00007f159c64b600 nid=116045 runnable [0x00007f14734ed000] > java.lang.Thread.State: RUNNABLE > at java.io.FileInputStream.read0(java.base at 19-internal/Native Method) > at java.io.FileInputStream.read(java.base at 19-internal/FileInputStream.java:228) > at jdk.internal.org.jline.terminal.impl.AbstractPty$PtyInputStream.read(jdk.internal.le at 19-internal/AbstractPty.java:73) > at jdk.internal.org.jline.utils.NonBlockingInputStream.read(jdk.internal.le at 19-internal/NonBlockingInputStream.java:62) > at jdk.internal.jshell.tool.StopDetectingInputStream.lambda$setInputStream$0(jdk.jshell at 19-internal/StopDetectingInputStream.java:74) > at jdk.internal.jshell.tool.StopDetectingInputStream$$Lambda$355/0x0000000800dd68d0.run(jdk.jshell at 19-internal/Unknown Source) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "null non blocking reader thread" #43 daemon prio=5 os_prio=0 cpu=0.15ms elapsed=22.78s tid=0x00007f159c678910 nid=116063 in Object.wait() [0x00007f1472ee6000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native Method) > - waiting on > at java.lang.Object.wait(java.base at 19-internal/Object.java:338) > at jdk.internal.jshell.tool.StopDetectingInputStream.read(jdk.jshell at 19-internal/StopDetectingInputStream.java:111) > - locked <0x00000000a2000908> (a jdk.internal.jshell.tool.StopDetectingInputStream) > at jdk.internal.org.jline.utils.NonBlockingInputStreamImpl.run(jdk.internal.le at 19-internal/NonBlockingInputStreamImpl.java:216) > at jdk.internal.org.jline.utils.NonBlockingInputStreamImpl$$Lambda$530/0x0000000800dfde40.run(jdk.internal.le at 19-internal/Unknown Source) > at java.lang.Thread.run(java.base at 
19-internal/Thread.java:828) > > "Attach Listener" #44 daemon prio=9 os_prio=0 cpu=0.17ms elapsed=0.10s tid=0x00007f1504000be0 nid=116383 waiting on condition [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > > "VM Thread" os_prio=0 cpu=8.31ms elapsed=23.72s tid=0x00007f159c1b78f0 nid=115925 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0xf21ec5] VM_PrintThreads::doit()+0x25 > V [libjvm.so+0xf225aa] VM_Operation::evaluate()+0xea > V [libjvm.so+0xf23f18] VMThread::evaluate_operation(VM_Operation*)+0xb8 > V [libjvm.so+0xf245c7] VMThread::inner_execute(VM_Operation*)+0x3a7 > V [libjvm.so+0xf24877] VMThread::run()+0xb7 > V [libjvm.so+0xe99770] Thread::call_run()+0xc0 > V [libjvm.so+0xc37a08] thread_native_entry(Thread*)+0xd8 > > "GC Thread#0" os_prio=0 cpu=39.19ms elapsed=23.73s tid=0x00007f159c066a10 nid=115920 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#1" os_prio=0 cpu=9.75ms elapsed=23.16s tid=0x00007f15180140e0 nid=116014 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#2" os_prio=0 cpu=0.48ms elapsed=23.15s tid=0x00007f151801b730 nid=116015 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#3" os_prio=0 cpu=11.09ms elapsed=23.15s tid=0x00007f151801c160 nid=116016 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#4" os_prio=0 cpu=24.95ms elapsed=23.15s tid=0x00007f151801ccc0 nid=116017 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > 
> "GC Thread#5" os_prio=0 cpu=0.58ms elapsed=23.15s tid=0x00007f151801d820 nid=116018 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#6" os_prio=0 cpu=35.58ms elapsed=23.15s tid=0x00007f151801e380 nid=116019 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#7" os_prio=0 cpu=13.88ms elapsed=22.98s tid=0x00007f151801b050 nid=116042 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#8" os_prio=0 cpu=12.97ms elapsed=22.92s tid=0x00007f1518021950 nid=116047 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Main Marker" os_prio=0 cpu=0.84ms elapsed=23.73s tid=0x00007f159c076ed0 nid=115921 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xba33] pthread_cond_wait+0xc3 > V [libjvm.so+0xbf1589] Monitor::wait_without_safepoint_check(long)+0x39 > V [libjvm.so+0x725fda] G1ConcurrentMarkThread::wait_for_next_cycle()+0x3a > V [libjvm.so+0x7270bb] G1ConcurrentMarkThread::run_service()+0xdb > V [libjvm.so+0x60be0b] ConcurrentGCThread::run()+0x1b > V [libjvm.so+0xe99770] Thread::call_run()+0xc0 > V [libjvm.so+0xc37a08] thread_native_entry(Thread*)+0xd8 > > "G1 Conc#0" os_prio=0 cpu=6.39ms elapsed=23.73s tid=0x00007f159c077f50 nid=115922 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#1" os_prio=0 cpu=1.74ms elapsed=23.05s tid=0x00007f152c000960 nid=116022 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C 
[libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#2" os_prio=0 cpu=1.23ms elapsed=23.05s tid=0x00007f152c001490 nid=116023 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#3" os_prio=0 cpu=2.04ms elapsed=23.05s tid=0x00007f152c001ff0 nid=116024 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#4" os_prio=0 cpu=1.28ms elapsed=23.05s tid=0x00007f152c002b50 nid=116025 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#5" os_prio=0 cpu=1.60ms elapsed=23.05s tid=0x00007f152c0036b0 nid=116026 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#6" os_prio=0 cpu=1.61ms elapsed=23.05s tid=0x00007f152c004210 nid=116027 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#7" os_prio=0 cpu=2.00ms elapsed=23.05s tid=0x00007f152c005160 nid=116028 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#8" os_prio=0 cpu=1.58ms elapsed=23.05s tid=0x00007f152c0060b0 nid=116029 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#9" os_prio=0 cpu=1.26ms elapsed=23.05s tid=0x00007f152c007000 nid=116030 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#10" os_prio=0 cpu=1.61ms 
elapsed=23.05s tid=0x00007f152c007f50 nid=116031 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#11" os_prio=0 cpu=1.52ms elapsed=23.05s tid=0x00007f152c008ea0 nid=116032 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#12" os_prio=0 cpu=1.17ms elapsed=23.05s tid=0x00007f152c009df0 nid=116033 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#13" os_prio=0 cpu=1.88ms elapsed=23.05s tid=0x00007f152c00ad40 nid=116034 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#14" os_prio=0 cpu=1.31ms elapsed=23.05s tid=0x00007f152c00bc90 nid=116035 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#15" os_prio=0 cpu=1.52ms elapsed=23.05s tid=0x00007f152c00cbe0 nid=116036 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Refine#0" os_prio=0 cpu=0.03ms elapsed=23.73s tid=0x00007f159c1881d0 nid=115923 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Service" os_prio=0 cpu=0.72ms elapsed=23.73s tid=0x00007f159c189290 nid=115924 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xbde2] pthread_cond_timedwait+0x132 > V [libjvm.so+0xbf1589] Monitor::wait_without_safepoint_check(long)+0x39 > V [libjvm.so+0x77d208] 
G1ServiceThread::wait_for_task()+0xf8 > V [libjvm.so+0x77d600] G1ServiceThread::run_service()+0x20 > V [libjvm.so+0x60be0b] ConcurrentGCThread::run()+0x1b > V [libjvm.so+0xe99770] Thread::call_run()+0xc0 > V [libjvm.so+0xc37a08] thread_native_entry(Thread*)+0xd8 > > "VM Periodic Task Thread" os_prio=0 cpu=4.05ms elapsed=23.71s tid=0x00007f159c1fddf0 nid=115937 waiting on condition > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xbde2] pthread_cond_timedwait+0x132 > V [libjvm.so+0xbf1589] Monitor::wait_without_safepoint_check(long)+0x39 > V [libjvm.so+0xc07015] WatcherThread::sleep() const+0xa5 > V [libjvm.so+0xc070e5] WatcherThread::run()+0x35 > V [libjvm.so+0xe99770] Thread::call_run()+0xc0 > V [libjvm.so+0xc37a08] thread_native_entry(Thread*)+0xd8 > > JNI global refs: 28, weak refs: 0 Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: 8283147: Include NonJavaThread stacktrace during thread dump ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7833/files - new: https://git.openjdk.java.net/jdk/pull/7833/files/862c6189..ab1b879c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7833&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7833&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7833.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7833/head:pull/7833 PR: https://git.openjdk.java.net/jdk/pull/7833 From yyang at openjdk.java.net Wed Mar 16 02:57:20 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 16 Mar 2022 02:57:20 GMT Subject: RFR: 8283147: Include NonJavaThread stacktrace during thread dump [v3] In-Reply-To: References: Message-ID: > When we use jcmd Thread.dump/jstack, we can dump the stack traces of all Java threads, but unfortunately we cannot print the stack traces of NonJavaThreads such as the VMThread or GC workers. For these threads, the jstack output tells us nothing about what they are doing or whether they are blocked on a pthread condition. An alternative is pstack, which internally attaches to the target process and runs `thread apply all bt`; this introduces more overhead and is much more dangerous. > > ====== JStack Output (Current) > > ...
> "ApplicationImpl pooled thread 441" #1478 prio=4 os_prio=31 cpu=11.71ms elapsed=60.30s tid=0x00007f8d32171000 nid=0x22f23 waiting on condition [0x0000700010d5d000] > java.lang.Thread.State: TIMED_WAITING (parking) > at jdk.internal.misc.Unsafe.park(java.base at 11.0.11/Native Method) > - parking to wait for <0x00000007af851760> (a java.util.concurrent.SynchronousQueue$TransferStack) > at java.util.concurrent.locks.LockSupport.parkNanos(java.base at 11.0.11/LockSupport.java:234) > at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.base at 11.0.11/SynchronousQueue.java:462) > at java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.base at 11.0.11/SynchronousQueue.java:361) > at java.util.concurrent.SynchronousQueue.poll(java.base at 11.0.11/SynchronousQueue.java:937) > at java.util.concurrent.ThreadPoolExecutor.getTask(java.base at 11.0.11/ThreadPoolExecutor.java:1053) > at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 11.0.11/ThreadPoolExecutor.java:1114) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 11.0.11/ThreadPoolExecutor.java:628) > at java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(java.base at 11.0.11/Executors.java:668) > at java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(java.base at 11.0.11/Executors.java:665) > at java.security.AccessController.doPrivileged(java.base at 11.0.11/Native Method) > at java.util.concurrent.Executors$PrivilegedThreadFactory$1.run(java.base at 11.0.11/Executors.java:665) > at java.lang.Thread.run(java.base at 11.0.11/Thread.java:829) > > "VM Thread" os_prio=31 cpu=31205.83ms elapsed=154131.15s tid=0x00007f8d49046000 nid=0x4703 runnable > > "GC Thread#0" os_prio=31 cpu=3811.96ms elapsed=154131.18s tid=0x00007f8d49809800 nid=0x3603 runnable > > "GC Thread#1" os_prio=31 cpu=3749.09ms elapsed=154130.24s tid=0x00007f8d4a9b3000 nid=0x6103 runnable > > "GC Thread#2" os_prio=31 cpu=3745.73ms elapsed=154129.74s 
tid=0x00007f8d48249000 nid=0x12f27 runnable > > "GC Thread#3" os_prio=31 cpu=3692.77ms elapsed=154129.74s tid=0x00007f8d48b93000 nid=0xe50b runnable > > "GC Thread#4" os_prio=31 cpu=3728.57ms elapsed=154129.74s tid=0x00007f8d47b0b000 nid=0xe603 runnable > > "GC Thread#5" os_prio=31 cpu=3726.08ms elapsed=154129.74s tid=0x00007f8d47afc800 nid=0xe803 runnable > > "GC Thread#6" os_prio=31 cpu=3660.35ms elapsed=154129.02s tid=0x00007f8d48de5800 nid=0x15d2f runnable > > "GC Thread#7" os_prio=31 cpu=3676.68ms elapsed=154129.02s tid=0x00007f8d48dc4800 nid=0x16103 runnable > > "GC Thread#8" os_prio=31 cpu=3676.15ms elapsed=154128.31s tid=0x00007f8d4849d800 nid=0x1f503 runnable > > "GC Thread#9" os_prio=31 cpu=3570.95ms elapsed=154128.31s tid=0x00007f8d494ab000 nid=0x1f303 runnable > > "CMS Main Thread" os_prio=31 cpu=6715.33ms elapsed=154131.18s tid=0x00007f8d4780f800 nid=0x4b03 runnable > > "CMS Thread#0" os_prio=31 cpu=2429.86ms elapsed=154131.18s tid=0x00007f8d4900e000 nid=0x3703 runnable > > "CMS Thread#1" os_prio=31 cpu=2422.35ms elapsed=154129.72s tid=0x00007f8d4d044000 nid=0x11a03 runnable > > "CMS Thread#2" os_prio=31 cpu=2418.81ms elapsed=154129.72s tid=0x00007f8d48b93800 nid=0xea03 runnable > > "VM Periodic Task Thread" os_prio=31 cpu=10658.80ms elapsed=154130.41s tid=0x00007f8d49035000 nid=0xa003 waiting on condition > > JNI global refs: 660, weak refs: 1217 > > > Most of the information above is of little use for further debugging. I think we can extend this functionality, e.g.
add a new flag such as DumpAllThreadStackTrace, to print non-Java thread stack traces: > > ====== JStack Output (Modified) > > 2022-03-16 10:46:55 > Full thread dump OpenJDK 64-Bit Server VM (19-internal-adhoc.qingfengyy.jdktip mixed mode, sharing): > > Threads class SMR info: > _java_thread_list=0x00007f15040015f0, length=22, elements={ > 0x00007f159c0255b0, 0x00007f159c1babc0, 0x00007f159c1bc180, 0x00007f159c1c21d0, > 0x00007f159c1c36a0, 0x00007f159c1c4bb0, 0x00007f159c1c6730, 0x00007f159c1c7db0, > 0x00007f159c1c9330, 0x00007f159c1fc300, 0x00007f159c211a60, 0x00007f159c213b60, > 0x00007f159c302960, 0x00007f14cc0319d0, 0x00007f14cc0375c0, 0x00007f159c307e80, > 0x00007f159c30db30, 0x00007f159c3e6db0, 0x00007f159c647300, 0x00007f159c64b600, > 0x00007f159c678910, 0x00007f1504000be0 > } > > "main" #1 prio=5 os_prio=0 cpu=766.48ms elapsed=23.73s tid=0x00007f159c0255b0 nid=115919 in Object.wait() [0x00007f15a3e58000] > java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native Method) > - waiting on <no object reference available> > at jdk.internal.org.jline.utils.NonBlockingInputStreamImpl.read(jdk.internal.le at 19-internal/NonBlockingInputStreamImpl.java:139) > - locked <0x00000000a2000368> (a jdk.internal.jshell.tool.ConsoleIOContext$1) > at jdk.internal.org.jline.utils.NonBlockingInputStream.read(jdk.internal.le at 19-internal/NonBlockingInputStream.java:62) > at jdk.internal.org.jline.utils.NonBlocking$NonBlockingInputStreamReader.read(jdk.internal.le at 19-internal/NonBlocking.java:168) > at jdk.internal.org.jline.utils.NonBlockingReader.read(jdk.internal.le at 19-internal/NonBlockingReader.java:57) > at jdk.internal.org.jline.keymap.BindingReader.readCharacter(jdk.internal.le at 19-internal/BindingReader.java:160) > at jdk.internal.org.jline.keymap.BindingReader.readBinding(jdk.internal.le at 19-internal/BindingReader.java:110) > at jdk.internal.org.jline.keymap.BindingReader.readBinding(jdk.internal.le at
19-internal/BindingReader.java:61) > at jdk.internal.org.jline.reader.impl.LineReaderImpl.doReadBinding(jdk.internal.le at 19-internal/LineReaderImpl.java:923) > at jdk.internal.org.jline.reader.impl.LineReaderImpl.readBinding(jdk.internal.le at 19-internal/LineReaderImpl.java:956) > at jdk.internal.jshell.tool.ConsoleIOContext$2.readBinding(jdk.jshell at 19-internal/ConsoleIOContext.java:173) > at jdk.internal.org.jline.reader.impl.LineReaderImpl.readLine(jdk.internal.le at 19-internal/LineReaderImpl.java:651) > at jdk.internal.org.jline.reader.impl.LineReaderImpl.readLine(jdk.internal.le at 19-internal/LineReaderImpl.java:468) > at jdk.internal.jshell.tool.ConsoleIOContext.readLine(jdk.jshell at 19-internal/ConsoleIOContext.java:249) > at jdk.internal.jshell.tool.JShellTool.getInput(jdk.jshell at 19-internal/JShellTool.java:1281) > at jdk.internal.jshell.tool.JShellTool.run(jdk.jshell at 19-internal/JShellTool.java:1215) > at jdk.internal.jshell.tool.JShellTool.start(jdk.jshell at 19-internal/JShellTool.java:1001) > at jdk.internal.jshell.tool.JShellToolBuilder.start(jdk.jshell at 19-internal/JShellToolBuilder.java:261) > at jdk.internal.jshell.tool.JShellToolProvider.main(jdk.jshell at 19-internal/JShellToolProvider.java:120) > > "Reference Handler" #2 daemon prio=10 os_prio=0 cpu=1.04ms elapsed=23.72s tid=0x00007f159c1babc0 nid=115926 waiting on condition [0x00007f1529302000] > java.lang.Thread.State: RUNNABLE > at java.lang.ref.Reference.waitForReferencePendingList(java.base at 19-internal/Native Method) > at java.lang.ref.Reference.processPendingReferences(java.base at 19-internal/Reference.java:253) > at java.lang.ref.Reference$ReferenceHandler.run(java.base at 19-internal/Reference.java:215) > > "Finalizer" #3 daemon prio=8 os_prio=0 cpu=0.35ms elapsed=23.72s tid=0x00007f159c1bc180 nid=115927 in Object.wait() [0x00007f1529201000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native Method) > - 
waiting on <0x00000000a00002e8> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(java.base at 19-internal/ReferenceQueue.java:155) > - locked <0x00000000a00002e8> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(java.base at 19-internal/ReferenceQueue.java:176) > at java.lang.ref.Finalizer$FinalizerThread.run(java.base at 19-internal/Finalizer.java:183) > > "Signal Dispatcher" #4 daemon prio=9 os_prio=0 cpu=0.24ms elapsed=23.71s tid=0x00007f159c1c21d0 nid=115928 waiting on condition [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > > "Service Thread" #5 daemon prio=9 os_prio=0 cpu=0.20ms elapsed=23.71s tid=0x00007f159c1c36a0 nid=115929 runnable [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > > "Monitor Deflation Thread" #6 daemon prio=9 os_prio=0 cpu=0.24ms elapsed=23.71s tid=0x00007f159c1c4bb0 nid=115930 runnable [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > > "C2 CompilerThread0" #7 daemon prio=9 os_prio=0 cpu=527.71ms elapsed=23.71s tid=0x00007f159c1c6730 nid=115931 waiting on condition [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > No compile task > > "C1 CompilerThread0" #19 daemon prio=9 os_prio=0 cpu=216.87ms elapsed=23.71s tid=0x00007f159c1c7db0 nid=115932 waiting on condition [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > No compile task > > "Sweeper thread" #25 daemon prio=9 os_prio=0 cpu=0.04ms elapsed=23.71s tid=0x00007f159c1c9330 nid=115933 runnable [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > > "Notification Thread" #26 daemon prio=9 os_prio=0 cpu=0.05ms elapsed=23.70s tid=0x00007f159c1fc300 nid=115936 runnable [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > > "Common-Cleaner" #27 daemon prio=8 os_prio=0 cpu=0.99ms elapsed=23.68s tid=0x00007f159c211a60 nid=115938 in Object.wait() [0x00007f15282d2000] > java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native 
Method) > - waiting on > at java.lang.ref.ReferenceQueue.remove(java.base at 19-internal/ReferenceQueue.java:155) > - locked <0x00000000a0000628> (a java.lang.ref.ReferenceQueue$Lock) > at jdk.internal.ref.CleanerImpl.run(java.base at 19-internal/CleanerImpl.java:140) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > at jdk.internal.misc.InnocuousThread.run(java.base at 19-internal/InnocuousThread.java:162) > > "Timer-0" #28 daemon prio=5 os_prio=0 cpu=0.10ms elapsed=23.68s tid=0x00007f159c213b60 nid=115939 in Object.wait() [0x00007f15281d1000] > java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native Method) > - waiting on <0x00000000a00007e8> (a java.util.TaskQueue) > at java.util.TimerThread.mainLoop(java.base at 19-internal/Timer.java:563) > - locked <0x00000000a00007e8> (a java.util.TaskQueue) > at java.util.TimerThread.run(java.base at 19-internal/Timer.java:516) > > "process reaper" #30 daemon prio=10 os_prio=0 cpu=0.17ms elapsed=23.53s tid=0x00007f159c302960 nid=115944 runnable [0x00007f15a3c9e000] > java.lang.Thread.State: RUNNABLE > at java.lang.ProcessHandleImpl.waitForProcessExit0(java.base at 19-internal/Native Method) > at java.lang.ProcessHandleImpl$1.run(java.base at 19-internal/ProcessHandleImpl.java:147) > at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 19-internal/ThreadPoolExecutor.java:1136) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 19-internal/ThreadPoolExecutor.java:635) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "JDI Internal Event Handler" #34 daemon prio=5 os_prio=0 cpu=14.16ms elapsed=23.49s tid=0x00007f14cc0319d0 nid=115971 in Object.wait() [0x00007f15096b7000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native Method) > - waiting on > at java.lang.Object.wait(java.base at 19-internal/Object.java:338) > at 
com.sun.tools.jdi.EventQueueImpl.removeUnfiltered(jdk.jdi at 19-internal/EventQueueImpl.java:190) > - locked <0x00000000a0000e80> (a com.sun.tools.jdi.EventQueueImpl) > at com.sun.tools.jdi.EventQueueImpl.removeInternal(jdk.jdi at 19-internal/EventQueueImpl.java:125) > at com.sun.tools.jdi.InternalEventHandler.run(jdk.jdi at 19-internal/InternalEventHandler.java:61) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "JDI Target VM Interface" #33 daemon prio=5 os_prio=0 cpu=16.02ms elapsed=23.49s tid=0x00007f14cc0375c0 nid=115972 runnable [0x00007f15095b6000] > java.lang.Thread.State: RUNNABLE > at sun.nio.ch.SocketDispatcher.read0(java.base at 19-internal/Native Method) > at sun.nio.ch.SocketDispatcher.read(java.base at 19-internal/SocketDispatcher.java:47) > at sun.nio.ch.NioSocketImpl.tryRead(java.base at 19-internal/NioSocketImpl.java:258) > at sun.nio.ch.NioSocketImpl.implRead(java.base at 19-internal/NioSocketImpl.java:309) > at sun.nio.ch.NioSocketImpl.read(java.base at 19-internal/NioSocketImpl.java:347) > at sun.nio.ch.NioSocketImpl$1.read(java.base at 19-internal/NioSocketImpl.java:800) > at java.net.Socket$SocketInputStream.read(java.base at 19-internal/Socket.java:966) > at java.net.Socket$SocketInputStream.read(java.base at 19-internal/Socket.java:961) > at com.sun.tools.jdi.SocketConnection.readPacket(jdk.jdi at 19-internal/SocketConnection.java:82) > - locked <0x00000000a00012e0> (a java.lang.Object) > at com.sun.tools.jdi.TargetVM.run(jdk.jdi at 19-internal/TargetVM.java:123) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "event-handler" #35 daemon prio=5 os_prio=0 cpu=8.29ms elapsed=23.48s tid=0x00007f159c307e80 nid=115973 in Object.wait() [0x00007f15098b9000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native Method) > - waiting on > at java.lang.Object.wait(java.base at 19-internal/Object.java:338) > at 
com.sun.tools.jdi.EventQueueImpl.removeUnfiltered(jdk.jdi at 19-internal/EventQueueImpl.java:190) > - locked <0x00000000a0001e98> (a com.sun.tools.jdi.EventQueueImpl) > at com.sun.tools.jdi.EventQueueImpl.remove(jdk.jdi at 19-internal/EventQueueImpl.java:97) > at com.sun.tools.jdi.EventQueueImpl.remove(jdk.jdi at 19-internal/EventQueueImpl.java:83) > at jdk.jshell.execution.JdiEventHandler.run(jdk.jshell at 19-internal/JdiEventHandler.java:79) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "output reader" #36 daemon prio=5 os_prio=0 cpu=0.30ms elapsed=23.44s tid=0x00007f159c30db30 nid=115990 runnable [0x00007f15097b8000] > java.lang.Thread.State: RUNNABLE > at sun.nio.ch.SocketDispatcher.read0(java.base at 19-internal/Native Method) > at sun.nio.ch.SocketDispatcher.read(java.base at 19-internal/SocketDispatcher.java:47) > at sun.nio.ch.NioSocketImpl.tryRead(java.base at 19-internal/NioSocketImpl.java:258) > at sun.nio.ch.NioSocketImpl.implRead(java.base at 19-internal/NioSocketImpl.java:309) > at sun.nio.ch.NioSocketImpl.read(java.base at 19-internal/NioSocketImpl.java:347) > at sun.nio.ch.NioSocketImpl$1.read(java.base at 19-internal/NioSocketImpl.java:800) > at java.net.Socket$SocketInputStream.read(java.base at 19-internal/Socket.java:966) > at java.net.Socket$SocketInputStream.read(java.base at 19-internal/Socket.java:961) > at java.io.FilterInputStream.read(java.base at 19-internal/FilterInputStream.java:79) > at jdk.jshell.execution.DemultiplexInput.run(jdk.jshell at 19-internal/DemultiplexInput.java:60) > > "Thread-1" #38 daemon prio=5 os_prio=0 cpu=222.19ms elapsed=23.37s tid=0x00007f159c3e6db0 nid=115995 waiting on condition [0x00007f15094b5000] > java.lang.Thread.State: WAITING (parking) > at jdk.internal.misc.Unsafe.park(java.base at 19-internal/Native Method) > - parking to wait for <0x00000000a007a880> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at 
java.util.concurrent.locks.LockSupport.park(java.base at 19-internal/LockSupport.java:341) > at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(java.base at 19-internal/AbstractQueuedSynchronizer.java:506) > at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base at 19-internal/ForkJoinPool.java:3464) > at java.util.concurrent.ForkJoinPool.managedBlock(java.base at 19-internal/ForkJoinPool.java:3435) > at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base at 19-internal/AbstractQueuedSynchronizer.java:1623) > at java.util.concurrent.LinkedBlockingQueue.take(java.base at 19-internal/LinkedBlockingQueue.java:435) > at java.util.concurrent.ThreadPoolExecutor.getTask(java.base at 19-internal/ThreadPoolExecutor.java:1062) > at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 19-internal/ThreadPoolExecutor.java:1122) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 19-internal/ThreadPoolExecutor.java:635) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "process reaper" #40 daemon prio=10 os_prio=0 cpu=1.83ms elapsed=22.94s tid=0x00007f159c647300 nid=116044 waiting on condition [0x00007f15280d0000] > java.lang.Thread.State: TIMED_WAITING (parking) > at jdk.internal.misc.Unsafe.park(java.base at 19-internal/Native Method) > - parking to wait for <0x00000000a0079fb0> (a java.util.concurrent.SynchronousQueue$TransferStack) > at java.util.concurrent.locks.LockSupport.parkNanos(java.base at 19-internal/LockSupport.java:252) > at java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.base at 19-internal/SynchronousQueue.java:401) > at java.util.concurrent.SynchronousQueue.poll(java.base at 19-internal/SynchronousQueue.java:903) > at java.util.concurrent.ThreadPoolExecutor.getTask(java.base at 19-internal/ThreadPoolExecutor.java:1061) > at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 19-internal/ThreadPoolExecutor.java:1122) > 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 19-internal/ThreadPoolExecutor.java:635) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "Thread-3" #41 daemon prio=5 os_prio=0 cpu=1.39ms elapsed=22.94s tid=0x00007f159c64b600 nid=116045 runnable [0x00007f14734ed000] > java.lang.Thread.State: RUNNABLE > at java.io.FileInputStream.read0(java.base at 19-internal/Native Method) > at java.io.FileInputStream.read(java.base at 19-internal/FileInputStream.java:228) > at jdk.internal.org.jline.terminal.impl.AbstractPty$PtyInputStream.read(jdk.internal.le at 19-internal/AbstractPty.java:73) > at jdk.internal.org.jline.utils.NonBlockingInputStream.read(jdk.internal.le at 19-internal/NonBlockingInputStream.java:62) > at jdk.internal.jshell.tool.StopDetectingInputStream.lambda$setInputStream$0(jdk.jshell at 19-internal/StopDetectingInputStream.java:74) > at jdk.internal.jshell.tool.StopDetectingInputStream$$Lambda$355/0x0000000800dd68d0.run(jdk.jshell at 19-internal/Unknown Source) > at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) > > "null non blocking reader thread" #43 daemon prio=5 os_prio=0 cpu=0.15ms elapsed=22.78s tid=0x00007f159c678910 nid=116063 in Object.wait() [0x00007f1472ee6000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(java.base at 19-internal/Native Method) > - waiting on > at java.lang.Object.wait(java.base at 19-internal/Object.java:338) > at jdk.internal.jshell.tool.StopDetectingInputStream.read(jdk.jshell at 19-internal/StopDetectingInputStream.java:111) > - locked <0x00000000a2000908> (a jdk.internal.jshell.tool.StopDetectingInputStream) > at jdk.internal.org.jline.utils.NonBlockingInputStreamImpl.run(jdk.internal.le at 19-internal/NonBlockingInputStreamImpl.java:216) > at jdk.internal.org.jline.utils.NonBlockingInputStreamImpl$$Lambda$530/0x0000000800dfde40.run(jdk.internal.le at 19-internal/Unknown Source) > at java.lang.Thread.run(java.base at 
19-internal/Thread.java:828) > > "Attach Listener" #44 daemon prio=9 os_prio=0 cpu=0.17ms elapsed=0.10s tid=0x00007f1504000be0 nid=116383 waiting on condition [0x0000000000000000] > java.lang.Thread.State: RUNNABLE > > "VM Thread" os_prio=0 cpu=8.31ms elapsed=23.72s tid=0x00007f159c1b78f0 nid=115925 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0xf21ec5] VM_PrintThreads::doit()+0x25 > V [libjvm.so+0xf225aa] VM_Operation::evaluate()+0xea > V [libjvm.so+0xf23f18] VMThread::evaluate_operation(VM_Operation*)+0xb8 > V [libjvm.so+0xf245c7] VMThread::inner_execute(VM_Operation*)+0x3a7 > V [libjvm.so+0xf24877] VMThread::run()+0xb7 > V [libjvm.so+0xe99770] Thread::call_run()+0xc0 > V [libjvm.so+0xc37a08] thread_native_entry(Thread*)+0xd8 > > "GC Thread#0" os_prio=0 cpu=39.19ms elapsed=23.73s tid=0x00007f159c066a10 nid=115920 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#1" os_prio=0 cpu=9.75ms elapsed=23.16s tid=0x00007f15180140e0 nid=116014 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#2" os_prio=0 cpu=0.48ms elapsed=23.15s tid=0x00007f151801b730 nid=116015 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#3" os_prio=0 cpu=11.09ms elapsed=23.15s tid=0x00007f151801c160 nid=116016 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#4" os_prio=0 cpu=24.95ms elapsed=23.15s tid=0x00007f151801ccc0 nid=116017 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > 
> "GC Thread#5" os_prio=0 cpu=0.58ms elapsed=23.15s tid=0x00007f151801d820 nid=116018 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#6" os_prio=0 cpu=35.58ms elapsed=23.15s tid=0x00007f151801e380 nid=116019 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#7" os_prio=0 cpu=13.88ms elapsed=22.98s tid=0x00007f151801b050 nid=116042 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "GC Thread#8" os_prio=0 cpu=12.97ms elapsed=22.92s tid=0x00007f1518021950 nid=116047 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Main Marker" os_prio=0 cpu=0.84ms elapsed=23.73s tid=0x00007f159c076ed0 nid=115921 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xba33] pthread_cond_wait+0xc3 > V [libjvm.so+0xbf1589] Monitor::wait_without_safepoint_check(long)+0x39 > V [libjvm.so+0x725fda] G1ConcurrentMarkThread::wait_for_next_cycle()+0x3a > V [libjvm.so+0x7270bb] G1ConcurrentMarkThread::run_service()+0xdb > V [libjvm.so+0x60be0b] ConcurrentGCThread::run()+0x1b > V [libjvm.so+0xe99770] Thread::call_run()+0xc0 > V [libjvm.so+0xc37a08] thread_native_entry(Thread*)+0xd8 > > "G1 Conc#0" os_prio=0 cpu=6.39ms elapsed=23.73s tid=0x00007f159c077f50 nid=115922 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#1" os_prio=0 cpu=1.74ms elapsed=23.05s tid=0x00007f152c000960 nid=116022 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C 
[libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#2" os_prio=0 cpu=1.23ms elapsed=23.05s tid=0x00007f152c001490 nid=116023 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#3" os_prio=0 cpu=2.04ms elapsed=23.05s tid=0x00007f152c001ff0 nid=116024 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#4" os_prio=0 cpu=1.28ms elapsed=23.05s tid=0x00007f152c002b50 nid=116025 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#5" os_prio=0 cpu=1.60ms elapsed=23.05s tid=0x00007f152c0036b0 nid=116026 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#6" os_prio=0 cpu=1.61ms elapsed=23.05s tid=0x00007f152c004210 nid=116027 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#7" os_prio=0 cpu=2.00ms elapsed=23.05s tid=0x00007f152c005160 nid=116028 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#8" os_prio=0 cpu=1.58ms elapsed=23.05s tid=0x00007f152c0060b0 nid=116029 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#9" os_prio=0 cpu=1.26ms elapsed=23.05s tid=0x00007f152c007000 nid=116030 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#10" os_prio=0 cpu=1.61ms 
elapsed=23.05s tid=0x00007f152c007f50 nid=116031 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#11" os_prio=0 cpu=1.52ms elapsed=23.05s tid=0x00007f152c008ea0 nid=116032 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#12" os_prio=0 cpu=1.17ms elapsed=23.05s tid=0x00007f152c009df0 nid=116033 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#13" os_prio=0 cpu=1.88ms elapsed=23.05s tid=0x00007f152c00ad40 nid=116034 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#14" os_prio=0 cpu=1.31ms elapsed=23.05s tid=0x00007f152c00bc90 nid=116035 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Conc#15" os_prio=0 cpu=1.52ms elapsed=23.05s tid=0x00007f152c00cbe0 nid=116036 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Refine#0" os_prio=0 cpu=0.03ms elapsed=23.73s tid=0x00007f159c1881d0 nid=115923 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > "G1 Service" os_prio=0 cpu=0.72ms elapsed=23.73s tid=0x00007f159c189290 nid=115924 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xbde2] pthread_cond_timedwait+0x132 > V [libjvm.so+0xbf1589] Monitor::wait_without_safepoint_check(long)+0x39 > V [libjvm.so+0x77d208] 
G1ServiceThread::wait_for_task()+0xf8 > V [libjvm.so+0x77d600] G1ServiceThread::run_service()+0x20 > V [libjvm.so+0x60be0b] ConcurrentGCThread::run()+0x1b > V [libjvm.so+0xe99770] Thread::call_run()+0xc0 > V [libjvm.so+0xc37a08] thread_native_entry(Thread*)+0xd8 > > "VM Periodic Task Thread" os_prio=0 cpu=4.05ms elapsed=23.71s tid=0x00007f159c1fddf0 nid=115937 waiting on condition > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xbde2] pthread_cond_timedwait+0x132 > V [libjvm.so+0xbf1589] Monitor::wait_without_safepoint_check(long)+0x39 > V [libjvm.so+0xc07015] WatcherThread::sleep() const+0xa5 > V [libjvm.so+0xc070e5] WatcherThread::run()+0x35 > V [libjvm.so+0xe99770] Thread::call_run()+0xc0 > V [libjvm.so+0xc37a08] thread_native_entry(Thread*)+0xd8 > > JNI global refs: 28, weak refs: 0 Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision:

  8283147: Include NonJavaThread stacktrace during thread dump

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7833/files
  - new: https://git.openjdk.java.net/jdk/pull/7833/files/ab1b879c..5e19e852

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7833&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7833&range=01-02

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7833.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7833/head:pull/7833

PR: https://git.openjdk.java.net/jdk/pull/7833

From iklam at openjdk.java.net Wed Mar 16 03:17:42 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Wed, 16 Mar 2022 03:17:42 GMT
Subject: Integrated: 8253495: CDS generates non-deterministic output
In-Reply-To:
References:
Message-ID:

On Tue, 8 Mar 2022 19:11:02 GMT, Ioi Lam wrote:

> This patch makes the result of "java -Xshare:dump" deterministic:
> - Disabled new Java threads from launching. This is harmless. See comments in jvm.cpp
> - Fixed a problem in hashtable ordering in heapShared.cpp
> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random bits. Added code to zero it.
> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in make/scripts/compare.sh
>
> Note: $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. This will be fixed in [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).
>
> Testing under way:
> - tier1~tier5
> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, windows-x86-cmp-baseline, .... etc).

This pull request has now been integrated.
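
The point above about a gap in a hashtable entry holding random bits can be illustrated with a small sketch. This is not the HotSpot code: the `Entry` struct and the `init_entry` helper are hypothetical names, and the layout comment assumes a typical 64-bit ABI where a 4-byte hash is followed by 4 padding bytes. The idea is only that clearing the whole object before filling in its fields makes the bytes written to an archive reproducible.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical entry layout (an assumption, not the HotSpot class). On
// typical 64-bit ABIs the 4-byte hash is followed by 4 padding bytes so
// that the pointer is 8-byte aligned.
struct Entry {
    std::uint32_t hash;
    Entry*        next;
};

// Clears every byte of the entry first, so any padding holds zeros rather
// than whatever was previously in that memory, then fills in the fields.
inline void init_entry(Entry* e, std::uint32_t hash, Entry* next) {
    std::memset(e, 0, sizeof(Entry));
    e->hash = hash;
    e->next = next;
}
```

With this in place, two entries that carry the same field values should compare equal byte for byte, regardless of what the memory contained beforehand, which is the property a byte-wise comparison of dump output relies on.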
Changeset: de4f04cb
Author:    Ioi Lam
URL:       https://git.openjdk.java.net/jdk/commit/de4f04cb71a26ce03b96460cb8d1c1e28cd1ed38
Stats:     100 lines in 15 files changed: 69 ins; 9 del; 22 mod

8253495: CDS generates non-deterministic output

Reviewed-by: erikj, kbarrett, ccheung, ihse

-------------

PR: https://git.openjdk.java.net/jdk/pull/7748

From duke at openjdk.java.net Wed Mar 16 05:55:18 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Wed, 16 Mar 2022 05:55:18 GMT
Subject: RFR: 8283232: x86: Improve vector broadcast operations [v2]
In-Reply-To:
References:
Message-ID: <1FBk3MauXFxUsyHz9kuhqGI-CtLRgHYmHn1eyyaDLvs=.6d4d94b0-32a0-42dc-a181-87df8d8f3b65@github.com>

> Hi,
>
> This patch improves the generation of broadcasting a scalar in several ways:
>
> - Avoid potential data bypass delay which can be observed on some platforms by using the correct type of instruction if it does not require extra instructions.
> - As it has been pointed out, dumping the whole vector into the constant table is costly in terms of code size, this patch minimises this overhead for vector replicate of constants. Also, options are available for constants to be generated with more alignment so that vector load can be made efficiently without crossing cache lines.
> - Vector broadcasting should prefer rematerialising to spilling when register pressure is high.
>
> This patch also removes some redundant code paths and renames some incorrectly named instructions.
>
> Thank you very much.
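
The remark about aligning constants so that vector loads do not cross cache lines can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the patch: the helper names are hypothetical, and the 64-byte line size is an assumption that holds on many x86 parts but is not guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size in bytes (common on x86, but an assumption here).
constexpr std::size_t kCacheLineBytes = 64;

// Returns true if a load of `size` bytes starting at byte offset `offset`
// touches more than one cache line.
constexpr bool crosses_cache_line(std::uint64_t offset, std::size_t size) {
    if (size == 0) return false;
    return offset / kCacheLineBytes != (offset + size - 1) / kCacheLineBytes;
}

// Rounds `offset` up to the next multiple of `alignment`, which must be a
// power of two.
constexpr std::uint64_t align_up(std::uint64_t offset, std::uint64_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

For instance, a 32-byte constant placed at offset 48 spans bytes 48 to 79 and so straddles two lines, whereas rounding its offset up to a multiple of 32 moves it to offset 64, where it fits inside a single line. The cost is the padding bytes skipped in the constant table, which is presumably why the alignment is offered as an option rather than applied unconditionally.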
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:

  fix crash in sse

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7832/files
  - new: https://git.openjdk.java.net/jdk/pull/7832/files/0706aa56..8216d790

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=00-01

  Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7832.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7832/head:pull/7832

PR: https://git.openjdk.java.net/jdk/pull/7832

From yyang at openjdk.java.net Wed Mar 16 07:52:47 2022
From: yyang at openjdk.java.net (Yi Yang)
Date: Wed, 16 Mar 2022 07:52:47 GMT
Subject: RFR: 8283147: Include NonJavaThread stacktrace during thread dump [v3]
In-Reply-To:
References:
Message-ID:

On Wed, 16 Mar 2022 02:57:20 GMT, Yi Yang wrote:

>> When we use jcmd Thread.dump/jstack, we can dump all Java thread stack traces, but unfortunately we are not able to print NonJavaThread stack traces such as those of the VMThread, GC workers, etc. For these threads, the jstack output tells us nothing about what they are doing or whether they are blocked on a pthread condition. An alternative is to use pstack, which internally attaches to the target process and runs `thread apply all bt`; this introduces more overhead and is much more dangerous.
>>
>> ====== JStack Output(Current)
>>
>> ...
>> "ApplicationImpl pooled thread 441" #1478 prio=4 os_prio=31 cpu=11.71ms elapsed=60.30s tid=0x00007f8d32171000 nid=0x22f23 waiting on condition [0x0000700010d5d000] >> java.lang.Thread.State: TIMED_WAITING (parking) >> at jdk.internal.misc.Unsafe.park(java.base at 11.0.11/Native Method) >> - parking to wait for <0x00000007af851760> (a java.util.concurrent.SynchronousQueue$TransferStack) >> at java.util.concurrent.locks.LockSupport.parkNanos(java.base at 11.0.11/LockSupport.java:234) >> at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.base at 11.0.11/SynchronousQueue.java:462) >> at java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.base at 11.0.11/SynchronousQueue.java:361) >> at java.util.concurrent.SynchronousQueue.poll(java.base at 11.0.11/SynchronousQueue.java:937) >> at java.util.concurrent.ThreadPoolExecutor.getTask(java.base at 11.0.11/ThreadPoolExecutor.java:1053) >> at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 11.0.11/ThreadPoolExecutor.java:1114) >> at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 11.0.11/ThreadPoolExecutor.java:628) >> at java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(java.base at 11.0.11/Executors.java:668) >> at java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(java.base at 11.0.11/Executors.java:665) >> at java.security.AccessController.doPrivileged(java.base at 11.0.11/Native Method) >> at java.util.concurrent.Executors$PrivilegedThreadFactory$1.run(java.base at 11.0.11/Executors.java:665) >> at java.lang.Thread.run(java.base at 11.0.11/Thread.java:829) >> >> "VM Thread" os_prio=31 cpu=31205.83ms elapsed=154131.15s tid=0x00007f8d49046000 nid=0x4703 runnable >> >> "GC Thread#0" os_prio=31 cpu=3811.96ms elapsed=154131.18s tid=0x00007f8d49809800 nid=0x3603 runnable >> >> "GC Thread#1" os_prio=31 cpu=3749.09ms elapsed=154130.24s tid=0x00007f8d4a9b3000 nid=0x6103 runnable >> >> "GC Thread#2" os_prio=31 cpu=3745.73ms 
elapsed=154129.74s tid=0x00007f8d48249000 nid=0x12f27 runnable >> >> "GC Thread#3" os_prio=31 cpu=3692.77ms elapsed=154129.74s tid=0x00007f8d48b93000 nid=0xe50b runnable >> >> "GC Thread#4" os_prio=31 cpu=3728.57ms elapsed=154129.74s tid=0x00007f8d47b0b000 nid=0xe603 runnable >> >> "GC Thread#5" os_prio=31 cpu=3726.08ms elapsed=154129.74s tid=0x00007f8d47afc800 nid=0xe803 runnable >> >> "GC Thread#6" os_prio=31 cpu=3660.35ms elapsed=154129.02s tid=0x00007f8d48de5800 nid=0x15d2f runnable >> >> "GC Thread#7" os_prio=31 cpu=3676.68ms elapsed=154129.02s tid=0x00007f8d48dc4800 nid=0x16103 runnable >> >> "GC Thread#8" os_prio=31 cpu=3676.15ms elapsed=154128.31s tid=0x00007f8d4849d800 nid=0x1f503 runnable >> >> "GC Thread#9" os_prio=31 cpu=3570.95ms elapsed=154128.31s tid=0x00007f8d494ab000 nid=0x1f303 runnable >> >> "CMS Main Thread" os_prio=31 cpu=6715.33ms elapsed=154131.18s tid=0x00007f8d4780f800 nid=0x4b03 runnable >> >> "CMS Thread#0" os_prio=31 cpu=2429.86ms elapsed=154131.18s tid=0x00007f8d4900e000 nid=0x3703 runnable >> >> "CMS Thread#1" os_prio=31 cpu=2422.35ms elapsed=154129.72s tid=0x00007f8d4d044000 nid=0x11a03 runnable >> >> "CMS Thread#2" os_prio=31 cpu=2418.81ms elapsed=154129.72s tid=0x00007f8d48b93800 nid=0xea03 runnable >> >> "VM Periodic Task Thread" os_prio=31 cpu=10658.80ms elapsed=154130.41s tid=0x00007f8d49035000 nid=0xa003 waiting on condition >> >> JNI global refs: 660, weak refs: 1217 >> >> >> Most of above information makes no sense for further debugging. I think we can extend this functionality, e.g. 
add a new flag such as DumpAllThreadStackTrace, to print non java thread stack trace: >> >> ====== JStack Ouput(Modified) >> >> 2022-03-16 10:46:55 >> Full thread dump OpenJDK 64-Bit Server VM (19-internal-adhoc.qingfengyy.jdktip mixed mode, sharing): >> >> Threads class SMR info: >> _java_thread_list=0x00007f15040015f0, length=22, elements={ >> 0x00007f159c0255b0, 0x00007f159c1babc0, 0x00007f159c1bc180, 0x00007f159c1c21d0, >> 0x00007f159c1c36a0, 0x00007f159c1c4bb0, 0x00007f159c1c6730, 0x00007f159c1c7db0, >> 0x00007f159c1c9330, 0x00007f159c1fc300, 0x00007f159c211a60, 0x00007f159c213b60, >> 0x00007f159c302960, 0x00007f14cc0319d0, 0x00007f14cc0375c0, 0x00007f159c307e80, >> 0x00007f159c30db30, 0x00007f159c3e6db0, 0x00007f159c647300, 0x00007f159c64b600, >> 0x00007f159c678910, 0x00007f1504000be0 >> } >> >> "main" #1 prio=5 os_prio=0 cpu=766.48ms elapsed=23.73s tid=0x00007f159c0255b0 nid=115919 in Object.wait() [0x00007f15a3e58000] >> java.lang.Thread.State: TIMED_WAITING (on object monitor) >> at java.lang.Object.wait(java.base at 19-internal/Native Method) >> - waiting on >> at jdk.internal.org.jline.utils.NonBlockingInputStreamImpl.read(jdk.internal.le at 19-internal/NonBlockingInputStreamImpl.java:139) >> - locked <0x00000000a2000368> (a jdk.internal.jshell.tool.ConsoleIOContext$1) >> at jdk.internal.org.jline.utils.NonBlockingInputStream.read(jdk.internal.le at 19-internal/NonBlockingInputStream.java:62) >> at jdk.internal.org.jline.utils.NonBlocking$NonBlockingInputStreamReader.read(jdk.internal.le at 19-internal/NonBlocking.java:168) >> at jdk.internal.org.jline.utils.NonBlockingReader.read(jdk.internal.le at 19-internal/NonBlockingReader.java:57) >> at jdk.internal.org.jline.keymap.BindingReader.readCharacter(jdk.internal.le at 19-internal/BindingReader.java:160) >> at jdk.internal.org.jline.keymap.BindingReader.readBinding(jdk.internal.le at 19-internal/BindingReader.java:110) >> at jdk.internal.org.jline.keymap.BindingReader.readBinding(jdk.internal.le at 
19-internal/BindingReader.java:61) >> at jdk.internal.org.jline.reader.impl.LineReaderImpl.doReadBinding(jdk.internal.le at 19-internal/LineReaderImpl.java:923) >> at jdk.internal.org.jline.reader.impl.LineReaderImpl.readBinding(jdk.internal.le at 19-internal/LineReaderImpl.java:956) >> at jdk.internal.jshell.tool.ConsoleIOContext$2.readBinding(jdk.jshell at 19-internal/ConsoleIOContext.java:173) >> at jdk.internal.org.jline.reader.impl.LineReaderImpl.readLine(jdk.internal.le at 19-internal/LineReaderImpl.java:651) >> at jdk.internal.org.jline.reader.impl.LineReaderImpl.readLine(jdk.internal.le at 19-internal/LineReaderImpl.java:468) >> at jdk.internal.jshell.tool.ConsoleIOContext.readLine(jdk.jshell at 19-internal/ConsoleIOContext.java:249) >> at jdk.internal.jshell.tool.JShellTool.getInput(jdk.jshell at 19-internal/JShellTool.java:1281) >> at jdk.internal.jshell.tool.JShellTool.run(jdk.jshell at 19-internal/JShellTool.java:1215) >> at jdk.internal.jshell.tool.JShellTool.start(jdk.jshell at 19-internal/JShellTool.java:1001) >> at jdk.internal.jshell.tool.JShellToolBuilder.start(jdk.jshell at 19-internal/JShellToolBuilder.java:261) >> at jdk.internal.jshell.tool.JShellToolProvider.main(jdk.jshell at 19-internal/JShellToolProvider.java:120) >> >> "Reference Handler" #2 daemon prio=10 os_prio=0 cpu=1.04ms elapsed=23.72s tid=0x00007f159c1babc0 nid=115926 waiting on condition [0x00007f1529302000] >> java.lang.Thread.State: RUNNABLE >> at java.lang.ref.Reference.waitForReferencePendingList(java.base at 19-internal/Native Method) >> at java.lang.ref.Reference.processPendingReferences(java.base at 19-internal/Reference.java:253) >> at java.lang.ref.Reference$ReferenceHandler.run(java.base at 19-internal/Reference.java:215) >> >> "Finalizer" #3 daemon prio=8 os_prio=0 cpu=0.35ms elapsed=23.72s tid=0x00007f159c1bc180 nid=115927 in Object.wait() [0x00007f1529201000] >> java.lang.Thread.State: WAITING (on object monitor) >> at java.lang.Object.wait(java.base at 
19-internal/Native Method) >> - waiting on <0x00000000a00002e8> (a java.lang.ref.ReferenceQueue$Lock) >> at java.lang.ref.ReferenceQueue.remove(java.base at 19-internal/ReferenceQueue.java:155) >> - locked <0x00000000a00002e8> (a java.lang.ref.ReferenceQueue$Lock) >> at java.lang.ref.ReferenceQueue.remove(java.base at 19-internal/ReferenceQueue.java:176) >> at java.lang.ref.Finalizer$FinalizerThread.run(java.base at 19-internal/Finalizer.java:183) >> >> "Signal Dispatcher" #4 daemon prio=9 os_prio=0 cpu=0.24ms elapsed=23.71s tid=0x00007f159c1c21d0 nid=115928 waiting on condition [0x0000000000000000] >> java.lang.Thread.State: RUNNABLE >> >> "Service Thread" #5 daemon prio=9 os_prio=0 cpu=0.20ms elapsed=23.71s tid=0x00007f159c1c36a0 nid=115929 runnable [0x0000000000000000] >> java.lang.Thread.State: RUNNABLE >> >> "Monitor Deflation Thread" #6 daemon prio=9 os_prio=0 cpu=0.24ms elapsed=23.71s tid=0x00007f159c1c4bb0 nid=115930 runnable [0x0000000000000000] >> java.lang.Thread.State: RUNNABLE >> >> "C2 CompilerThread0" #7 daemon prio=9 os_prio=0 cpu=527.71ms elapsed=23.71s tid=0x00007f159c1c6730 nid=115931 waiting on condition [0x0000000000000000] >> java.lang.Thread.State: RUNNABLE >> No compile task >> >> "C1 CompilerThread0" #19 daemon prio=9 os_prio=0 cpu=216.87ms elapsed=23.71s tid=0x00007f159c1c7db0 nid=115932 waiting on condition [0x0000000000000000] >> java.lang.Thread.State: RUNNABLE >> No compile task >> >> "Sweeper thread" #25 daemon prio=9 os_prio=0 cpu=0.04ms elapsed=23.71s tid=0x00007f159c1c9330 nid=115933 runnable [0x0000000000000000] >> java.lang.Thread.State: RUNNABLE >> >> "Notification Thread" #26 daemon prio=9 os_prio=0 cpu=0.05ms elapsed=23.70s tid=0x00007f159c1fc300 nid=115936 runnable [0x0000000000000000] >> java.lang.Thread.State: RUNNABLE >> >> "Common-Cleaner" #27 daemon prio=8 os_prio=0 cpu=0.99ms elapsed=23.68s tid=0x00007f159c211a60 nid=115938 in Object.wait() [0x00007f15282d2000] >> java.lang.Thread.State: TIMED_WAITING (on object 
monitor) >> at java.lang.Object.wait(java.base at 19-internal/Native Method) >> - waiting on >> at java.lang.ref.ReferenceQueue.remove(java.base at 19-internal/ReferenceQueue.java:155) >> - locked <0x00000000a0000628> (a java.lang.ref.ReferenceQueue$Lock) >> at jdk.internal.ref.CleanerImpl.run(java.base at 19-internal/CleanerImpl.java:140) >> at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) >> at jdk.internal.misc.InnocuousThread.run(java.base at 19-internal/InnocuousThread.java:162) >> >> "Timer-0" #28 daemon prio=5 os_prio=0 cpu=0.10ms elapsed=23.68s tid=0x00007f159c213b60 nid=115939 in Object.wait() [0x00007f15281d1000] >> java.lang.Thread.State: TIMED_WAITING (on object monitor) >> at java.lang.Object.wait(java.base at 19-internal/Native Method) >> - waiting on <0x00000000a00007e8> (a java.util.TaskQueue) >> at java.util.TimerThread.mainLoop(java.base at 19-internal/Timer.java:563) >> - locked <0x00000000a00007e8> (a java.util.TaskQueue) >> at java.util.TimerThread.run(java.base at 19-internal/Timer.java:516) >> >> "process reaper" #30 daemon prio=10 os_prio=0 cpu=0.17ms elapsed=23.53s tid=0x00007f159c302960 nid=115944 runnable [0x00007f15a3c9e000] >> java.lang.Thread.State: RUNNABLE >> at java.lang.ProcessHandleImpl.waitForProcessExit0(java.base at 19-internal/Native Method) >> at java.lang.ProcessHandleImpl$1.run(java.base at 19-internal/ProcessHandleImpl.java:147) >> at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 19-internal/ThreadPoolExecutor.java:1136) >> at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 19-internal/ThreadPoolExecutor.java:635) >> at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) >> >> "JDI Internal Event Handler" #34 daemon prio=5 os_prio=0 cpu=14.16ms elapsed=23.49s tid=0x00007f14cc0319d0 nid=115971 in Object.wait() [0x00007f15096b7000] >> java.lang.Thread.State: WAITING (on object monitor) >> at java.lang.Object.wait(java.base at 19-internal/Native Method) >> - waiting 
on >> at java.lang.Object.wait(java.base at 19-internal/Object.java:338) >> at com.sun.tools.jdi.EventQueueImpl.removeUnfiltered(jdk.jdi at 19-internal/EventQueueImpl.java:190) >> - locked <0x00000000a0000e80> (a com.sun.tools.jdi.EventQueueImpl) >> at com.sun.tools.jdi.EventQueueImpl.removeInternal(jdk.jdi at 19-internal/EventQueueImpl.java:125) >> at com.sun.tools.jdi.InternalEventHandler.run(jdk.jdi at 19-internal/InternalEventHandler.java:61) >> at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) >> >> "JDI Target VM Interface" #33 daemon prio=5 os_prio=0 cpu=16.02ms elapsed=23.49s tid=0x00007f14cc0375c0 nid=115972 runnable [0x00007f15095b6000] >> java.lang.Thread.State: RUNNABLE >> at sun.nio.ch.SocketDispatcher.read0(java.base at 19-internal/Native Method) >> at sun.nio.ch.SocketDispatcher.read(java.base at 19-internal/SocketDispatcher.java:47) >> at sun.nio.ch.NioSocketImpl.tryRead(java.base at 19-internal/NioSocketImpl.java:258) >> at sun.nio.ch.NioSocketImpl.implRead(java.base at 19-internal/NioSocketImpl.java:309) >> at sun.nio.ch.NioSocketImpl.read(java.base at 19-internal/NioSocketImpl.java:347) >> at sun.nio.ch.NioSocketImpl$1.read(java.base at 19-internal/NioSocketImpl.java:800) >> at java.net.Socket$SocketInputStream.read(java.base at 19-internal/Socket.java:966) >> at java.net.Socket$SocketInputStream.read(java.base at 19-internal/Socket.java:961) >> at com.sun.tools.jdi.SocketConnection.readPacket(jdk.jdi at 19-internal/SocketConnection.java:82) >> - locked <0x00000000a00012e0> (a java.lang.Object) >> at com.sun.tools.jdi.TargetVM.run(jdk.jdi at 19-internal/TargetVM.java:123) >> at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) >> >> "event-handler" #35 daemon prio=5 os_prio=0 cpu=8.29ms elapsed=23.48s tid=0x00007f159c307e80 nid=115973 in Object.wait() [0x00007f15098b9000] >> java.lang.Thread.State: WAITING (on object monitor) >> at java.lang.Object.wait(java.base at 19-internal/Native Method) >> - waiting on >> at 
java.lang.Object.wait(java.base at 19-internal/Object.java:338) >> at com.sun.tools.jdi.EventQueueImpl.removeUnfiltered(jdk.jdi at 19-internal/EventQueueImpl.java:190) >> - locked <0x00000000a0001e98> (a com.sun.tools.jdi.EventQueueImpl) >> at com.sun.tools.jdi.EventQueueImpl.remove(jdk.jdi at 19-internal/EventQueueImpl.java:97) >> at com.sun.tools.jdi.EventQueueImpl.remove(jdk.jdi at 19-internal/EventQueueImpl.java:83) >> at jdk.jshell.execution.JdiEventHandler.run(jdk.jshell at 19-internal/JdiEventHandler.java:79) >> at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) >> >> "output reader" #36 daemon prio=5 os_prio=0 cpu=0.30ms elapsed=23.44s tid=0x00007f159c30db30 nid=115990 runnable [0x00007f15097b8000] >> java.lang.Thread.State: RUNNABLE >> at sun.nio.ch.SocketDispatcher.read0(java.base at 19-internal/Native Method) >> at sun.nio.ch.SocketDispatcher.read(java.base at 19-internal/SocketDispatcher.java:47) >> at sun.nio.ch.NioSocketImpl.tryRead(java.base at 19-internal/NioSocketImpl.java:258) >> at sun.nio.ch.NioSocketImpl.implRead(java.base at 19-internal/NioSocketImpl.java:309) >> at sun.nio.ch.NioSocketImpl.read(java.base at 19-internal/NioSocketImpl.java:347) >> at sun.nio.ch.NioSocketImpl$1.read(java.base at 19-internal/NioSocketImpl.java:800) >> at java.net.Socket$SocketInputStream.read(java.base at 19-internal/Socket.java:966) >> at java.net.Socket$SocketInputStream.read(java.base at 19-internal/Socket.java:961) >> at java.io.FilterInputStream.read(java.base at 19-internal/FilterInputStream.java:79) >> at jdk.jshell.execution.DemultiplexInput.run(jdk.jshell at 19-internal/DemultiplexInput.java:60) >> >> "Thread-1" #38 daemon prio=5 os_prio=0 cpu=222.19ms elapsed=23.37s tid=0x00007f159c3e6db0 nid=115995 waiting on condition [0x00007f15094b5000] >> java.lang.Thread.State: WAITING (parking) >> at jdk.internal.misc.Unsafe.park(java.base at 19-internal/Native Method) >> - parking to wait for <0x00000000a007a880> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) >> at java.util.concurrent.locks.LockSupport.park(java.base at 19-internal/LockSupport.java:341) >> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(java.base at 19-internal/AbstractQueuedSynchronizer.java:506) >> at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base at 19-internal/ForkJoinPool.java:3464) >> at java.util.concurrent.ForkJoinPool.managedBlock(java.base at 19-internal/ForkJoinPool.java:3435) >> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base at 19-internal/AbstractQueuedSynchronizer.java:1623) >> at java.util.concurrent.LinkedBlockingQueue.take(java.base at 19-internal/LinkedBlockingQueue.java:435) >> at java.util.concurrent.ThreadPoolExecutor.getTask(java.base at 19-internal/ThreadPoolExecutor.java:1062) >> at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 19-internal/ThreadPoolExecutor.java:1122) >> at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 19-internal/ThreadPoolExecutor.java:635) >> at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) >> >> "process reaper" #40 daemon prio=10 os_prio=0 cpu=1.83ms elapsed=22.94s tid=0x00007f159c647300 nid=116044 waiting on condition [0x00007f15280d0000] >> java.lang.Thread.State: TIMED_WAITING (parking) >> at jdk.internal.misc.Unsafe.park(java.base at 19-internal/Native Method) >> - parking to wait for <0x00000000a0079fb0> (a java.util.concurrent.SynchronousQueue$TransferStack) >> at java.util.concurrent.locks.LockSupport.parkNanos(java.base at 19-internal/LockSupport.java:252) >> at java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.base at 19-internal/SynchronousQueue.java:401) >> at java.util.concurrent.SynchronousQueue.poll(java.base at 19-internal/SynchronousQueue.java:903) >> at java.util.concurrent.ThreadPoolExecutor.getTask(java.base at 19-internal/ThreadPoolExecutor.java:1061) >> at 
java.util.concurrent.ThreadPoolExecutor.runWorker(java.base at 19-internal/ThreadPoolExecutor.java:1122) >> at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base at 19-internal/ThreadPoolExecutor.java:635) >> at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) >> >> "Thread-3" #41 daemon prio=5 os_prio=0 cpu=1.39ms elapsed=22.94s tid=0x00007f159c64b600 nid=116045 runnable [0x00007f14734ed000] >> java.lang.Thread.State: RUNNABLE >> at java.io.FileInputStream.read0(java.base at 19-internal/Native Method) >> at java.io.FileInputStream.read(java.base at 19-internal/FileInputStream.java:228) >> at jdk.internal.org.jline.terminal.impl.AbstractPty$PtyInputStream.read(jdk.internal.le at 19-internal/AbstractPty.java:73) >> at jdk.internal.org.jline.utils.NonBlockingInputStream.read(jdk.internal.le at 19-internal/NonBlockingInputStream.java:62) >> at jdk.internal.jshell.tool.StopDetectingInputStream.lambda$setInputStream$0(jdk.jshell at 19-internal/StopDetectingInputStream.java:74) >> at jdk.internal.jshell.tool.StopDetectingInputStream$$Lambda$355/0x0000000800dd68d0.run(jdk.jshell at 19-internal/Unknown Source) >> at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) >> >> "null non blocking reader thread" #43 daemon prio=5 os_prio=0 cpu=0.15ms elapsed=22.78s tid=0x00007f159c678910 nid=116063 in Object.wait() [0x00007f1472ee6000] >> java.lang.Thread.State: WAITING (on object monitor) >> at java.lang.Object.wait(java.base at 19-internal/Native Method) >> - waiting on >> at java.lang.Object.wait(java.base at 19-internal/Object.java:338) >> at jdk.internal.jshell.tool.StopDetectingInputStream.read(jdk.jshell at 19-internal/StopDetectingInputStream.java:111) >> - locked <0x00000000a2000908> (a jdk.internal.jshell.tool.StopDetectingInputStream) >> at jdk.internal.org.jline.utils.NonBlockingInputStreamImpl.run(jdk.internal.le at 19-internal/NonBlockingInputStreamImpl.java:216) >> at 
jdk.internal.org.jline.utils.NonBlockingInputStreamImpl$$Lambda$530/0x0000000800dfde40.run(jdk.internal.le at 19-internal/Unknown Source) >> at java.lang.Thread.run(java.base at 19-internal/Thread.java:828) >> >> "Attach Listener" #44 daemon prio=9 os_prio=0 cpu=0.17ms elapsed=0.10s tid=0x00007f1504000be0 nid=116383 waiting on condition [0x0000000000000000] >> java.lang.Thread.State: RUNNABLE >> >> "VM Thread" os_prio=0 cpu=8.31ms elapsed=23.72s tid=0x00007f159c1b78f0 nid=115925 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0xf21ec5] VM_PrintThreads::doit()+0x25 >> V [libjvm.so+0xf225aa] VM_Operation::evaluate()+0xea >> V [libjvm.so+0xf23f18] VMThread::evaluate_operation(VM_Operation*)+0xb8 >> V [libjvm.so+0xf245c7] VMThread::inner_execute(VM_Operation*)+0x3a7 >> V [libjvm.so+0xf24877] VMThread::run()+0xb7 >> V [libjvm.so+0xe99770] Thread::call_run()+0xc0 >> V [libjvm.so+0xc37a08] thread_native_entry(Thread*)+0xd8 >> >> "GC Thread#0" os_prio=0 cpu=39.19ms elapsed=23.73s tid=0x00007f159c066a10 nid=115920 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "GC Thread#1" os_prio=0 cpu=9.75ms elapsed=23.16s tid=0x00007f15180140e0 nid=116014 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "GC Thread#2" os_prio=0 cpu=0.48ms elapsed=23.15s tid=0x00007f151801b730 nid=116015 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "GC Thread#3" os_prio=0 cpu=11.09ms elapsed=23.15s tid=0x00007f151801c160 nid=116016 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "GC Thread#4" os_prio=0 
cpu=24.95ms elapsed=23.15s tid=0x00007f151801ccc0 nid=116017 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "GC Thread#5" os_prio=0 cpu=0.58ms elapsed=23.15s tid=0x00007f151801d820 nid=116018 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "GC Thread#6" os_prio=0 cpu=35.58ms elapsed=23.15s tid=0x00007f151801e380 nid=116019 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "GC Thread#7" os_prio=0 cpu=13.88ms elapsed=22.98s tid=0x00007f151801b050 nid=116042 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "GC Thread#8" os_prio=0 cpu=12.97ms elapsed=22.92s tid=0x00007f1518021950 nid=116047 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Main Marker" os_prio=0 cpu=0.84ms elapsed=23.73s tid=0x00007f159c076ed0 nid=115921 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xba33] pthread_cond_wait+0xc3 >> V [libjvm.so+0xbf1589] Monitor::wait_without_safepoint_check(long)+0x39 >> V [libjvm.so+0x725fda] G1ConcurrentMarkThread::wait_for_next_cycle()+0x3a >> V [libjvm.so+0x7270bb] G1ConcurrentMarkThread::run_service()+0xdb >> V [libjvm.so+0x60be0b] ConcurrentGCThread::run()+0x1b >> V [libjvm.so+0xe99770] Thread::call_run()+0xc0 >> V [libjvm.so+0xc37a08] thread_native_entry(Thread*)+0xd8 >> >> "G1 Conc#0" os_prio=0 cpu=6.39ms elapsed=23.73s tid=0x00007f159c077f50 nid=115922 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C 
[libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#1" os_prio=0 cpu=1.74ms elapsed=23.05s tid=0x00007f152c000960 nid=116022 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#2" os_prio=0 cpu=1.23ms elapsed=23.05s tid=0x00007f152c001490 nid=116023 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#3" os_prio=0 cpu=2.04ms elapsed=23.05s tid=0x00007f152c001ff0 nid=116024 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#4" os_prio=0 cpu=1.28ms elapsed=23.05s tid=0x00007f152c002b50 nid=116025 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#5" os_prio=0 cpu=1.60ms elapsed=23.05s tid=0x00007f152c0036b0 nid=116026 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#6" os_prio=0 cpu=1.61ms elapsed=23.05s tid=0x00007f152c004210 nid=116027 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#7" os_prio=0 cpu=2.00ms elapsed=23.05s tid=0x00007f152c005160 nid=116028 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#8" os_prio=0 cpu=1.58ms elapsed=23.05s tid=0x00007f152c0060b0 nid=116029 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 
Conc#9" os_prio=0 cpu=1.26ms elapsed=23.05s tid=0x00007f152c007000 nid=116030 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#10" os_prio=0 cpu=1.61ms elapsed=23.05s tid=0x00007f152c007f50 nid=116031 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#11" os_prio=0 cpu=1.52ms elapsed=23.05s tid=0x00007f152c008ea0 nid=116032 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#12" os_prio=0 cpu=1.17ms elapsed=23.05s tid=0x00007f152c009df0 nid=116033 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#13" os_prio=0 cpu=1.88ms elapsed=23.05s tid=0x00007f152c00ad40 nid=116034 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#14" os_prio=0 cpu=1.31ms elapsed=23.05s tid=0x00007f152c00bc90 nid=116035 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Conc#15" os_prio=0 cpu=1.52ms elapsed=23.05s tid=0x00007f152c00cbe0 nid=116036 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Refine#0" os_prio=0 cpu=0.03ms elapsed=23.73s tid=0x00007f159c1881d0 nid=115923 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 >> >> "G1 Service" os_prio=0 cpu=0.72ms elapsed=23.73s 
tid=0x00007f159c189290 nid=115924 runnable >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xbde2] pthread_cond_timedwait+0x132 >> V [libjvm.so+0xbf1589] Monitor::wait_without_safepoint_check(long)+0x39 >> V [libjvm.so+0x77d208] G1ServiceThread::wait_for_task()+0xf8 >> V [libjvm.so+0x77d600] G1ServiceThread::run_service()+0x20 >> V [libjvm.so+0x60be0b] ConcurrentGCThread::run()+0x1b >> V [libjvm.so+0xe99770] Thread::call_run()+0xc0 >> V [libjvm.so+0xc37a08] thread_native_entry(Thread*)+0xd8 >> >> "VM Periodic Task Thread" os_prio=0 cpu=4.05ms elapsed=23.71s tid=0x00007f159c1fddf0 nid=115937 waiting on condition >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libpthread.so.0+0xbde2] pthread_cond_timedwait+0x132 >> V [libjvm.so+0xbf1589] Monitor::wait_without_safepoint_check(long)+0x39 >> V [libjvm.so+0xc07015] WatcherThread::sleep() const+0xa5 >> V [libjvm.so+0xc070e5] WatcherThread::run()+0x35 >> V [libjvm.so+0xe99770] Thread::call_run()+0xc0 >> V [libjvm.so+0xc37a08] thread_native_entry(Thread*)+0xd8 >> >> JNI global refs: 28, weak refs: 0 > > Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: > > 8283147: Include NonJavaThread stacktrace during thread dump Clarification on why there are some one-line frames: VMError::print_native_stack output: "G1 Conc#7" os_prio=0 cpu=2.33ms elapsed=11.39s tid=0x00007f635c004d70 nid=71098 runnable Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 pstack output: Thread 40 (Thread 0x7f62b3bf7700 (LWP 71098)): #0 0x00007f63d32a6b3b in do_futex_wait.constprop () from /lib64/libpthread.so.0 #1 0x00007f63d32a6bcf in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0 #2 0x00007f63d32a6c6b in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0 #3 0x00007f63d23f7c32 in PosixSemaphore::wait (this=this at entry=0x7f63cc077e78) at /home/qingfeng.yy/jdktip/src/hotspot/os/posix/semaphore_posix.cpp:65 #4 0x00007f63d265b81b in Semaphore::wait (this=0x7f63cc077e78) at /home/qingfeng.yy/jdktip/src/hotspot/share/runtime/semaphore.hpp:55 #5 WorkerTaskDispatcher::worker_run_task (this=0x7f63cc077e68) at /home/qingfeng.yy/jdktip/src/hotspot/share/gc/shared/workerThread.cpp:60 #6 WorkerThread::run (this=0x7f635c004d70) at /home/qingfeng.yy/jdktip/src/hotspot/share/gc/shared/workerThread.cpp:163 #7 0x00007f63d25aa790 in Thread::call_run (this=this at entry=0x7f635c004d70) at /home/qingfeng.yy/jdktip/src/hotspot/share/runtime/thread.cpp:357 #8 0x00007f63d2348a28 in thread_native_entry (thread=0x7f635c004d70) at /home/qingfeng.yy/jdktip/src/hotspot/os/linux/os_linux.cpp:706 #9 0x00007f63d32a0ea5 in start_thread () from /lib64/libpthread.so.0 #10 0x00007f63d2dc58dd in clone () from /lib64/libc.so.6 The top frame is as follows: C frame (sp=0x00007f6338ca0d90 unextended sp=0x00007f6338ca0d90, fp=0x00007f63cc0667c8, real_fp=0x00007f63cc0667c8, pc=0x00007f63d32a6b39 link=0x0000000900000000) do_futex_wait.constprop doesn't have a valid link/last_frame_pointer, because libpthread has some
novel assembly code: 000000000000db10 : db10: 55 push %rbp db11: 48 89 fd mov %rdi,%rbp db14: 53 push %rbx db15: 48 83 ec 08 sub $0x8,%rsp db19: 8b 5f 08 mov 0x8(%rdi),%ebx db1c: e8 1f 09 00 00 callq e440 <__pthread_enable_asynccancel> db21: 45 31 d2 xor %r10d,%r10d db24: 41 89 c0 mov %eax,%r8d db27: 31 d2 xor %edx,%edx db29: 89 de mov %ebx,%esi db2b: bb ca 00 00 00 mov $0xca,%ebx db30: 48 89 ef mov %rbp,%rdi db33: 40 80 f6 80 xor $0x80,%sil db37: 89 d8 mov %ebx,%eax db39: 0f 05 syscall db3b: 89 c3 mov %eax,%ebx .... 000000000000db80 <__new_sem_wait_slow.constprop.0>: db80: 41 54 push %r12 db82: 48 b8 00 00 00 00 01 movabs $0x100000000,%rax db89: 00 00 00 db8c: 55 push %rbp db8d: 53 push %rbx db8e: 48 89 fb mov %rdi,%rbx db91: 48 83 ec 30 sub $0x30,%rsp db95: f0 48 0f c1 07 lock xadd %rax,(%rdi) db9a: 48 8d 35 5f ff ff ff lea -0xa1(%rip),%rsi # db00 <__sem_wait_cleanup> dba1: 49 bc ff ff ff ff fe movabs $0xfffffffeffffffff,%r12 dba8: ff ff ff dbab: 48 8d 6c 24 10 lea 0x10(%rsp),%rbp dbb0: 48 89 fa mov %rdi,%rdx dbb3: 48 89 04 24 mov %rax,(%rsp) dbb7: 48 89 ef mov %rbp,%rdi dbba: e8 b1 04 00 00 callq e070 <_pthread_cleanup_push> dbbf: 48 8b 04 24 mov (%rsp),%rax dbc3: 85 c0 test %eax,%eax .... So os::is_first_C_frame returns early and the walk stops after the first frame. Supporting stack walks through the pthread library should not require a huge effort, though. ------------- PR: https://git.openjdk.java.net/jdk/pull/7833 From tobias.hartmann at oracle.com Wed Mar 16 08:29:26 2022 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 16 Mar 2022 09:29:26 +0100 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov Message-ID: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Hi, I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since 2012, he contributed over 360 changes to the JDK project [1].
Vladimir worked on some of our most challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and caching. He is currently deeply involved in Project Panama, working on the Foreign Function Interface and the Vector API. Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [3]. Best regards, Tobias [1] https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits [2] https://openjdk.java.net/census#hotspot [3] https://openjdk.java.net/groups/#member-vote From tobias.hartmann at oracle.com Wed Mar 16 08:29:55 2022 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 16 Mar 2022 09:29:55 +0100 Subject: CFV: New HotSpot Group Member: Dean Long Message-ID: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Hi, I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that all incoming compiler bugs are properly handled. Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [3]. 
Best regards, Tobias [1] https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits [2] https://openjdk.java.net/census#hotspot [3] https://openjdk.java.net/groups/#member-vote From tobias.hartmann at oracle.com Wed Mar 16 08:31:10 2022 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 16 Mar 2022 09:31:10 +0100 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: Vote: yes Best regards, Tobias On 16.03.22 09:29, Tobias Hartmann wrote: > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From tobias.hartmann at oracle.com Wed Mar 16 08:31:17 2022 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 16 Mar 2022 09:31:17 +0100 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: Vote: yes Best regards, Tobias On 16.03.22 09:29, Tobias Hartmann wrote: > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From goetz.lindenmaier at sap.com Wed Mar 16 09:16:44 2022 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 16 Mar 2022 09:16:44 +0000 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: Vote: yes Best, Goetz > -----Original Message----- > From: hotspot-dev On Behalf Of > Tobias Hartmann > Sent: Wednesday, March 16, 2022 9:30 AM > To: hotspot-dev Source Developers > Subject: CFV: New HotSpot Group Member: Dean Long > > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and > a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant > contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, > Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging > team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this > nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer- > name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&ty > pe=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From goetz.lindenmaier at sap.com Wed Mar 16 09:16:56 2022 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 16 Mar 2022 09:16:56 +0000 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: Vote: yes Best, Goetz. > -----Original Message----- > From: hotspot-dev On Behalf Of > Tobias Hartmann > Sent: Wednesday, March 16, 2022 9:29 AM > To: hotspot-dev Source Developers > Subject: CFV: New HotSpot Group Member: Vladimir Ivanov > > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot > Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle > and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir > worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and > LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the > Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this > nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer- > name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afals > e&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From jesper.wilhelmsson at oracle.com Wed Mar 16 09:37:52 2022 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 16 Mar 2022 09:37:52 +0000 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: <027AF7B8-C3CB-4F0F-AC0B-010EAD0E004E@oracle.com> Vote: Yes /Jesper > On 16 Mar 2022, at 09:29, Tobias Hartmann wrote: > > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From jesper.wilhelmsson at oracle.com Wed Mar 16 09:38:24 2022 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 16 Mar 2022 09:38:24 +0000 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: <252B476D-DB27-4436-A554-79A4758F0C40@oracle.com> Vote: Yes /Jesper > On 16 Mar 2022, at 09:29, Tobias Hartmann wrote: > > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From bulasevich at openjdk.java.net Wed Mar 16 09:37:29 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Wed, 16 Mar 2022 09:37:29 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v6] In-Reply-To: References: Message-ID: > Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH. > > In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. > > As a side effect, the performance of some tests is slightly improved: > ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` > > Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
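The branch-range argument in the description above can be sketched as follows. This is a minimal illustration under stated assumptions, not HotSpot code: the `Segment` struct, the helper names, and the segment sizes (8 MB for the non-nmethod segment, 116 MB for each nmethod segment, 240 MB in total) are hypothetical. Only the 128 MB branch limit and the idea of placing the non-nmethod segment between the two code segments come from the description.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <cstdlib>

constexpr int64_t MB = 1024 * 1024;

// Reach of a single AArch64 direct branch, as quoted in the description.
constexpr int64_t kDirectBranchReach = 128 * MB;

// Hypothetical model of one code cache segment.
struct Segment {
  int64_t size;      // segment size in bytes (assumed value)
  bool holds_stubs;  // true for the non-nmethod segment
};

// Largest distance between any address in an nmethod segment and any
// address in the stub segment, for three segments laid out back to back.
int64_t worst_case_stub_distance(const std::array<Segment, 3>& layout) {
  int64_t start[3];
  int64_t end[3];
  int64_t cursor = 0;
  int stub = -1;
  for (int i = 0; i < 3; i++) {
    start[i] = cursor;
    cursor += layout[i].size;
    end[i] = cursor;
    if (layout[i].holds_stubs) stub = i;
  }
  assert(stub >= 0 && "layout must contain one stub segment");

  int64_t worst = 0;
  for (int i = 0; i < 3; i++) {
    if (i == stub) continue;
    // Compare the two extreme corner pairs of the segments.
    worst = std::max({worst,
                      std::abs(end[stub] - start[i]),
                      std::abs(end[i] - start[stub])});
  }
  return worst;
}

// Conservative check: a jump at or beyond the reach needs the longer
// multi-instruction far-branch sequence.
bool needs_far_branch(int64_t distance) {
  return distance >= kDirectBranchReach;
}
```

With these assumed sizes, a layout with the stub segment first has a worst-case distance of 240 MB, so some calls need the far-branch sequence. With the stub segment in the middle, the worst case drops to 124 MB and stays within reach of a single branch.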
The pull request contains one new commit since the last revision: rename, adding test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7517/files - new: https://git.openjdk.java.net/jdk/pull/7517/files/9cb03540..9650abc9 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=04-05 Stats: 0 lines in 1 file changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7517.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7517/head:pull/7517 PR: https://git.openjdk.java.net/jdk/pull/7517 From erik.osterlund at oracle.com Wed Mar 16 10:20:47 2022 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Wed, 16 Mar 2022 10:20:47 +0000 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: <03D9AE53-B86B-4E2D-9A85-F1A606CE1EC5@oracle.com> Vote: yes /Erik > On 16 Mar 2022, at 09:29, Tobias Hartmann wrote: > > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From erik.osterlund at oracle.com Wed Mar 16 10:22:03 2022 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Wed, 16 Mar 2022 10:22:03 +0000 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: Vote: yes /Erik > On 16 Mar 2022, at 09:30, Tobias Hartmann wrote: > > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From martin.doerr at sap.com Wed Mar 16 10:28:19 2022 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 16 Mar 2022 10:28:19 +0000 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: Vote: yes Best regards, Martin From: hotspot-dev on behalf of Tobias Hartmann Date: Wednesday, 16 March 2022 at 09:30 To: hotspot-dev Source Developers Subject: CFV: New HotSpot Group Member: Vladimir Ivanov Hi, I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and caching. He is currently deeply involved in Project Panama, working on the Foreign Function Interface and the Vector API. Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [3]. 
Best regards, Tobias [1] https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits [2] https://openjdk.java.net/census#hotspot [3] https://openjdk.java.net/groups/#member-vote From martin.doerr at sap.com Wed Mar 16 10:28:34 2022 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 16 Mar 2022 10:28:34 +0000 Subject: Re: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: Vote: yes Best regards, Martin From: hotspot-dev on behalf of Tobias Hartmann Date: Wednesday, 16 March 2022 at 09:30 To: hotspot-dev Source Developers Subject: CFV: New HotSpot Group Member: Dean Long Hi, I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that all incoming compiler bugs are properly handled. Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [3]. 
Best regards, Tobias [1] https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits [2] https://openjdk.java.net/census#hotspot [3] https://openjdk.java.net/groups/#member-vote From volker.simonis at gmail.com Wed Mar 16 10:34:42 2022 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 16 Mar 2022 11:34:42 +0100 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: Vote: yes On Wed, Mar 16, 2022 at 9:29 AM Tobias Hartmann wrote: > > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From volker.simonis at gmail.com Wed Mar 16 10:35:06 2022 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 16 Mar 2022 11:35:06 +0100 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: Vote: yes On Wed, Mar 16, 2022 at 9:30 AM Tobias Hartmann wrote: > > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From aph-open at littlepinkcloud.com Wed Mar 16 12:02:54 2022 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Wed, 16 Mar 2022 12:02:54 +0000 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: <85983e45-c395-b4e7-a596-9ab2ab7ea1d2@littlepinkcloud.com> Vote: yes On 3/16/22 08:29, Tobias Hartmann wrote: > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. > > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph-open at littlepinkcloud.com Wed Mar 16 12:03:52 2022 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Wed, 16 Mar 2022 12:03:52 +0000 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: <0ef0a8f3-b753-f2ad-b524-e4d9b03415b5@littlepinkcloud.com> Vote: yes On 3/16/22 08:29, Tobias Hartmann wrote: > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. > > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From shade at openjdk.java.net Wed Mar 16 12:15:58 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 16 Mar 2022 12:15:58 GMT Subject: RFR: 8283257: x86: Clean up invocation/branch counter updates code Message-ID: I looked briefly at optimizing `InterpreterMacroAssembler::increment_mask_and_jump` a bit, but it looks that current code is the best we can do. This improvement does a few related cleanups without semantic changes. Additional testing: - [x] Linux x86_64 fastdebug `tier1` - [x] Eyeballing interpreter generated code ------------- Commit messages: - Cleanups Changes: https://git.openjdk.java.net/jdk/pull/7838/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7838&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283257 Stats: 22 lines in 4 files changed: 0 ins; 6 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/7838.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7838/head:pull/7838 PR: https://git.openjdk.java.net/jdk/pull/7838 From ChrisPhi at LGonQn.Org Wed Mar 16 12:28:33 2022 From: ChrisPhi at LGonQn.Org ("Chris Phillips"@T O) Date: Wed, 16 Mar 2022 08:28:33 -0400 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: <2eeae7ff-8ca8-01eb-0c83-78ce6215077f@LGonQn.Org> Vote: yes Cheers! ChrisPhi From ChrisPhi at LGonQn.Org Wed Mar 16 12:30:37 2022 From: ChrisPhi at LGonQn.Org ("Chris Phillips"@T O) Date: Wed, 16 Mar 2022 08:30:37 -0400 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: <9791401c-f11d-6609-6f5e-d64cbe2ee1e3@LGonQn.Org> Vote: yes Cheers! 
ChrisPhi From kim.barrett at oracle.com Wed Mar 16 12:40:01 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 16 Mar 2022 12:40:01 +0000 Subject: CFV: New HotSpot Group Member: Sangheon Kim Message-ID: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> I hereby nominate Sangheon Kim to Membership in the HotSpot Group. Sangheon has been a JDK Reviewer and member of the Oracle GC team for many years, primarily working on G1. He has made many substantial contributions [1] including to NUMA support and improving GC thread configuration. Votes are due by Thursday, 31-March-2022 at 12h00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list For Lazy Consensus voting instructions, see [3]. Kim Barrett [1] https://github.com/search?q=author-name%3A%22Sangheon+Kim%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits [2] https://openjdk.java.net/census [3] https://openjdk.java.net/groups/#member-vote From tobias.hartmann at oracle.com Wed Mar 16 12:43:05 2022 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 16 Mar 2022 13:43:05 +0100 Subject: CFV: New HotSpot Group Member: Sangheon Kim In-Reply-To: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> References: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> Message-ID: <3d3cbec9-efdb-2941-1734-0cc505f3a3c6@oracle.com> Vote: yes Best regards, Tobias On 16.03.22 13:40, Kim Barrett wrote: > I hereby nominate Sangheon Kim to Membership in the HotSpot Group. > > Sangheon has been a JDK Reviewer and member of the Oracle GC team for > many years, primarily working on G1. He has made many substantial > contributions [1] including to NUMA support and improving GC thread > configuration. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. 
Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Sangheon+Kim%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From kim.barrett at oracle.com Wed Mar 16 12:46:24 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 16 Mar 2022 12:46:24 +0000 Subject: CFV: New HotSpot Group Member: Ivan Walulya Message-ID: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> hotspot-dev at openjdk.java.net CFV: New HotSpot Group Member: Ivan Walulya I hereby nominate Ivan Walulya to Membership in the HotSpot Group. Ivan is a JDK Reviewer and a member of the Oracle GC team, primarily working on G1. He has made many substantial contributions [1] including co-authoring a major rewrite of G1's remembered sets. He is also a frequent and thorough reviewer (as I well know). Votes are due by Thursday, 31-March-2022 at 12h00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list For Lazy Consensus voting instructions, see [3]. Kim Barrett [1] https://github.com/search?q=author-name%3A%22Ivan+Walulya%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits [2] https://openjdk.java.net/census [3] https://openjdk.java.net/groups/#member-vote From kim.barrett at oracle.com Wed Mar 16 12:47:50 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 16 Mar 2022 12:47:50 +0000 Subject: CFV: New HotSpot Group Member: Leo Korinth Message-ID: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> I hereby nominate Leo Korinth to Membership in the HotSpot Group. Leo is a JDK Reviewer and a member of the Oracle GC team, primarily working on G1. 
He has made many substantial contributions [1] including several refactorings in ParallelGC to bring it in-line with other collectors. He also dealt with the main removal of CMS and a number of related cleanups; CMS tendrils extended far and deep. Votes are due by Thursday, 31-March-2022 at 12h00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list For Lazy Consensus voting instructions, see [3]. Kim Barrett [1] https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits [2] https://openjdk.java.net/census [3] https://openjdk.java.net/groups/#member-vote From ChrisPhi at LGonQn.Org Wed Mar 16 12:47:49 2022 From: ChrisPhi at LGonQn.Org (Chris Phillips) Date: Wed, 16 Mar 2022 08:47:49 -0400 Subject: CFV: New HotSpot Group Member: Sangheon Kim In-Reply-To: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> References: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> Message-ID: <8d58bcdd-3356-57ce-f72b-5e44f30f77e8@LGonQn.Org> Vote: yes Cheers! ChrisPhi From ChrisPhi at LGonQn.Org Wed Mar 16 12:49:30 2022 From: ChrisPhi at LGonQn.Org (Chris Phillips) Date: Wed, 16 Mar 2022 08:49:30 -0400 Subject: CFV: New HotSpot Group Member: Ivan Walulya In-Reply-To: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> References: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> Message-ID: <6d86108c-5ed0-b56d-a94a-011a7d4f5cb8@LGonQn.Org> Vote: yes Cheers! ChrisPhi From kim.barrett at oracle.com Wed Mar 16 12:49:48 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 16 Mar 2022 12:49:48 +0000 Subject: CFV: New HotSpot Group Member: Albert Mingkun Yang Message-ID: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> I hereby nominate Albert Mingkun Yang to Membership in the HotSpot Group. Albert is a JDK Reviewer and a member of the Oracle GC team. 
He has made many substantial contributions [1] including co-authoring an improved GC thread controller for ZGC. He is a frequent and thorough reviewer, as well as being a dedicated code deletion engineer, finding many places to reduce complexity or remove dead code. Votes are due by Thursday, 31-March-2022 at 12h00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list For Lazy Consensus voting instructions, see [3]. Kim Barrett [1] https://github.com/search?q=author-name%3A%22Albert+Mingkun+Yang%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits [2] https://openjdk.java.net/census [3] https://openjdk.java.net/groups/#member-vote From kim.barrett at oracle.com Wed Mar 16 12:51:54 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 16 Mar 2022 12:51:54 +0000 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: vote: yes > On Mar 16, 2022, at 4:29 AM, Tobias Hartmann wrote: > > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From kim.barrett at oracle.com Wed Mar 16 12:52:41 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 16 Mar 2022 12:52:41 +0000 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: vote: yes > On Mar 16, 2022, at 4:29 AM, Tobias Hartmann wrote: > > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From ChrisPhi at LGonQn.Org Wed Mar 16 13:17:35 2022 From: ChrisPhi at LGonQn.Org ("Chris Phillips"@T O) Date: Wed, 16 Mar 2022 09:17:35 -0400 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: <369586ee-9126-fbe6-3770-0973ad5c7d8e@LGonQn.Org> Vote: yes Cheers! ChrisPhi From jbhateja at openjdk.java.net Wed Mar 16 13:20:45 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 16 Mar 2022 13:20:45 GMT Subject: RFR: 8283232: x86: Improve vector broadcast operations [v2] In-Reply-To: <1FBk3MauXFxUsyHz9kuhqGI-CtLRgHYmHn1eyyaDLvs=.6d4d94b0-32a0-42dc-a181-87df8d8f3b65@github.com> References: <1FBk3MauXFxUsyHz9kuhqGI-CtLRgHYmHn1eyyaDLvs=.6d4d94b0-32a0-42dc-a181-87df8d8f3b65@github.com> Message-ID: On Wed, 16 Mar 2022 05:55:18 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch improves the generation of broadcasting a scalar in several ways: >> >> - Avoid potential data bypass delay which can be observed on some platforms by using the correct type of instruction if it does not require extra instructions. >> - As it has been pointed out, dumping the whole vector into the constant table is costly in terms of code size, this patch minimises this overhead for vector replicate of constants. Also, options are available for constants to be generated with more alignment so that vector load can be made efficiently without crossing cache lines. >> - Vector broadcasting should prefer rematerialising to spilling when register pressure is high. >> >> This patch also removes some redundant code paths and rename some incorrectly named instructions. >> >> Thank you very much. 
> > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > fix crash in sse It would be helpful if a JMH benchmark could be created around this. The following is an excerpt from the x86 optimization manual, Appendix E, Section E.1.3: "Forwarding the result within the same bypass domain from a producer micro-op to a consumer micro is done efficiently in hardware without delay" ------------- PR: https://git.openjdk.java.net/jdk/pull/7832 From christian.hagedorn at oracle.com Wed Mar 16 13:44:44 2022 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Wed, 16 Mar 2022 14:44:44 +0100 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: <22ace2ea-0f91-4e15-6292-0e3be5490104@oracle.com> Vote: yes Best regards, Christian On 16.03.22 09:29, Tobias Hartmann wrote: > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From christian.hagedorn at oracle.com Wed Mar 16 13:45:17 2022 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Wed, 16 Mar 2022 14:45:17 +0100 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: <497d3518-9521-fcd1-74a8-128223d2fc5b@oracle.com> Vote: yes Best regards, Christian On 16.03.22 09:29, Tobias Hartmann wrote: > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From christian.hagedorn at oracle.com Wed Mar 16 13:46:36 2022 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Wed, 16 Mar 2022 14:46:36 +0100 Subject: CFV: New HotSpot Group Member: Sangheon Kim In-Reply-To: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> References: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> Message-ID: Vote: yes Best regards, Christian On 16.03.22 13:40, Kim Barrett wrote: > I hereby nominate Sangheon Kim to Membership in the HotSpot Group. > > Sangheon has been a JDK Reviewer and member of the Oracle GC team for > many years, primarily working on G1. He has made many substantial > contributions [1] including to NUMA support and improving GC thread > configuration. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Sangheon+Kim%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From goetz.lindenmaier at sap.com Wed Mar 16 13:54:47 2022 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 16 Mar 2022 13:54:47 +0000 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: Vote: yes Best Goetz. 
> -----Original Message----- > From: hotspot-dev On Behalf Of Kim > Barrett > Sent: Wednesday, March 16, 2022 1:48 PM > To: hotspot-dev at openjdk.java.net > Subject: CFV: New HotSpot Group Member: Leo Korinth > > I hereby nominate Leo Korinth to Membership in the HotSpot Group. > > Leo is a JDK Reviewer and a member of the Oracle GC team, primarily > working on > G1. He has made many substantial contributions [1] including several > refactorings in ParallelGC to bring it in-line with other collectors. He also > dealt with the main removal of CMS and a number of related cleanups; CMS > tendrils extended far and deep. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote From duke at openjdk.java.net Wed Mar 16 14:34:49 2022 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Wed, 16 Mar 2022 14:34:49 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v6] In-Reply-To: References: Message-ID: On Wed, 16 Mar 2022 09:37:29 GMT, Boris Ulasevich wrote: >> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH. >> >> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter.
The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. >> >> As a side effect, the performance of some tests is slightly improved: >> ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` >> >> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > rename, adding test lgtm ------------- Marked as reviewed by eastig at github.com (no known OpenJDK username). PR: https://git.openjdk.java.net/jdk/pull/7517 From duke at openjdk.java.net Wed Mar 16 14:39:49 2022 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Wed, 16 Mar 2022 14:39:49 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v6] In-Reply-To: References: Message-ID: On Wed, 16 Mar 2022 09:37:29 GMT, Boris Ulasevich wrote: >> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH. >> >> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. >> >> As a side effect, the performance of some tests is slightly improved: >> ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` >> >> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. 
The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > rename, adding test src/hotspot/cpu/aarch64/icBuffer_aarch64.cpp line 58: > 56: // IC stub code size is not expected to vary depending on target address. > 57: // We use NOPs to make the ldr+far_jump+int64 size equal to ic_stub_code_size. > 58: for (int i = jump_code_size; i < ic_stub_code_size() - 12; i += 4) { 12 == 3 * NativeInstruction::instruction_size 4 == NativeInstruction::instruction_size ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From jesper.wilhelmsson at oracle.com Wed Mar 16 14:42:41 2022 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 16 Mar 2022 14:42:41 +0000 Subject: CFV: New HotSpot Group Member: Sangheon Kim In-Reply-To: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> References: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> Message-ID: <02B43391-0C61-48AD-9887-D8480664396E@oracle.com> Vote: Yes /Jesper > On 16 Mar 2022, at 13:40, Kim Barrett wrote: > > I hereby nominate Sangheon Kim to Membership in the HotSpot Group. > > Sangheon has been a JDK Reviewer and member of the Oracle GC team for > many years, primarily working on G1. He has made many substantial > contributions [1] including to NUMA support and improving GC thread > configuration. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Sangheon+Kim%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From jesper.wilhelmsson at oracle.com Wed Mar 16 14:43:16 2022 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 16 Mar 2022 14:43:16 +0000 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: <67EFD596-BB46-4E44-B6B5-D4DBC97DE71C@oracle.com> Vote: Yes /Jesper > On 16 Mar 2022, at 13:47, Kim Barrett wrote: > > I hereby nominate Leo Korinth to Membership in the HotSpot Group. > > Leo is a JDK Reviewer and a member of the Oracle GC team, primarily working on > G1. He has made many substantial contributions [1] including several > refactorings in ParallelGC to bring it in-line with other collectors. He also > dealt with the main removal of CMS and a number of related cleanups; CMS > tendrils extended far and deep. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From thomas.stuefe at gmail.com Wed Mar 16 14:46:45 2022 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 16 Mar 2022 15:46:45 +0100 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: Vote: yes On Wed, Mar 16, 2022 at 3:06 PM Kim Barrett wrote: > I hereby nominate Leo Korinth to Membership in the HotSpot Group. > > Leo is a JDK Reviewer and a member of the Oracle GC team, primarily > working on > G1. He has made many substantial contributions [1] including several > refactorings in ParallelGC to bring it in-line with other collectors. He > also > dealt with the main removal of CMS and a number of related cleanups; CMS > tendrils extended far and deep. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] > https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > > From duke at openjdk.java.net Wed Mar 16 14:55:48 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Wed, 16 Mar 2022 14:55:48 GMT Subject: RFR: 8283232: x86: Improve vector broadcast operations [v2] In-Reply-To: <1FBk3MauXFxUsyHz9kuhqGI-CtLRgHYmHn1eyyaDLvs=.6d4d94b0-32a0-42dc-a181-87df8d8f3b65@github.com> References: <1FBk3MauXFxUsyHz9kuhqGI-CtLRgHYmHn1eyyaDLvs=.6d4d94b0-32a0-42dc-a181-87df8d8f3b65@github.com> Message-ID: On Wed, 16 Mar 2022 05:55:18 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch improves the generation of broadcasting a scalar in several ways: >> >> - Avoid potential data bypass delay which can be observed on some platforms by using the correct type of instruction if it does not require extra instructions. >> - As it has been pointed out, dumping the whole vector into the constant table is costly in terms of code size, this patch minimises this overhead for vector replicate of constants. Also, options are available for constants to be generated with more alignment so that vector load can be made efficiently without crossing cache lines. >> - Vector broadcasting should prefer rematerialising to spilling when register pressure is high. >> >> This patch also removes some redundant code paths and rename some incorrectly named instructions. >> >> Thank you very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > fix crash in sse Hi, forwarding results within the same bypass domain does not result in delay, data bypass delay happens when the data crosses different domains, according to "Intel®
64 and IA-32 Architectures Optimization Reference Manual" > When a source of a micro-op executed in one stack comes from a micro-op executed in another stack, a delay can occur. The delay occurs also for transitions between Intel SSE integer and Intel SSE floating-point operations. In some of the cases, the data transition is done using a micro-op that is added to the instruction flow. The manual mentions the guideline at section 3.5.2.2 ![image](https://user-images.githubusercontent.com/49088128/158618209-c0674ba7-1c93-4014-a7e1-330f4e5846da.png) Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7832 From jesper.wilhelmsson at oracle.com Wed Mar 16 15:23:23 2022 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 16 Mar 2022 15:23:23 +0000 Subject: CFV: New HotSpot Group Member: Ivan Walulya In-Reply-To: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> References: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> Message-ID: Vote: Yes /Jesper > On 16 Mar 2022, at 13:46, Kim Barrett wrote: > > hotspot-dev at openjdk.java.net > CFV: New HotSpot Group Member: Ivan Walulya > > I hereby nominate Ivan Walulya to Membership in the HotSpot Group. > > Ivan is a JDK Reviewer and a member of the Oracle GC team, primarily working > on G1. He has made many substantial contributions [1] including co-authoring a > major rewrite of G1's remembered sets. He is also a frequent and thorough > reviewer (as I well know). > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Ivan+Walulya%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From jesper.wilhelmsson at oracle.com Wed Mar 16 15:23:42 2022 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 16 Mar 2022 15:23:42 +0000 Subject: CFV: New HotSpot Group Member: Albert Mingkun Yang In-Reply-To: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> References: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> Message-ID: <019A8C75-A71D-4BC4-9111-B811AA70AB39@oracle.com> Vote: Yes /Jesper > On 16 Mar 2022, at 13:49, Kim Barrett wrote: > > I hereby nominate Albert Mingkun Yang to Membership in the HotSpot Group. > > Albert is a JDK Reviewer and a member of the Oracle GC team. He has made many > substantial contributions [1] including co-authoring an improved GC thread > controller for ZGC. He is a frequent and thorough reviewer, as well as being a > dedicated code deletion engineer, finding many places to reduce complexity or > remove dead code. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Albert+Mingkun+Yang%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From calvin.cheung at oracle.com Wed Mar 16 15:49:36 2022 From: calvin.cheung at oracle.com (calvin.cheung at oracle.com) Date: Wed, 16 Mar 2022 08:49:36 -0700 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: Vote: yes On 3/16/22 1:29 AM, Tobias Hartmann wrote: > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. From calvin.cheung at oracle.com Wed Mar 16 15:50:03 2022 From: calvin.cheung at oracle.com (calvin.cheung at oracle.com) Date: Wed, 16 Mar 2022 08:50:03 -0700 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: <191a683a-0901-ec7e-d88d-64bbc7f71259@oracle.com> Vote: yes On 3/16/22 1:29 AM, Tobias Hartmann wrote: > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. From calvin.cheung at oracle.com Wed Mar 16 15:50:34 2022 From: calvin.cheung at oracle.com (calvin.cheung at oracle.com) Date: Wed, 16 Mar 2022 08:50:34 -0700 Subject: CFV: New HotSpot Group Member: Sangheon Kim In-Reply-To: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> References: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> Message-ID: <2dfb5b11-e2cc-e70b-1dcc-09867a0c7515@oracle.com> Vote: yes On 3/16/22 5:40 AM, Kim Barrett wrote: > I hereby nominate Sangheon Kim to Membership in the HotSpot Group. 
From calvin.cheung at oracle.com Wed Mar 16 15:50:57 2022 From: calvin.cheung at oracle.com (calvin.cheung at oracle.com) Date: Wed, 16 Mar 2022 08:50:57 -0700 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: Vote: yes On 3/16/22 5:47 AM, Kim Barrett wrote: > I hereby nominate Leo Korinth to Membership in the HotSpot Group. From tobias.hartmann at oracle.com Wed Mar 16 15:50:58 2022 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 16 Mar 2022 16:50:58 +0100 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: <7c61170f-8ea9-7ac8-af07-c9abcb3c1cbf@oracle.com> Vote: yes Best regards, Tobias On 16.03.22 13:47, Kim Barrett wrote: > I hereby nominate Leo Korinth to Membership in the HotSpot Group. > > Leo is a JDK Reviewer and a member of the Oracle GC team, primarily working on > G1. He has made many substantial contributions [1] including several > refactorings in ParallelGC to bring it in-line with other collectors. He also > dealt with the main removal of CMS and a number of related cleanups; CMS > tendrils extended far and deep. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From tobias.hartmann at oracle.com Wed Mar 16 15:51:06 2022 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 16 Mar 2022 16:51:06 +0100 Subject: CFV: New HotSpot Group Member: Ivan Walulya In-Reply-To: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> References: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> Message-ID: <1bde7caa-d7f9-186a-1c52-fc2a0383002a@oracle.com> Vote: yes Best regards, Tobias On 16.03.22 13:46, Kim Barrett wrote: > hotspot-dev at openjdk.java.net > CFV: New HotSpot Group Member: Ivan Walulya > > I hereby nominate Ivan Walulya to Membership in the HotSpot Group. > > Ivan is a JDK Reviewer and a member of the Oracle GC team, primarily working > on G1. He has made many substantial contributions [1] including co-authoring a > major rewrite of G1's remembered sets. He is also a frequent and thorough > reviewer (as I well know). > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Ivan+Walulya%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From tobias.hartmann at oracle.com Wed Mar 16 15:51:20 2022 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 16 Mar 2022 16:51:20 +0100 Subject: CFV: New HotSpot Group Member: Albert Mingkun Yang In-Reply-To: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> References: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> Message-ID: <048407b7-0a82-0f5f-06ca-10f07d69f50f@oracle.com> Vote: yes Best regards, Tobias On 16.03.22 13:49, Kim Barrett wrote: > I hereby nominate Albert Mingkun Yang to Membership in the HotSpot Group. > > Albert is a JDK Reviewer and a member of the Oracle GC team. He has made many > substantial contributions [1] including co-authoring an improved GC thread > controller for ZGC. He is a frequent and thorough reviewer, as well as being a > dedicated code deletion engineer, finding many places to reduce complexity or > remove dead code. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Albert+Mingkun+Yang%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From calvin.cheung at oracle.com Wed Mar 16 15:52:14 2022 From: calvin.cheung at oracle.com (calvin.cheung at oracle.com) Date: Wed, 16 Mar 2022 08:52:14 -0700 Subject: CFV: New HotSpot Group Member: Ivan Walulya In-Reply-To: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> References: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> Message-ID: Vote: yes On 3/16/22 5:46 AM, Kim Barrett wrote: > hotspot-dev at openjdk.java.net > CFV: New HotSpot Group Member: Ivan Walulya > > I hereby nominate Ivan Walulya to Membership in the HotSpot Group. From calvin.cheung at oracle.com Wed Mar 16 15:52:46 2022 From: calvin.cheung at oracle.com (calvin.cheung at oracle.com) Date: Wed, 16 Mar 2022 08:52:46 -0700 Subject: CFV: New HotSpot Group Member: Albert Mingkun Yang In-Reply-To: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> References: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> Message-ID: <1f357981-a4b1-da48-9e28-944de9c87c00@oracle.com> Vote: yes On 3/16/22 5:49 AM, Kim Barrett wrote: > I hereby nominate Albert Mingkun Yang to Membership in the HotSpot Group. From hohensee at amazon.com Wed Mar 16 15:54:08 2022 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 16 Mar 2022 15:54:08 +0000 Subject: CFV: New HotSpot Group Member: Albert Mingkun Yang Message-ID: Vote: yes -----Original Message----- From: hotspot-dev on behalf of Kim Barrett Date: Wednesday, March 16, 2022 at 5:51 AM To: "hotspot-dev at openjdk.java.net" Subject: CFV: New HotSpot Group Member: Albert Mingkun Yang I hereby nominate Albert Mingkun Yang to Membership in the HotSpot Group. Albert is a JDK Reviewer and a member of the Oracle GC team. He has made many substantial contributions [1] including co-authoring an improved GC thread controller for ZGC.
He is a frequent and thorough reviewer, as well as being a dedicated code deletion engineer, finding many places to reduce complexity or remove dead code. Votes are due by Thursday, 31-March-2022 at 12h00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list For Lazy Consensus voting instructions, see [3]. Kim Barrett [1] https://github.com/search?q=author-name%3A%22Albert+Mingkun+Yang%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits [2] https://openjdk.java.net/census [3] https://openjdk.java.net/groups/#member-vote From hohensee at amazon.com Wed Mar 16 15:54:39 2022 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 16 Mar 2022 15:54:39 +0000 Subject: CFV: New HotSpot Group Member: Leo Korinth Message-ID: Vote: yes -----Original Message----- From: hotspot-dev on behalf of Kim Barrett Date: Wednesday, March 16, 2022 at 5:49 AM To: "hotspot-dev at openjdk.java.net" Subject: CFV: New HotSpot Group Member: Leo Korinth I hereby nominate Leo Korinth to Membership in the HotSpot Group. Leo is a JDK Reviewer and a member of the Oracle GC team, primarily working on G1. He has made many substantial contributions [1] including several refactorings in ParallelGC to bring it in-line with other collectors. He also dealt with the main removal of CMS and a number of related cleanups; CMS tendrils extended far and deep. Votes are due by Thursday, 31-March-2022 at 12h00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list For Lazy Consensus voting instructions, see [3].
Kim Barrett [1] https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits [2] https://openjdk.java.net/census [3] https://openjdk.java.net/groups/#member-vote From hohensee at amazon.com Wed Mar 16 15:55:12 2022 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 16 Mar 2022 15:55:12 +0000 Subject: CFV: New HotSpot Group Member: Ivan Walulya Message-ID: <750A3C1B-6C56-4955-B95C-0A9B8C2A5F75@amazon.com> Vote: yes -----Original Message----- From: hotspot-dev on behalf of Kim Barrett Date: Wednesday, March 16, 2022 at 5:47 AM To: "hotspot-dev at openjdk.java.net" Subject: CFV: New HotSpot Group Member: Ivan Walulya hotspot-dev at openjdk.java.net CFV: New HotSpot Group Member: Ivan Walulya I hereby nominate Ivan Walulya to Membership in the HotSpot Group. Ivan is a JDK Reviewer and a member of the Oracle GC team, primarily working on G1. He has made many substantial contributions [1] including co-authoring a major rewrite of G1's remembered sets. He is also a frequent and thorough reviewer (as I well know). Votes are due by Thursday, 31-March-2022 at 12h00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list For Lazy Consensus voting instructions, see [3].
Kim Barrett [1] https://github.com/search?q=author-name%3A%22Ivan+Walulya%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits [2] https://openjdk.java.net/census [3] https://openjdk.java.net/groups/#member-vote From hohensee at amazon.com Wed Mar 16 15:58:14 2022 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 16 Mar 2022 15:58:14 +0000 Subject: CFV: New HotSpot Group Member: Dean Long Message-ID: <59AB1B83-59EC-462F-A2A4-F9A19216B017@amazon.com> Vote: yes -----Original Message----- From: hotspot-dev on behalf of Tobias Hartmann Date: Wednesday, March 16, 2022 at 1:31 AM To: hotspot-dev Source Developers Subject: CFV: New HotSpot Group Member: Dean Long Hi, I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that all incoming compiler bugs are properly handled. Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [3].
Best regards, Tobias [1] https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits [2] https://openjdk.java.net/census#hotspot [3] https://openjdk.java.net/groups/#member-vote From hohensee at amazon.com Wed Mar 16 15:58:38 2022 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 16 Mar 2022 15:58:38 +0000 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov Message-ID: Vote: yes -----Original Message----- From: hotspot-dev on behalf of Tobias Hartmann Date: Wednesday, March 16, 2022 at 1:30 AM To: hotspot-dev Source Developers Subject: CFV: New HotSpot Group Member: Vladimir Ivanov Hi, I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and caching. He is currently deeply involved in Project Panama, working on the Foreign Function Interface and the Vector API. Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [3].
Best regards, Tobias [1] https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits [2] https://openjdk.java.net/census#hotspot [3] https://openjdk.java.net/groups/#member-vote From jbhateja at openjdk.java.net Wed Mar 16 15:59:53 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 16 Mar 2022 15:59:53 GMT Subject: RFR: 8283232: x86: Improve vector broadcast operations [v2] In-Reply-To: References: <1FBk3MauXFxUsyHz9kuhqGI-CtLRgHYmHn1eyyaDLvs=.6d4d94b0-32a0-42dc-a181-87df8d8f3b65@github.com> Message-ID: On Wed, 16 Mar 2022 14:52:07 GMT, Quan Anh Mai wrote: > Hi, forwarding results within the same bypass domain does not result in delay, data bypass delay happens when the data crosses different domains, according to "Intel® 64 and IA-32 Architectures Optimization Reference Manual" > > > When a source of a micro-op executed in one stack comes from a micro-op executed in another stack, a delay can occur. The delay occurs also for transitions between Intel SSE integer and Intel SSE floating-point operations. In some of the cases, the data transition is done using a micro-op that is added to the instruction flow. > > The manual mentions the guideline at section 3.5.2.2 > > ![image](https://user-images.githubusercontent.com/49088128/158618209-c0674ba7-1c93-4014-a7e1-330f4e5846da.png) > > Thanks. Thanks, I meant to refer to the above text. I have removed the incorrect reference.
------------- PR: https://git.openjdk.java.net/jdk/pull/7832 From hohensee at amazon.com Wed Mar 16 15:55:40 2022 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 16 Mar 2022 15:55:40 +0000 Subject: CFV: New HotSpot Group Member: Sangheon Kim Message-ID: <9BD4C11C-7627-422B-93C2-FBAF7FFBACCA@amazon.com> Vote: yes -----Original Message----- From: hotspot-dev on behalf of Kim Barrett Date: Wednesday, March 16, 2022 at 5:40 AM To: "hotspot-dev at openjdk.java.net" Subject: CFV: New HotSpot Group Member: Sangheon Kim I hereby nominate Sangheon Kim to Membership in the HotSpot Group. Sangheon has been a JDK Reviewer and member of the Oracle GC team for many years, primarily working on G1. He has made many substantial contributions [1] including to NUMA support and improving GC thread configuration. Votes are due by Thursday, 31-March-2022 at 12h00 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list For Lazy Consensus voting instructions, see [3]. Kim Barrett [1] https://github.com/search?q=author-name%3A%22Sangheon+Kim%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits [2] https://openjdk.java.net/census [3] https://openjdk.java.net/groups/#member-vote From daniel.daugherty at oracle.com Wed Mar 16 16:27:11 2022 From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com) Date: Wed, 16 Mar 2022 12:27:11 -0400 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: <34669759-58dd-f839-55fb-3e800281c3a1@oracle.com> Vote: yes On 3/16/22 4:29 AM, Tobias Hartmann wrote: > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer.
Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. > > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From daniel.daugherty at oracle.com Wed Mar 16 16:28:09 2022 From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com) Date: Wed, 16 Mar 2022 12:28:09 -0400 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: <3a782e24-b313-f818-96c0-c0737c7149d5@oracle.com> Vote: yes Dan On 3/16/22 4:29 AM, Tobias Hartmann wrote: > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. 
> > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. > > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From daniel.daugherty at oracle.com Wed Mar 16 16:29:34 2022 From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com) Date: Wed, 16 Mar 2022 12:29:34 -0400 Subject: CFV: New HotSpot Group Member: Sangheon Kim In-Reply-To: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> References: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> Message-ID: <182668d4-362b-fc56-1b28-b584fdf30325@oracle.com> Vote: yes Dan On 3/16/22 8:40 AM, Kim Barrett wrote: > I hereby nominate Sangheon Kim to Membership in the HotSpot Group. > > Sangheon has been a JDK Reviewer and member of the Oracle GC team for > many years, primarily working on G1. He has made many substantial > contributions [1] including to NUMA support and improving GC thread > configuration. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Sangheon+Kim%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From daniel.daugherty at oracle.com Wed Mar 16 16:31:13 2022 From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com) Date: Wed, 16 Mar 2022 12:31:13 -0400 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: <1fd78fd6-d292-410d-a849-28ab082ca66d@oracle.com> Vote: yes Dan On 3/16/22 8:47 AM, Kim Barrett wrote: > I hereby nominate Leo Korinth to Membership in the HotSpot Group. > > Leo is a JDK Reviewer and a member of the Oracle GC team, primarily working on > G1. He has made many substantial contributions [1] including several > refactorings in ParallelGC to bring it in-line with other collectors. He also > dealt with the main removal of CMS and a number of related cleanups; CMS > tendrils extended far and deep. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From daniel.daugherty at oracle.com Wed Mar 16 16:32:07 2022 From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com) Date: Wed, 16 Mar 2022 12:32:07 -0400 Subject: CFV: New HotSpot Group Member: Albert Mingkun Yang In-Reply-To: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> References: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> Message-ID: Vote: yes Dan On 3/16/22 8:49 AM, Kim Barrett wrote: > I hereby nominate Albert Mingkun Yang to Membership in the HotSpot Group. > > Albert is a JDK Reviewer and a member of the Oracle GC team. He has made many > substantial contributions [1] including co-authoring an improved GC thread > controller for ZGC. He is a frequent and thorough reviewer, as well as being a > dedicated code deletion engineer, finding many places to reduce complexity or > remove dead code. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Albert+Mingkun+Yang%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From jbhateja at openjdk.java.net Wed Mar 16 17:28:43 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 16 Mar 2022 17:28:43 GMT Subject: RFR: 8283232: x86: Improve vector broadcast operations [v2] In-Reply-To: References: <1FBk3MauXFxUsyHz9kuhqGI-CtLRgHYmHn1eyyaDLvs=.6d4d94b0-32a0-42dc-a181-87df8d8f3b65@github.com> Message-ID: On Wed, 16 Mar 2022 15:56:44 GMT, Jatin Bhateja wrote: > > Hi, forwarding results within the same bypass domain does not result in delay, data bypass delay happens when the data crosses different domains, according to "Intel® 64 and IA-32 Architectures Optimization Reference Manual" > > > When a source of a micro-op executed in one stack comes from a micro-op executed in another stack, a delay can occur. The delay occurs also for transitions between Intel SSE integer and Intel SSE floating-point operations. In some of the cases, the data transition is done using a micro-op that is added to the instruction flow. > > > > > > The manual mentions the guideline at section 3.5.2.2 > > ![image](https://user-images.githubusercontent.com/49088128/158618209-c0674ba7-1c93-4014-a7e1-330f4e5846da.png) > > Thanks. > > Thanks meant to refer to above text. I have removed incorrect reference. It will still be good if we can come up with a micro benchmark, that shows the gain with the patch.
------------- PR: https://git.openjdk.java.net/jdk/pull/7832 From sspitsyn at openjdk.java.net Wed Mar 16 18:07:54 2022 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 16 Mar 2022 18:07:54 GMT Subject: RFR: 8282241: Invalid generic signature for redefined classes [v2] In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 17:12:51 GMT, Alex Menkov wrote: >> JDK-8238048 (fixed in jdk15) moved major_version, minor_version, generic_signature_index and source_file_name_index from InstanceKlass to ConstantPool. >> We still have some incorrect code in CP merge during class redefinition. >> >> rewrite_cp_refs(scratch_class) updates generic_signature_index and source_file_name_index in the scratch_cp, so we need to copy the attributes (merge_cp->copy_fields(scratch_cp())) after rewrite_cp_refs. >> >> In redefine_single_class we don't need to copy source_file_name_index because it's a CP property and we swap CPs. So this copying actually sets the value from old class. >> >> tested: >> - test/jdk/java/lang/instrument >> - test/hotspot/jtreg/serviceability/jvmti/RedefineClasses >> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses >> - test/hotspot/jtreg/vmTestbase/nsk/jvmti/RetransformClasses > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > Reworked the test Alex, The fix looks good. Thank you for taking care about it! Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7676 From dholmes at openjdk.java.net Thu Mar 17 01:58:44 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 17 Mar 2022 01:58:44 GMT Subject: RFR: 8283147: Include NonJavaThread stacktrace during thread dump [v3] In-Reply-To: References: Message-ID: On Wed, 16 Mar 2022 07:48:42 GMT, Yi Yang wrote: >> Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. 
The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> 8283147: Include NonJavaThread stacktrace during thread dump > > Clarification on why there are some one-line frames: > > VMError::print_native_stack output > > "G1 Conc#7" os_prio=0 cpu=2.33ms elapsed=11.39s tid=0x00007f635c004d70 nid=71098 runnable > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [libpthread.so.0+0xdb39] do_futex_wait.constprop.1+0x29 > > > pstack output > > Thread 40 (Thread 0x7f62b3bf7700 (LWP 71098)): > #0 0x00007f63d32a6b3b in do_futex_wait.constprop () from /lib64/libpthread.so.0 > #1 0x00007f63d32a6bcf in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0 > #2 0x00007f63d32a6c6b in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0 > #3 0x00007f63d23f7c32 in PosixSemaphore::wait (this=this@entry=0x7f63cc077e78) at /home/qingfeng.yy/jdktip/src/hotspot/os/posix/semaphore_posix.cpp:65 > #4 0x00007f63d265b81b in Semaphore::wait (this=0x7f63cc077e78) at /home/qingfeng.yy/jdktip/src/hotspot/share/runtime/semaphore.hpp:55 > #5 WorkerTaskDispatcher::worker_run_task (this=0x7f63cc077e68) at /home/qingfeng.yy/jdktip/src/hotspot/share/gc/shared/workerThread.cpp:60 > #6 WorkerThread::run (this=0x7f635c004d70) at /home/qingfeng.yy/jdktip/src/hotspot/share/gc/shared/workerThread.cpp:163 > #7 0x00007f63d25aa790 in Thread::call_run (this=this@entry=0x7f635c004d70) at /home/qingfeng.yy/jdktip/src/hotspot/share/runtime/thread.cpp:357 > #8 0x00007f63d2348a28 in thread_native_entry (thread=0x7f635c004d70) at /home/qingfeng.yy/jdktip/src/hotspot/os/linux/os_linux.cpp:706 > #9 0x00007f63d32a0ea5 in start_thread () from /lib64/libpthread.so.0 > #10 0x00007f63d2dc58dd in clone () from /lib64/libc.so.6 > > > The top frame is as follows: > > C frame (sp=0x00007f6338ca0d90 unextended sp=0x00007f6338ca0d90, fp=0x00007f63cc0667c8,
real_fp=0x00007f63cc0667c8, pc=0x00007f63d32a6b39 link=0x0000000900000000) > > do_futex_wait.constprop don't have a valid link/last_frame_pointer, because libpthread has some novel assembly code: > > > 000000000000db10 : > db10: 55 push %rbp > db11: 48 89 fd mov %rdi,%rbp > db14: 53 push %rbx > db15: 48 83 ec 08 sub $0x8,%rsp > db19: 8b 5f 08 mov 0x8(%rdi),%ebx > db1c: e8 1f 09 00 00 callq e440 <__pthread_enable_asynccancel> > db21: 45 31 d2 xor %r10d,%r10d > db24: 41 89 c0 mov %eax,%r8d > db27: 31 d2 xor %edx,%edx > db29: 89 de mov %ebx,%esi > db2b: bb ca 00 00 00 mov $0xca,%ebx > db30: 48 89 ef mov %rbp,%rdi > db33: 40 80 f6 80 xor $0x80,%sil > db37: 89 d8 mov %ebx,%eax > db39: 0f 05 syscall > db3b: 89 c3 mov %eax,%ebx > .... > 000000000000db80 <__new_sem_wait_slow.constprop.0>: > db80: 41 54 push %r12 > db82: 48 b8 00 00 00 00 01 movabs $0x100000000,%rax > db89: 00 00 00 > db8c: 55 push %rbp > db8d: 53 push %rbx > db8e: 48 89 fb mov %rdi,%rbx > db91: 48 83 ec 30 sub $0x30,%rsp > db95: f0 48 0f c1 07 lock xadd %rax,(%rdi) > db9a: 48 8d 35 5f ff ff ff lea -0xa1(%rip),%rsi # db00 <__sem_wait_cleanup> > dba1: 49 bc ff ff ff ff fe movabs $0xfffffffeffffffff,%r12 > dba8: ff ff ff > dbab: 48 8d 6c 24 10 lea 0x10(%rsp),%rbp > dbb0: 48 89 fa mov %rdi,%rdx > dbb3: 48 89 04 24 mov %rax,(%rsp) > dbb7: 48 89 ef mov %rbp,%rdi > dbba: e8 b1 04 00 00 callq e070 <_pthread_cleanup_push> > dbbf: 48 8b 04 24 mov (%rsp),%rax > dbc3: 85 c0 test %eax,%eax > .... > > So os::is_first_C_frame returns earlier. To support walking pthread library, I don't think it requires huge efforts, though. @kelthuzadx I do not agree with this enhancement request. I don't think it is the job of jcmd/jstack to do this. Those tools are concerned with application introspection not VM debugging. 
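To illustrate the explanation quoted above (the native stack walk stops at the top frame because `do_futex_wait.constprop` reuses `%rbp` as a general-purpose register, see `mov %rdi,%rbp` in the disassembly), here is a minimal sketch of a frame-pointer chain walk over simulated stack memory. It is not HotSpot's implementation: `Memory`, `Frame`, `plausible_fp`, `walk` and `make_demo_stack` are hypothetical names, the stack bounds and addresses are made up, and the plausibility check only loosely stands in for the kind of test `os::is_first_C_frame` performs.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Simulated stack memory: address -> 64-bit word stored there (assumption:
// x86-64 layout, saved caller fp at [fp], return address at [fp + 8]).
using Memory = std::map<std::uint64_t, std::uint64_t>;

struct Frame {
    std::uint64_t fp;  // frame-pointer register value (rbp) for this frame
    std::uint64_t pc;  // program counter inside this frame
};

// Hypothetical sanity check: fp must be 8-byte aligned and leave room for the
// saved-fp and return-address slots inside [stack_lo, stack_hi).
bool plausible_fp(std::uint64_t fp, std::uint64_t stack_lo, std::uint64_t stack_hi) {
    return fp % 8 == 0 && fp >= stack_lo && fp < stack_hi && stack_hi - fp >= 16;
}

// Follow the saved-fp chain starting from the top frame. The top frame is
// always reported; the walk stops as soon as the current fp is implausible.
std::vector<Frame> walk(const Memory& mem, Frame top,
                        std::uint64_t stack_lo, std::uint64_t stack_hi) {
    std::vector<Frame> frames{top};
    Frame cur = top;
    while (frames.size() < 64 && plausible_fp(cur.fp, stack_lo, stack_hi)) {
        auto saved_fp = mem.find(cur.fp);
        auto ret_addr = mem.find(cur.fp + 8);
        if (saved_fp == mem.end() || ret_addr == mem.end()) break;
        Frame caller{saved_fp->second, ret_addr->second};
        frames.push_back(caller);
        if (caller.fp <= cur.fp) break;  // the chain must move up the stack
        cur = caller;
    }
    return frames;
}

// Three well-formed frames on a made-up stack spanning [0x1000, 0x2000).
Memory make_demo_stack() {
    return Memory{
        {0x1100, 0x1200}, {0x1108, 0xAAA},
        {0x1200, 0x1300}, {0x1208, 0xBBB},
        {0x1300, 0x0},    {0x1308, 0xCCC},
    };
}
```

With a well-formed chain, starting from `Frame{0x1100, 0x111}` yields the top frame plus three callers. Starting from a top frame whose fp is the non-stack value reported in the thread (`0x00007f63cc0667c8`) yields only the top frame, which matches the one-line output from `VMError::print_native_stack`. Tools such as pstack typically rely on unwind information rather than the frame-pointer chain, which would explain why they can still show the full trace.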
------------- PR: https://git.openjdk.java.net/jdk/pull/7833 From david.holmes at oracle.com Thu Mar 17 02:34:15 2022 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Mar 2022 12:34:15 +1000 Subject: CFV: New HotSpot Group Member: Albert Mingkun Yang In-Reply-To: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> References: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> Message-ID: Vote: yes Thanks, David On 16/03/2022 10:49 pm, Kim Barrett wrote: > I hereby nominate Albert Mingkun Yang to Membership in the HotSpot Group. > > Albert is a JDK Reviewer and a member of the Oracle GC team. He has made many > substantial contributions [1] including co-authoring an improved GC thread > controller for ZGC. He is a frequent and thorough reviewer, as well as being a > dedicated code deletion engineer, finding many places to reduce complexity or > remove dead code. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Albert+Mingkun+Yang%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From david.holmes at oracle.com Thu Mar 17 02:34:43 2022 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Mar 2022 12:34:43 +1000 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: <3733df0e-b136-bb3b-292b-5ce829784804@oracle.com> Vote: yes Thanks, David On 16/03/2022 6:29 pm, Tobias Hartmann wrote: > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. 
> > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. > > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From david.holmes at oracle.com Thu Mar 17 02:35:05 2022 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Mar 2022 12:35:05 +1000 Subject: CFV: New HotSpot Group Member: Ivan Walulya In-Reply-To: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> References: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> Message-ID: Vote: yes Thanks, David On 16/03/2022 10:46 pm, Kim Barrett wrote: > hotspot-dev at openjdk.java.net > CFV: New HotSpot Group Member: Ivan Walulya > > I hereby nominate Ivan Walulya to Membership in the HotSpot Group. > > Ivan is a JDK Reviewer and a member of the Oracle GC team, primarily working > on G1. He has made many substantial contributions [1] including co-authoring a > major rewrite of G1's remembered sets. He is also a frequent and thorough > reviewer (as I well know). > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. 
Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Ivan+Walulya%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From david.holmes at oracle.com Thu Mar 17 02:35:31 2022 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Mar 2022 12:35:31 +1000 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: Vote: yes Thanks, David On 16/03/2022 10:47 pm, Kim Barrett wrote: > I hereby nominate Leo Korinth to Membership in the HotSpot Group. > > Leo is a JDK Reviewer and a member of the Oracle GC team, primarily working on > G1. He has made many substantial contributions [1] including several > refactorings in ParallelGC to bring it in-line with other collectors. He also > dealt with the main removal of CMS and a number of related cleanups; CMS > tendrils extended far and deep. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From david.holmes at oracle.com Thu Mar 17 02:35:57 2022 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Mar 2022 12:35:57 +1000 Subject: CFV: New HotSpot Group Member: Sangheon Kim In-Reply-To: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> References: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> Message-ID: <5af2a8aa-28cd-eff1-a7e2-bf6b804f6100@oracle.com> Vote: yes Thanks, David On 16/03/2022 10:40 pm, Kim Barrett wrote: > I hereby nominate Sangheon Kim to Membership in the HotSpot Group. > > Sangheon has been a JDK Reviewer and member of the Oracle GC team for > many years, primarily working on G1. He has made many substantial > contributions [1] including to NUMA support and improving GC thread > configuration. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Sangheon+Kim%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From david.holmes at oracle.com Thu Mar 17 02:36:40 2022 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Mar 2022 12:36:40 +1000 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: <30f9899e-4348-2ebc-fe04-b97453226ec6@oracle.com> Vote: yes Thanks, David On 16/03/2022 6:29 pm, Tobias Hartmann wrote: > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From mikael.vidstedt at oracle.com Thu Mar 17 05:28:03 2022 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 17 Mar 2022 05:28:03 +0000 Subject: CFV: New HotSpot Group Member: Albert Mingkun Yang In-Reply-To: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> References: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> Message-ID: Vote: yes Cheers, Mikael > On Mar 16, 2022, at 5:49 AM, Kim Barrett wrote: > > I hereby nominate Albert Mingkun Yang to Membership in the HotSpot Group. > > Albert is a JDK Reviewer and a member of the Oracle GC team. He has made many > substantial contributions [1] including co-authoring an improved GC thread > controller for ZGC. He is a frequent and thorough reviewer, as well as being a > dedicated code deletion engineer, finding many places to reduce complexity or > remove dead code. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Albert+Mingkun+Yang%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From mikael.vidstedt at oracle.com Thu Mar 17 05:28:55 2022 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 17 Mar 2022 05:28:55 +0000 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: <77919E44-A1F1-4FE0-AE7B-8A62A06D3036@oracle.com> Vote: yes Cheers, Mikael > On Mar 16, 2022, at 1:29 AM, Tobias Hartmann wrote: > > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From mikael.vidstedt at oracle.com Thu Mar 17 05:29:54 2022 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 17 Mar 2022 05:29:54 +0000 Subject: CFV: New HotSpot Group Member: Ivan Walulya In-Reply-To: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> References: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> Message-ID: Vote: yes Cheers, Mikael > On Mar 16, 2022, at 5:46 AM, Kim Barrett wrote: > > hotspot-dev at openjdk.java.net > CFV: New HotSpot Group Member: Ivan Walulya > > I hereby nominate Ivan Walulya to Membership in the HotSpot Group. > > Ivan is a JDK Reviewer and a member of the Oracle GC team, primarily working > on G1. He has made many substantial contributions [1] including co-authoring a > major rewrite of G1's remembered sets. He is also a frequent and thorough > reviewer (as I well know). > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Ivan+Walulya%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From mikael.vidstedt at oracle.com Thu Mar 17 05:30:21 2022 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 17 Mar 2022 05:30:21 +0000 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: <1F9B9B3A-5CC2-4ED9-8A42-7D19ACEEDD03@oracle.com> Vote: yes Cheers, Mikael > On Mar 16, 2022, at 5:47 AM, Kim Barrett wrote: > > I hereby nominate Leo Korinth to Membership in the HotSpot Group. > > Leo is a JDK Reviewer and a member of the Oracle GC team, primarily working on > G1. He has made many substantial contributions [1] including several > refactorings in ParallelGC to bring it in-line with other collectors. He also > dealt with the main removal of CMS and a number of related cleanups; CMS > tendrils extended far and deep. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From mikael.vidstedt at oracle.com Thu Mar 17 05:30:39 2022 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 17 Mar 2022 05:30:39 +0000 Subject: CFV: New HotSpot Group Member: Sangheon Kim In-Reply-To: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> References: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> Message-ID: <58FB0523-919B-4F39-B52D-D3187320377A@oracle.com> Vote: yes Cheers, Mikael > On Mar 16, 2022, at 5:40 AM, Kim Barrett wrote: > > I hereby nominate Sangheon Kim to Membership in the HotSpot Group. > > Sangheon has been a JDK Reviewer and member of the Oracle GC team for > many years, primarily working on G1. He has made many substantial > contributions [1] including to NUMA support and improving GC thread > configuration. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Sangheon+Kim%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From mikael.vidstedt at oracle.com Thu Mar 17 05:31:15 2022 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 17 Mar 2022 05:31:15 +0000 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: <3C9BF102-B99F-4E2A-BEDE-F06990C56B9A@oracle.com> Vote: yes Cheers, Mikael > On Mar 16, 2022, at 1:29 AM, Tobias Hartmann wrote: > > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From dholmes at openjdk.java.net Thu Mar 17 06:36:48 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 17 Mar 2022 06:36:48 GMT Subject: RFR: 8283056: show abstract machine code for all VM crashes In-Reply-To: References: Message-ID: <7gcGGa7xxIW3W6ncNPZ4rCb_EO9n_QxvITCnBxyvTJE=.ad5b5614-d8dd-4539-bd14-edd78eb209fa@github.com> On Fri, 11 Mar 2022 20:48:06 GMT, Doug Simon wrote: > [JDK-8272586](https://bugs.openjdk.java.net/browse/JDK-8272586) added abstract assembly to hs-err for methods on the stack of the crashing thread. However, it only does this if the crash is due to an unhandled signal. It can also be useful to see assembly for crashes due to failing VM assertions or guarantees. This PR implements this improvement. Hi Doug, I'm approving in principle but would like to see the code tweaked as per the comment below. Thanks, David src/hotspot/share/utilities/vmError.cpp line 915: > 913: if (_verbose) { > 914: frame fr = _context ? os::fetch_frame_from_context(_context) > 915: : os::current_frame(); This seems like it is only need in the else branch at L931, so I'd prefer to see it there please. ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7791 From erik.osterlund at oracle.com Thu Mar 17 07:11:08 2022 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Thu, 17 Mar 2022 07:11:08 +0000 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: <6C082596-24F9-4AD4-930E-CE92276721F4@oracle.com> Vote: yes /Erik > On 16 Mar 2022, at 13:48, Kim Barrett wrote: > > I hereby nominate Leo Korinth to Membership in the HotSpot Group. > > Leo is a JDK Reviewer and a member of the Oracle GC team, primarily working on > G1. He has made many substantial contributions [1] including several > refactorings in ParallelGC to bring it in-line with other collectors. He also > dealt with the main removal of CMS and a number of related cleanups; CMS > tendrils extended far and deep. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From per.liden at oracle.com Thu Mar 17 07:27:17 2022 From: per.liden at oracle.com (Per Liden) Date: Thu, 17 Mar 2022 08:27:17 +0100 Subject: CFV: New HotSpot Group Member: Albert Mingkun Yang In-Reply-To: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> References: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> Message-ID: Vote: yes /Per On 3/16/22 13:49, Kim Barrett wrote: > I hereby nominate Albert Mingkun Yang to Membership in the HotSpot Group. > > Albert is a JDK Reviewer and a member of the Oracle GC team.
He has made many > substantial contributions [1] including co-authoring an improved GC thread > controller for ZGC. He is a frequent and thorough reviewer, as well as being a > dedicated code deletion engineer, finding many places to reduce complexity or > remove dead code. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Albert+Mingkun+Yang%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From thomas.schatzl at oracle.com Thu Mar 17 08:15:57 2022 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Mar 2022 09:15:57 +0100 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: Vote: yes On 16.03.22 09:29, Tobias Hartmann wrote: > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From thomas.schatzl at oracle.com Thu Mar 17 08:17:02 2022 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Mar 2022 09:17:02 +0100 Subject: CFV: New HotSpot Group Member: Albert Mingkun Yang In-Reply-To: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> References: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> Message-ID: <98c883bc-989a-8857-26ac-6d3cf5bcfb77@oracle.com> Vote: yes On 16.03.22 13:49, Kim Barrett wrote: > I hereby nominate Albert Mingkun Yang to Membership in the HotSpot Group. > > Albert is a JDK Reviewer and a member of the Oracle GC team. He has made many > substantial contributions [1] including co-authoring an improved GC thread > controller for ZGC. He is a frequent and thorough reviewer, as well as being a > dedicated code deletion engineer, finding many places to reduce complexity or > remove dead code. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Albert+Mingkun+Yang%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From thomas.schatzl at oracle.com Thu Mar 17 08:17:15 2022 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Mar 2022 09:17:15 +0100 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: Vote: yes On 16.03.22 13:47, Kim Barrett wrote: > I hereby nominate Leo Korinth to Membership in the HotSpot Group. > > Leo is a JDK Reviewer and a member of the Oracle GC team, primarily working on > G1. He has made many substantial contributions [1] including several > refactorings in ParallelGC to bring it in-line with other collectors. He also > dealt with the main removal of CMS and a number of related cleanups; CMS > tendrils extended far and deep. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From thomas.schatzl at oracle.com Thu Mar 17 08:17:55 2022 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Mar 2022 09:17:55 +0100 Subject: CFV: New HotSpot Group Member: Sangheon Kim In-Reply-To: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> References: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> Message-ID: Vote: yes On 16.03.22 13:40, Kim Barrett wrote: > I hereby nominate Sangheon Kim to Membership in the HotSpot Group. 
> > Sangheon has been a JDK Reviewer and member of the Oracle GC team for > many years, primarily working on G1. He has made many substantial > contributions [1] including to NUMA support and improving GC thread > configuration. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Sangheon+Kim%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From thomas.schatzl at oracle.com Thu Mar 17 08:18:09 2022 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Mar 2022 09:18:09 +0100 Subject: CFV: New HotSpot Group Member: Ivan Walulya In-Reply-To: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> References: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> Message-ID: <828ac0f1-177c-f235-3016-5c09a62ae0d7@oracle.com> Vote: yes On 16.03.22 13:46, Kim Barrett wrote: > hotspot-dev at openjdk.java.net > CFV: New HotSpot Group Member: Ivan Walulya > > I hereby nominate Ivan Walulya to Membership in the HotSpot Group. > > Ivan is a JDK Reviewer and a member of the Oracle GC team, primarily working > on G1. He has made many substantial contributions [1] including co-authoring a > major rewrite of G1's remembered sets. He is also a frequent and thorough > reviewer (as I well know). > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Ivan+Walulya%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From thomas.schatzl at oracle.com Thu Mar 17 08:18:39 2022 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Mar 2022 09:18:39 +0100 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: <465e1ff4-a5b1-71b3-2326-96b557aa048c@oracle.com> Vote: yes On 16.03.22 09:29, Tobias Hartmann wrote: > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From per.liden at oracle.com Thu Mar 17 08:31:19 2022 From: per.liden at oracle.com (Per Liden) Date: Thu, 17 Mar 2022 09:31:19 +0100 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: <27c5199c-b0a7-fbcf-8aba-4f5196345758@oracle.com> Vote: yes /Per On 3/16/22 09:29, Tobias Hartmann wrote: > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. > > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From per.liden at oracle.com Thu Mar 17 08:31:58 2022 From: per.liden at oracle.com (Per Liden) Date: Thu, 17 Mar 2022 09:31:58 +0100 Subject: CFV: New HotSpot Group Member: Ivan Walulya In-Reply-To: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> References: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> Message-ID: Vote: yes /Per On 3/16/22 13:46, Kim Barrett wrote: > hotspot-dev at openjdk.java.net > CFV: New HotSpot Group Member: Ivan Walulya > > I hereby nominate Ivan Walulya to Membership in the HotSpot Group. > > Ivan is a JDK Reviewer and a member of the Oracle GC team, primarily working > on G1. He has made many substantial contributions [1] including co-authoring a > major rewrite of G1's remembered sets. He is also a frequent and thorough > reviewer (as I well know). > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Ivan+Walulya%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From per.liden at oracle.com Thu Mar 17 08:32:32 2022 From: per.liden at oracle.com (Per Liden) Date: Thu, 17 Mar 2022 09:32:32 +0100 Subject: CFV: New HotSpot Group Member: Sangheon Kim In-Reply-To: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> References: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> Message-ID: Vote: yes /Per On 3/16/22 13:40, Kim Barrett wrote: > I hereby nominate Sangheon Kim to Membership in the HotSpot Group. > > Sangheon has been a JDK Reviewer and member of the Oracle GC team for > many years, primarily working on G1. He has made many substantial > contributions [1] including to NUMA support and improving GC thread > configuration. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Sangheon+Kim%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From per.liden at oracle.com Thu Mar 17 08:32:50 2022 From: per.liden at oracle.com (Per Liden) Date: Thu, 17 Mar 2022 09:32:50 +0100 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: <39646235-2290-f467-146e-88c764a82fc3@oracle.com> Vote: yes /Per On 3/16/22 13:47, Kim Barrett wrote: > I hereby nominate Leo Korinth to Membership in the HotSpot Group. > > Leo is a JDK Reviewer and a member of the Oracle GC team, primarily working on > G1. 
He has made many substantial contributions [1] including several > refactorings in ParallelGC to bring it in-line with other collectors. He also > dealt with the main removal of CMS and a number of related cleanups; CMS > tendrils extended far and deep. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From per.liden at oracle.com Thu Mar 17 08:33:06 2022 From: per.liden at oracle.com (Per Liden) Date: Thu, 17 Mar 2022 09:33:06 +0100 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: <284d439a-46ff-b5a0-4de3-95c84615b318@oracle.com> Vote: yes /Per On 3/16/22 09:29, Tobias Hartmann wrote: > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From jiefu at openjdk.java.net Thu Mar 17 08:36:03 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Thu, 17 Mar 2022 08:36:03 GMT Subject: RFR: 8283298: Make CodeCacheSegmentSize a product flag Message-ID: Hi all, As discussed in https://github.com/openjdk/jdk/pull/7830, this patch makes `CodeCacheSegmentSize` a product flag. It also fixes two bugs when testing the release VM with CodeEntryAlignment={512, 1024}. Please review it. Thanks. Best regards, Jie ------------- Commit messages: - 8283298: Make CodeCacheSegmentSize a product flag Changes: https://git.openjdk.java.net/jdk/pull/7851/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7851&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283298 Stats: 10 lines in 4 files changed: 4 ins; 1 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/7851.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7851/head:pull/7851 PR: https://git.openjdk.java.net/jdk/pull/7851 From redestad at openjdk.java.net Thu Mar 17 09:13:23 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 17 Mar 2022 09:13:23 GMT Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives [v16] In-Reply-To: References: Message-ID: <1pPTN2fxiRObl90zDb2ObrnuKswJ7Z42TUox2-XDhSY=.e2c705ca-f057-45a0-b687-fc473bd28305@github.com> > I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. 
This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. > > Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 > > - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. > > - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. > > - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). 
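For readers following the thread, the semantics described above can be sketched in plain Java. This is only an illustrative scalar version, assuming the method takes a `byte[]`, an offset and a length as the description suggests. The class name `CountPositivesSketch` and both method bodies are hypothetical; they are not the intrinsic and not the JDK's actual source.

```java
// Hypothetical scalar sketch of the behaviour described in the PR text.
class CountPositivesSketch {

    // Returns the number of leading non-negative bytes in ba[off, off + len).
    // A result equal to len means the whole range is ASCII-compatible.
    static int countPositives(byte[] ba, int off, int len) {
        int limit = off + len;
        for (int i = off; i < limit; i++) {
            if (ba[i] < 0) {
                return i - off; // first negative byte found: report the prefix length
            }
        }
        return len;
    }

    // The old boolean question can be answered from the count alone.
    static boolean hasNegatives(byte[] ba, int off, int len) {
        return countPositives(ba, off, len) != len;
    }

    public static void main(String[] args) {
        byte[] mixed = {104, (byte) 0xE9, 108}; // 'h', a non-ASCII latin1 byte, 'l'
        System.out.println(countPositives(mixed, 0, mixed.length));
        System.out.println(hasNegatives(mixed, 0, mixed.length));
    }
}
```

The point of returning a count rather than a boolean is that a caller can treat the ASCII prefix on a fast path and only fall back to slower handling from the first negative byte onward.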
Claes Redestad has updated the pull request incrementally with one additional commit since the last revision: Disallow negative values in TestCountPositives test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7231/files - new: https://git.openjdk.java.net/jdk/pull/7231/files/bc5a8c80..6f22e1aa Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=15 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=14-15 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7231.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231 PR: https://git.openjdk.java.net/jdk/pull/7231 From redestad at openjdk.java.net Thu Mar 17 09:24:38 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 17 Mar 2022 09:24:38 GMT Subject: Integrated: 8281146: Replace StringCoding.hasNegatives with countPositives In-Reply-To: References: Message-ID: On Wed, 26 Jan 2022 12:51:31 GMT, Claes Redestad wrote: > I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input. > > Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 > > - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups. 
> > - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field. > > - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix). This pull request has now been integrated. Changeset: beedae11 Author: Claes Redestad URL: https://git.openjdk.java.net/jdk/commit/beedae1141b6b650dc4cedf1f038afc1c8b460dd Stats: 619 lines in 36 files changed: 278 ins; 61 del; 280 mod 8281146: Replace StringCoding.hasNegatives with countPositives Co-authored-by: Lutz Schmidt Co-authored-by: Martin Doerr Reviewed-by: kvn, lucy, rriggs ------------- PR: https://git.openjdk.java.net/jdk/pull/7231 From dnsimon at openjdk.java.net Thu Mar 17 10:30:19 2022 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 17 Mar 2022 10:30:19 GMT Subject: RFR: 8283056: show abstract machine code for all VM crashes [v2] In-Reply-To: References: Message-ID: > [JDK-8272586](https://bugs.openjdk.java.net/browse/JDK-8272586) added abstract assembly to hs-err for methods on the stack of the crashing thread. However, it only does this if the crash is due to an unhandled signal. It can also be useful to see assembly for crashes due to failing VM assertions or guarantees. This PR implements this improvement. 
Doug Simon has updated the pull request incrementally with one additional commit since the last revision: moved fr declaration to block where it is used ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7791/files - new: https://git.openjdk.java.net/jdk/pull/7791/files/3ef8637d..a1adb8aa Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7791&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7791&range=00-01 Stats: 4 lines in 1 file changed: 2 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7791.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7791/head:pull/7791 PR: https://git.openjdk.java.net/jdk/pull/7791 From dnsimon at openjdk.java.net Thu Mar 17 10:33:33 2022 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 17 Mar 2022 10:33:33 GMT Subject: RFR: 8283056: show abstract machine code in hs-err for all VM crashes [v2] In-Reply-To: <7gcGGa7xxIW3W6ncNPZ4rCb_EO9n_QxvITCnBxyvTJE=.ad5b5614-d8dd-4539-bd14-edd78eb209fa@github.com> References: <7gcGGa7xxIW3W6ncNPZ4rCb_EO9n_QxvITCnBxyvTJE=.ad5b5614-d8dd-4539-bd14-edd78eb209fa@github.com> Message-ID: On Thu, 17 Mar 2022 06:33:44 GMT, David Holmes wrote: >> Doug Simon has updated the pull request incrementally with one additional commit since the last revision: >> >> moved fr declaration to block where it is used > > Hi Doug, > > I'm approving in principle but would like to see the code tweaked as per the comment below. 
> > Thanks, > David Thanks @dholmes-ora , I've made the suggested change: a1adb8aae99312b2a2b86fa4a6c57a8493484e4b ------------- PR: https://git.openjdk.java.net/jdk/pull/7791 From duke at openjdk.java.net Thu Mar 17 12:05:18 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Thu, 17 Mar 2022 12:05:18 GMT Subject: RFR: 8283232: x86: Improve vector broadcast operations [v3] In-Reply-To: References: Message-ID: <_BenQBPyAIy6sOba4OwJLA-XusVv9-QWa6uH867eNRs=.75b35402-e739-4133-a091-18755a5ed8c4@github.com> > Hi, > > This patch improves the generation of broadcasting a scalar in several ways: > > - Avoid potential data bypass delay which can be observed on some platforms by using the correct type of instruction if it does not require extra instructions. > - As it has been pointed out, dumping the whole vector into the constant table is costly in terms of code size, this patch minimises this overhead for vector replicate of constants. Also, options are available for constants to be generated with more alignment so that vector load can be made efficiently without crossing cache lines. > - Vector broadcasting should prefer rematerialising to spilling when register pressure is high. > > This patch also removes some redundant code paths and rename some incorrectly named instructions. > > Thank you very much. 
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: fix rematerialize, constant deduplication ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7832/files - new: https://git.openjdk.java.net/jdk/pull/7832/files/8216d790..3bc7731a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=01-02 Stats: 27 lines in 3 files changed: 23 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7832.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7832/head:pull/7832 PR: https://git.openjdk.java.net/jdk/pull/7832 From tanksherman27 at gmail.com Thu Mar 17 12:19:08 2022 From: tanksherman27 at gmail.com (Julian Waters) Date: Thu, 17 Mar 2022 20:19:08 +0800 Subject: Review of JEP draft Message-ID: Hi everyone, If you don't mind a little reading, can I get a review of the following JEP draft at https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8283291?filter=allissues&orderby=created+DESC%2C+priority+DESC%2C+updated+DESC? Apologies if this is not the proper way to submit a JEP, I'm a little new to this. Have a great day! best regards, Julian From james.laskey at oracle.com Thu Mar 17 12:38:32 2022 From: james.laskey at oracle.com (Jim Laskey) Date: Thu, 17 Mar 2022 12:38:32 +0000 Subject: Review of JEP draft In-Reply-To: References: Message-ID: <460B817B-8C22-414F-AD07-1489C988A2BE@oracle.com> The question mark at the end of your link is messing things up. Try: https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8283291?filter=allissues&orderby=created+DESC%2C+priority+DESC%2C+updated+DESC On Mar 17, 2022, at 9:19 AM, Julian Waters wrote: Hi everyone, If you don't mind a little reading, can I get a review of the following JEP draft at https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8283291?filter=allissues&orderby=created+DESC%2C+priority+DESC%2C+updated+DESC?
Apologies if this is not the proper way to submit a JEP, I'm a little new to this. Have a great day! best regards, Julian From duke at openjdk.java.net Thu Mar 17 12:51:32 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Thu, 17 Mar 2022 12:51:32 GMT Subject: RFR: 8283232: x86: Improve vector broadcast operations [v3] In-Reply-To: <_BenQBPyAIy6sOba4OwJLA-XusVv9-QWa6uH867eNRs=.75b35402-e739-4133-a091-18755a5ed8c4@github.com> References: <_BenQBPyAIy6sOba4OwJLA-XusVv9-QWa6uH867eNRs=.75b35402-e739-4133-a091-18755a5ed8c4@github.com> Message-ID: On Thu, 17 Mar 2022 12:05:18 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch improves the generation of broadcasting a scalar in several ways: >> >> - Avoid potential data bypass delay which can be observed on some platforms by using the correct type of instruction if it does not require extra instructions. >> - As it has been pointed out, dumping the whole vector into the constant table is costly in terms of code size, this patch minimises this overhead for vector replicate of constants. Also, options are available for constants to be generated with more alignment so that vector load can be made efficiently without crossing cache lines. >> - Vector broadcasting should prefer rematerialising to spilling when register pressure is high. >> >> This patch also removes some redundant code paths and rename some incorrectly named instructions. >> >> Thank you very much. 
> > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > fix rematerialize, constant deduplication

Doing a simple benchmark that has a lot of register pressure

    @Benchmark
    public long broadcastCon() {
        var species = IntVector.SPECIES_PREFERRED;
        var sum = IntVector.zero(species);
        return sum.add(1).add(2).add(3).add(4).add(5).add(6).add(7).add(8)
                .add(9).add(10).add(11).add(12).add(13).add(14).add(15).add(16)
                .add(17).add(18).add(19).add(20).add(21).add(22).add(23).add(24)
                .add(25).add(26).add(27).add(28).add(29).add(30).add(31).add(32)
                .add(1).add(2).add(3).add(4).add(5).add(6).add(7).add(8)
                .add(9).add(10).add(11).add(12).add(13).add(14).add(15).add(16)
                .add(17).add(18).add(19).add(20).add(21).add(22).add(23).add(24)
                .add(25).add(26).add(27).add(28).add(29).add(30).add(31).add(32)
                .reinterpretAsLongs()
                .lane(0);
    }

provides the following result:

Before:
Benchmark                     Mode  Cnt   Score   Error  Units
VectorReplicate.broadcastCon  avgt    5  16.417 ± 0.515  ns/op

After:
Benchmark                     Mode  Cnt   Score   Error  Units
VectorReplicate.broadcastCon  avgt    5  13.851 ± 0.154  ns/op

The constant table size decreases from 1024 bytes to 128 bytes, which is much more manageable. The throughput improvement mostly comes from the vector being rematerialized instead of being spilt on the stack. I have not been able to observe performance gain regarding bypass delay, which is expected as according to "Agner's optimisation manual on the micro architecture of Intel, AMD and VIA CPUs", Intel CPUs since Skylake seem to have only a few such delays.

Thank you very much.
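As a rough sanity check on the figures above, the arithmetic can be sketched as follows. The message reports only the totals (1024 and 128 bytes) and the two scores. The 256-bit vector width and the one-table-entry-per-distinct-constant layout are assumptions chosen because they reproduce those totals; other layouts could give the same numbers. The class and method names are hypothetical.

```java
// Hypothetical back-of-the-envelope check; not taken from the patch.
class BroadcastConstantTable {

    // Total constant-table bytes if every distinct constant takes one entry.
    static int tableBytes(int distinctConstants, int bytesPerEntry) {
        return distinctConstants * bytesPerEntry;
    }

    // Relative reduction in time per operation, in percent.
    static double percentReduction(double beforeNsPerOp, double afterNsPerOp) {
        return (beforeNsPerOp - afterNsPerOp) / beforeNsPerOp * 100.0;
    }

    public static void main(String[] args) {
        int distinctConstants = 32;       // the benchmark adds the values 1..32, each twice
        int vectorBytes = 32;             // assumption: 256-bit preferred species
        int scalarBytes = Integer.BYTES;  // one 4-byte int per constant

        System.out.println("whole vectors: " + tableBytes(distinctConstants, vectorBytes) + " bytes");
        System.out.println("scalars only:  " + tableBytes(distinctConstants, scalarBytes) + " bytes");
        System.out.printf("time per op reduced by about %.1f%%%n",
                percentReduction(16.417, 13.851));
    }
}
```

Under these assumptions the saving is simply the ratio of vector width to element width, so it would grow on hardware with wider vectors.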
------------- PR: https://git.openjdk.java.net/jdk/pull/7832 From dnsimon at openjdk.java.net Thu Mar 17 12:54:32 2022 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 17 Mar 2022 12:54:32 GMT Subject: Integrated: 8283056: show abstract machine code in hs-err for all VM crashes In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 20:48:06 GMT, Doug Simon wrote: > [JDK-8272586](https://bugs.openjdk.java.net/browse/JDK-8272586) added abstract assembly to hs-err for methods on the stack of the crashing thread. However, it only does this if the crash is due to an unhandled signal. It can also be useful to see assembly for crashes due to failing VM assertions or guarantees. This PR implements this improvement. This pull request has now been integrated. Changeset: 69e4e338 Author: Doug Simon URL: https://git.openjdk.java.net/jdk/commit/69e4e338b19c0ffd2f0881be1bbb19a5642bc4d4 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod 8283056: show abstract machine code in hs-err for all VM crashes Reviewed-by: thartmann, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7791 From aph at openjdk.java.net Thu Mar 17 13:52:39 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 17 Mar 2022 13:52:39 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v17] In-Reply-To: References: Message-ID: On Sun, 13 Mar 2022 06:36:15 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. >> - Test creation using new IR testing framework. 
>>
>> Following are the performance number of a JMH micro included with the patch
>>
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>>
>>
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02
>> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43
>>
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
> 8279508: Windows build failure fix.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4143:

> 4141: ldmxcsr(new_mxcsr);
> 4142: // Move raw bits corresponding to double value 0.5 into scratch register.
> 4143: mov64(scratch, 4602678819172646912L);

Suggestion:

    mov64(scratch, julong_cast(0.5));

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From david.lloyd at redhat.com Thu Mar 17 13:57:03 2022 From: david.lloyd at redhat.com (David Lloyd) Date: Thu, 17 Mar 2022 08:57:03 -0500 Subject: Review of JEP draft In-Reply-To: References: Message-ID:

The proposal seems to imply that the only alternative would be to pad all existing opcodes. But at the same time, it indicates that having the `ExtendedOpcodes` attribute be present - on a per-Code-attribute basis - would require not only the extra bytes for the attribute but that the code attribute itself must be padded.

What about alternatives such as reserving one (or more) bytecode(s) to mean "extended instruction"?
This would be similar to how `WIDE` modifies the next bytecode, and would immediately make another 256 bytecodes available without impacting the size of any other bytecode in the method, nor requiring any extra attributes. This idea could be extended as far as one is willing to add prefix bytecodes. For example one could reserve 16 aligned and consecutive bytecodes for this purpose, creating a 12 bit space for up to 4096 more bytecodes with a minimum of impact on space or on producers or consumers of bytecodes, at the cost of 16 "small" bytecodes. While there might (or might not) be good reasons to not pursue this approach, if you're talking about extending the bytecode space in a JEP, you should probably at least explain why such a scheme should not be considered or else list it under the "Alternatives" heading. On Thu, Mar 17, 2022 at 7:19 AM Julian Waters wrote: > Hi everyone, > > If you don't mind a little reading, can I get a review of the following JEP > draft at > > https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8283291?filter=allissues&orderby=created+DESC%2C+priority+DESC%2C+updated+DESC > ? > Apologies if this is not the proper way to submit a JEP, I'm a little new > to this. > > Have a great day! > > best regards, > Julian > > -- - DML • he/him From tanksherman27 at gmail.com Thu Mar 17 14:16:21 2022 From: tanksherman27 at gmail.com (Julian Waters) Date: Thu, 17 Mar 2022 22:16:21 +0800 Subject: Review of JEP draft In-Reply-To: References: Message-ID: The idea of adding opcodes specifically meaning the one after it should be extended had not occurred to me while writing the JEP, I'll update the proposal in a while to reflect this possibility. I actually think this might be a better idea than the one I came up with. best regards, Julian From duke at openjdk.java.net Thu Mar 17 15:34:19 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Thu, 17 Mar 2022 15:34:19 GMT Subject: RFR: 8283232: x86: Improve vector broadcast operations [v4] In-Reply-To: References: Message-ID: > Hi, > > This patch improves the generation of broadcasting a scalar in several ways: > > - Avoid potential data bypass delay which can be observed on some platforms by using the correct type of instruction if it does not require extra instructions. > - As it has been pointed out, dumping the whole vector into the constant table is costly in terms of code size, this patch minimises this overhead for vector replicate of constants. Also, options are available for constants to be generated with more alignment so that vector load can be made efficiently without crossing cache lines. > - Vector broadcasting should prefer rematerialising to spilling when register pressure is high. > > This patch also removes some redundant code paths and rename some incorrectly named instructions. > > Thank you very much.
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: fix comparison ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7832/files - new: https://git.openjdk.java.net/jdk/pull/7832/files/3bc7731a..3dbc7432 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7832.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7832/head:pull/7832 PR: https://git.openjdk.java.net/jdk/pull/7832 From duke at openjdk.java.net Thu Mar 17 22:39:06 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Thu, 17 Mar 2022 22:39:06 GMT Subject: RFR: 8283232: x86: Improve vector broadcast operations [v5] In-Reply-To: References: Message-ID: <6BMjbZE-S8duO5F6mWU0tpPo6g4OV2QBSNZ5IuxTaP8=.c7c76db6-e84a-48a8-9e82-d1c3b97a4bd6@github.com> > Hi, > > This patch improves the generation of broadcasting a scalar in several ways: > > - Avoid potential data bypass delay which can be observed on some platforms by using the correct type of instruction if it does not require extra instructions. > - As it has been pointed out, dumping the whole vector into the constant table is costly in terms of code size, this patch minimises this overhead for vector replicate of constants. Also, options are available for constants to be generated with more alignment so that vector load can be made efficiently without crossing cache lines. > - Vector broadcasting should prefer rematerialising to spilling when register pressure is high. > > This patch also removes some redundant code paths and rename some incorrectly named instructions. > > Thank you very much. 
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: rematerializing input count ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7832/files - new: https://git.openjdk.java.net/jdk/pull/7832/files/3dbc7432..bb494bc2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=03-04 Stats: 4 lines in 1 file changed: 1 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/7832.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7832/head:pull/7832 PR: https://git.openjdk.java.net/jdk/pull/7832 From duke at openjdk.java.net Thu Mar 17 23:08:13 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Thu, 17 Mar 2022 23:08:13 GMT Subject: RFR: 8283232: x86: Improve vector broadcast operations [v6] In-Reply-To: References: Message-ID: > Hi, > > This patch improves the generation of broadcasting a scalar in several ways: > > - Avoid potential data bypass delay which can be observed on some platforms by using the correct type of instruction if it does not require extra instructions. > - As it has been pointed out, dumping the whole vector into the constant table is costly in terms of code size, this patch minimises this overhead for vector replicate of constants. Also, options are available for constants to be generated with more alignment so that vector load can be made efficiently without crossing cache lines. > - Vector broadcasting should prefer rematerialising to spilling when register pressure is high. > > This patch also removes some redundant code paths and rename some incorrectly named instructions. > > Thank you very much. 
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: unsignness ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7832/files - new: https://git.openjdk.java.net/jdk/pull/7832/files/bb494bc2..2b1c1da4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=04-05 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7832.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7832/head:pull/7832 PR: https://git.openjdk.java.net/jdk/pull/7832 From david.holmes at oracle.com Thu Mar 17 23:29:15 2022 From: david.holmes at oracle.com (David Holmes) Date: Fri, 18 Mar 2022 09:29:15 +1000 Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6] In-Reply-To: <9Pfhr7V3j4Op4px61CEhpa4jVwueR1wQmjLaS8l8x2g=.6404217f-0556-4f9a-b81b-d8642bb73a13@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <2VVnQ4RiNCtAuWXQ_d-vgj-8uejqKTdAWXwxKJUNix4=.6d88041c-2332-452d-9e70-b9429940d1f0@github.com> <9Pfhr7V3j4Op4px61CEhpa4jVwueR1wQmjLaS8l8x2g=.6404217f-0556-4f9a-b81b-d8642bb73a13@github.com> Message-ID: <883a1dec-f805-0873-4e68-718cb4cf1f38@oracle.com> On 12/03/2022 2:37 am, Anton Kozlov wrote: > On Thu, 10 Mar 2022 18:04:50 GMT, Thomas Stuefe wrote: > >> blocking SIGSEGV and SIGBUS - or other synchronous error signals like SIGFPE - and then triggering said signal is UB. What happens is OS-dependent. I saw processes vanishing, or hang, or core. It makes sense, since what is the kernel supposed to do. It cannot deliver the signal, and deferring it would require returning to the faulting instruction, that would just re-fault. >> For some more details see e.g. https://bugs.openjdk.java.net/browse/JDK-8252533 > > This UB looks reasonable. 
My point is that a native thread would run fine with SIGSEGV blocked. But then JVM decides it can do SafeFetch, and things gets nasty. > >>> Is there a crash that is fixed by the change? I just spotted it is an enhancement, not a bug. Just trying to understand the problem. >> >> Yes, this issue is a breakout from https://bugs.openjdk.java.net/browse/JDK-8282306, where we'd like to use SafeFetch to make stack walking in AsyncGetCallTrace more robust. AGCT is called from the signal handler, and it may run in any number of situations (e.g. in foreign threads, or threads which are in the process of getting dismantled, etc). > > I mean, some way to verify the issue is fixed, e.g. a test that does not fail anymore. > > I see AsyncGetCallTrace to assume the JavaThread very soon, or do I look at the wrong place? https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/forte.cpp#L569 It is up to the agent setting things up for AGCT to only actually call it for JavaThreads. David ----- >> Another situation is error handling itself. When writing an hs-err file, we use SafeFetch to do carefully tiptoe around the possibly corrupt VM state. If the original crash happened in a foreign thread, we still want some of these reports to work (e.g. dumping register content or printing stacks). So SafeFetch should be as robust as possible. > > OK, thanks. I think we also handle recursive segfaults recover after interpretation of the corrupted VM state. Otherwise, implementing the printing functions would be too tedious and hard with SafeFetch alone. But I see it's used in printing register content, at least. 
> > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/7727 From duke at openjdk.java.net Fri Mar 18 00:29:07 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Fri, 18 Mar 2022 00:29:07 GMT Subject: RFR: 8283232: x86: Improve vector broadcast operations [v7] In-Reply-To: References: Message-ID: > Hi, > > This patch improves the generation of broadcasting a scalar in several ways: > > - Avoid potential data bypass delay which can be observed on some platforms by using the correct type of instruction if it does not require extra instructions. > - As it has been pointed out, dumping the whole vector into the constant table is costly in terms of code size, this patch minimises this overhead for vector replicate of constants. Also, options are available for constants to be generated with more alignment so that vector load can be made efficiently without crossing cache lines. > - Vector broadcasting should prefer rematerialising to spilling when register pressure is high. > > This patch also removes some redundant code paths and rename some incorrectly named instructions. > > Thank you very much. 
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: remove duplicate ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7832/files - new: https://git.openjdk.java.net/jdk/pull/7832/files/2b1c1da4..63d84bd5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7832&range=05-06 Stats: 49 lines in 2 files changed: 5 ins; 40 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7832.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7832/head:pull/7832 PR: https://git.openjdk.java.net/jdk/pull/7832 From david.holmes at oracle.com Fri Mar 18 00:30:34 2022 From: david.holmes at oracle.com (David Holmes) Date: Fri, 18 Mar 2022 10:30:34 +1000 Subject: Review of JEP draft In-Reply-To: References: Message-ID: Hi Julian, On 17/03/2022 10:19 pm, Julian Waters wrote: > Hi everyone, > > If you don't mind a little reading, can I get a review of the following JEP > draft at > https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8283291?filter=allissues&orderby=created+DESC%2C+priority+DESC%2C+updated+DESC? > Apologies if this is not the proper way to submit a JEP, I'm a little new > to this. The bar is set very, very high, for introducing new bytecodes and consequently running out of them has not been a problem in practice. The reason the bar has been set so high is because the impact of a new bytecode on the whole Java ecosystem is enormous. Numerous new features have considered the possibility of adding a new bytecode, but very few have actually done so, instead flexible mechanisms like invokeDynamic, were introduced, that could then be used to implement a range of other features. If we had almost no spare bytecodes left, and we regularly added new bytecodes, then this would be a problem that needs solving. But as it stands I don't see a real problem that needs solving here. YMMV. Cheers, David > Have a great day! 
> > best regards, > Julian From duke at openjdk.java.net Fri Mar 18 07:21:36 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Fri, 18 Mar 2022 07:21:36 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Fri, 11 Mar 2022 07:52:16 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Remove two unnecessary lines This is not the point: It comes down to API design. If we use SafeFetch in os::is_first_C_frame (and thereby in frame::link_or_null) and not just in ASGCT, then it depends on when the other methods can be called. These methods are e.g. used whenever an error happens and an hs_err file is generated. We cannot guarantee that a JavaThread is always present there. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From johannes.bechberger at sap.com Fri Mar 18 09:43:58 2022 From: johannes.bechberger at sap.com (Bechberger, Johannes) Date: Fri, 18 Mar 2022 09:43:58 +0000 Subject: Proposal of a new version of AsyncGetCallTrace Message-ID: Hi, I would like to propose to 1.
Replace duplicated stack walking code with a unified API
2. Create a new version of AsyncGetCallTrace, tentatively called "AsyncGetCallTrace2", with more information on more frames using the unified API

A demo (as well as this text) is available at https://github.com/parttimenerd/asgct2-demo if you want to see a prototype of this proposal in action.

Unify Stack Walking
================

There are currently multiple implementations of stack walking in JFR and for AsyncGetCallTrace. They each implement their own extension of vframeStream but with comparable features and checks for problematic frames.

My proposal is, therefore, to replace the stack walking code with a unified API that includes all error checking and vframeStream extensions in a single place. The proposed new class is called StackWalker and could be part of `jfr/recorder/stacktrace` [1]. This class also supports getting information on C frames so it can potentially be used for walking stacks in VMError (used to create hs_err files), further reducing the amount of different stack walking code.

AsyncGetCallTrace2
================

The AsyncGetCallTrace call has seen increasing use in recent years in profilers like async-profiler. But it is not really an API (not exported in any header) and the information on frames it returns is pretty limited (only the method and bci for Java frames) which makes implementing profilers and other tooling harder. Tools like async-profiler have to resort to complicated code to partially obtain the information that the JVM already has. Information that is currently hidden and impossible to obtain is

- whether a compiled frame is inlined (currently only obtainable for the topmost compiled frames)
  - although this can be obtained using JFR
- C frames that are not at the top of the stack
- compilation level (C1 or C2 compiled)

This information is helpful when profiling and tuning the VM for a given application and also for profiling code that uses JNI heavily.
Using the proposed StackWalker class, implementing a new API that returns more information on frames is possible as a thin wrapper over the StackWalker API [2]. This also improves the maintainability as the code used in this API is used in multiple places and is therefore also better tested than the previous implementation, see [1] for the implementation.

The following describes the proposed API:

```cpp
void AsyncGetCallTrace2(asgct2::CallTrace *trace, jint depth, void* ucontext);
```

The structure of `CallTrace` is the same as the original `ASGCT_CallTrace` with the same error codes encoded in <= 0 values of `num_frames`.

```cpp
typedef struct {
  JNIEnv *env_id;      // Env where trace was recorded
  jint num_frames;     // number of frames in this trace
  CallFrame *frames;   // frames
  void* frame_info;    // more information on frames
} CallTrace;
```

The only differences are that the `frames` array also contains information on C frames, and that there is a new field `frame_info`. The `frame_info` is currently null and can later be used for extended information on each frame, being an array with an element for each frame. But the type of the elements in this array is implementation specific. This is akin to the `compile_info` field in JVMTI's CompiledMethodLoad [3] and is intended for extending the information returned by the API later.
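As a rough illustration of how a consumer might read the proposed `CallTrace`, the sketch below treats `num_frames <= 0` as an error code and checks whether `frame_info` was filled in. The stand-in typedefs for `jint` and `JNIEnv`, the opaque `CallFrame`, and the helper names `usable_frame_count` and `has_extended_info` are assumptions made for this example; they are not part of the prototype.

```cpp
#include <cstdint>

// Stand-ins so the sketch compiles without jni.h
// (assumption: jint is a 32-bit signed integer, JNIEnv is an opaque type).
typedef int32_t jint;
struct JNIEnv_;
typedef JNIEnv_ JNIEnv;

// Opaque in this sketch; the concrete frame layout is a separate topic.
struct CallFrame;

// Layout as given in the proposal text.
typedef struct {
  JNIEnv*    env_id;      // Env where trace was recorded
  jint       num_frames;  // number of frames, or an error code if <= 0
  CallFrame* frames;      // frames
  void*      frame_info;  // extended per-frame information, currently null
} CallTrace;

// Hypothetical helper: number of frames that may be read, 0 on any error code.
inline jint usable_frame_count(const CallTrace& trace) {
  return trace.num_frames > 0 ? trace.num_frames : 0;
}

// Hypothetical helper: whether the implementation supplied extended info.
inline bool has_extended_info(const CallTrace& trace) {
  return trace.frame_info != nullptr;
}
```

A caller would check `usable_frame_count` before indexing into `frames`, and would only interpret `frame_info` after confirming which implementation produced it, since its element type is implementation specific.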
Prototype
------------

Currently `CallFrame` is implemented in the prototype [4] as

```cpp
typedef struct {
  void *machine_pc;     // program counter, for C and native frames (frames of native methods)
  uint8_t type;         // frame type (single byte)
  uint8_t comp_level;   // highest compilation level of a method related to a Java frame
  // information from original CallFrame
  jint bci;             // bci for Java frames
  jmethodID method_id;  // method ID for Java frames
} CallFrame;
```

The `FrameTypeId` is based on the frame type in JFRStackFrame:

```cpp
enum FrameTypeId {
  FRAME_INTERPRETED = 0,
  FRAME_JIT         = 1, // JIT compiled
  FRAME_INLINE      = 2, // inlined JITed methods
  FRAME_NATIVE      = 3, // native wrapper to call C methods from Java
  FRAME_CPP         = 4  // c/c++/... frames, stub frames have CompLevel_all
};
```

The `comp_level` states the compilation level of the method related to the frame with higher numbers representing "more" compilation. `0` is defined as interpreted. It is modeled after the `CompLevel` enum in `compiler/compilerDefinitions`:

```cpp
// Enumeration to distinguish tiers of compilation
enum CompLevel {
  // ...
  CompLevel_none              = 0, // Interpreter
  CompLevel_simple            = 1, // C1
  CompLevel_limited_profile   = 2, // C1, invocation & backedge counters
  CompLevel_full_profile      = 3, // C1, invocation & backedge counters + mdo
  CompLevel_full_optimization = 4  // C2 or JVMCI
};
```

The traces produced by this prototype are fairly large (each frame requires 24 instead of 16 bytes on 64-bit systems) and some data is duplicated. The reason for this is that it simplified the extension of async-profiler for the prototype, as it only extends the data structures of the original AsyncGetCallTrace API without changing the original fields.
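The 24-versus-16-byte figure can be checked with a small layout sketch. It assumes a typical 64-bit ABI (8-byte pointers with natural alignment) and uses stand-in typedefs for `jint` and `jmethodID` so it compiles without `jni.h`. The names `PrototypeCallFrame`, `OriginalCallFrame`, `MethodIdTag` and `trace_bytes` are made up for this example; the original frame is modeled as a `jint` plus a `jmethodID`, as in the existing AsyncGetCallTrace interface.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-ins for the JNI types (assumption: jint is 32-bit, jmethodID is a pointer).
typedef int32_t jint;
struct MethodIdTag;
typedef MethodIdTag* jmethodID;

// Frame of the existing AsyncGetCallTrace interface: an int plus a method ID.
struct OriginalCallFrame {
  jint      lineno;
  jmethodID method_id;
};

// Prototype frame, fields in the order listed in the text above.
struct PrototypeCallFrame {
  void*     machine_pc;
  uint8_t   type;
  uint8_t   comp_level;
  jint      bci;
  jmethodID method_id;
};

constexpr bool kIs64Bit = sizeof(void*) == 8;

// On a typical 64-bit ABI: 4 bytes + 4 padding + 8 bytes.
static_assert(!kIs64Bit || sizeof(OriginalCallFrame) == 16,
              "original frame is expected to take 16 bytes");
// 8 (pc) + 1 + 1 + 2 padding + 4 (bci) + 8 (method_id).
static_assert(!kIs64Bit || sizeof(PrototypeCallFrame) == 24,
              "prototype frame is expected to take 24 bytes");
static_assert(!kIs64Bit || offsetof(PrototypeCallFrame, bci) == 12,
              "bci is expected to follow two single-byte fields and two padding bytes");

// Hypothetical helper: buffer size needed for a trace of the given depth.
constexpr std::size_t trace_bytes(std::size_t frame_size, std::size_t depth) {
  return frame_size * depth;
}
```

Under these assumptions, a buffer for 100 frames grows from 1600 to 2400 bytes, which is the overhead the paragraph above refers to.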
Proposal
------------

But packing the information and reducing duplication is of course possible if we step away from the former constraint:

```cpp
enum FrameTypeId {
  FRAME_JAVA         = 1, // JIT compiled and interpreted
  FRAME_JAVA_INLINED = 2, // inlined JIT compiled
  FRAME_NATIVE       = 3, // native wrapper to call C methods from Java
  FRAME_STUB         = 4, // VM generated stubs
  FRAME_CPP          = 5  // C/C++/... frames
};

typedef struct {
  uint8_t type;        // frame type
  uint8_t comp_level;
  uint16_t bci;        // 0 < bci < 65536
  jmethodID method_id;
} JavaFrame;           // used for FRAME_JAVA and FRAME_JAVA_INLINED

typedef struct {
  FrameTypeId type;    // single byte type
  void *machine_pc;
} NonJavaFrame;        // used for FRAME_NATIVE, FRAME_STUB and FRAME_CPP

typedef union {
  FrameTypeId type;    // to distinguish between JavaFrame and NonJavaFrame
  JavaFrame java_frame;
  NonJavaFrame non_java_frame;
} CallFrame;
```

This uses the same amount of space per frame (16 bytes) as the original but encodes far more information.

Best regards
Johannes

[1] https://github.com/parttimenerd/jdk/blob/parttimenerd_asgct2/src/hotspot/share/jfr/recorder/stacktrace/stackWalker.hpp
[2] https://github.com/parttimenerd/jdk/blob/parttimenerd_asgct2/src/hotspot/share/prims/asgct2.cpp
[3] https://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#CompiledMethodLoad
[4] https://github.com/parttimenerd/jdk/blob/parttimenerd_asgct2/src/hotspot/share/prims/asgct2.hpp

From forax at univ-mlv.fr Fri Mar 18 10:39:22 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 18 Mar 2022 11:39:22 +0100 (CET)
Subject: Proposal of a new version of AsyncGetCallTrace
In-Reply-To:
References:
Message-ID: <168322961.19060228.1647599962863.JavaMail.zimbra@u-pem.fr>

Knowing whether there is a C stack frame in the middle of the stack while blocking on a synchronized is an important feature for a profiler when Loom lands.
Rémi
From simonis at openjdk.java.net Fri Mar 18 11:05:41 2022 From: simonis at openjdk.java.net (Volker Simonis) Date: Fri, 18 Mar 2022 11:05:41 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v6] In-Reply-To: References: Message-ID: <5swa6Sh7ZklCS1YoTwK3ElGLzEN8-pYW-kTX2bfGDLc=.ddc846c4-34dc-4e81-bd57-6d388ef6c6d6@github.com> On Wed, 16 Mar 2022 09:37:29 GMT, Boris Ulasevich wrote: >> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH. >> >> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. >> >> As a side effect, the performance of some tests is slightly improved: >> ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` >> >> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
The pull request contains one new commit since the last revision: > > rename, adding test Hi Boris, Thanks for doing this change. In general it looks good! I only have minor comments and questions which I've added inline. Also, can you please update the Summary of the JBS issue to match that of the PR? I think the Summary of the PR is more adequate because the change also contains some shared changes. And please update the description of the PR and replace "*It changes nothing for any platform besides AARCH*" with something like "*Currently only the aarch64 backend is adapted to make use of these changes*", because the segments are actually re-ordered for all platforms. Thank you and best regards, Volker src/hotspot/cpu/aarch64/aarch64.ad line 1282: > 1280: > 1281: static uint size_exception_handler() { > 1282: return MacroAssembler::far_codestub_branch_size(); Can you please also use this for `size_deopt_handler()` below? I.e. `NativeInstruction::instruction_size() + MacroAssembler::far_codestub_branch_size()`. Also, once you've done this I think you can strengthen the assertions in `HandlerImpl::emit_exception_handler()`/`HandlerImpl::emit_deopt_handler()` from "`<=`" to "`==`". src/hotspot/cpu/aarch64/icBuffer_aarch64.cpp line 56: > 54: __ ldr(rscratch2, l); > 55: int jump_code_size = __ far_jump(ExternalAddress(entry_point)); > 56: // IC stub code size is not expected to vary depending on target address. Does the new code still align `cached_value` on a `wordSize` boundary as this was ensured before by `align(wordsize)`? I think that's only true if `code_begin` is guaranteed to start at a `wordSize` boundary because `far_jump` is either one or three instructions (plus one `ldr` instruction). If yes, please add a comment explaining that. Otherwise explain why the alignment isn't necessary anymore.
src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 399: > 397: } > 398: // codecache size: 128M..240M > 399: return !CodeCache::is_non_nmethod(addr); Is it possible to further refine this to also catch calls from C1 to C1 and C2 to C2 which obviously wouldn't need a far call as well? src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 428: > 426: uint64_t offset; > 427: // We can use ADRP here because we know that the total size of > 428: // the code cache cannot exceed 2Gb. Not directly related to your change, but what's correct here: - the comment which says "code cache can't exceed 2gb" - the assertion above which asserts `ReservedCodeCacheSize < 4*G` Maybe you can fix this while you're at it? src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1071: > 1069: address trampoline_call(Address entry, CodeBuffer* cbuf = NULL); > 1070: > 1071: // Jumps that can reach anywhere in the code cache. Not sure why you've moved this comment here from the `far_call()`/`far_jump()` functions. Please move it back. If you want to add a comment to this function it could be something like `Check if generic branches in the code cache require a far jump` src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1075: > 1073: } > 1074: > 1075: // Jumps that can reach anywhere in the code cache. Same as before. Move the original comment down to the `far_call()`/`far_jump()` functions as before. src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1076: > 1074: } > 1075: > 1076: // Jumps that can reach a nonmethod stub This should read something like `Check if branches to the non-nmethod section require a far jump` src/hotspot/share/code/codeCache.cpp line 306: > 304: // Profiled nmethods > 305: // Non-nmethods > 306: // Non-profiled nmethods In the JBS issue and in the description of this pull request you say that you *move the non-nmethod segment between the two other code segments* (i.e.
low address -> profiled -> non-nmethod -> non-profiled) but here you also change the order of "profiled" and "non-profiled" (i.e. in your code "non-profiled" comes first). Is there any reason for doing this, instead of just moving the "non-nmethod" segment between the "profiled" and "non-profiled" segments as described in the JBS issue and in the PR description? ------------- Changes requested by simonis (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7517 From duke at openjdk.java.net Fri Mar 18 13:01:34 2022 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Fri, 18 Mar 2022 13:01:34 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v6] In-Reply-To: <5swa6Sh7ZklCS1YoTwK3ElGLzEN8-pYW-kTX2bfGDLc=.ddc846c4-34dc-4e81-bd57-6d388ef6c6d6@github.com> References: <5swa6Sh7ZklCS1YoTwK3ElGLzEN8-pYW-kTX2bfGDLc=.ddc846c4-34dc-4e81-bd57-6d388ef6c6d6@github.com> Message-ID: On Fri, 18 Mar 2022 10:23:12 GMT, Volker Simonis wrote: >> Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> rename, adding test > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 399: > >> 397: } >> 398: // codecache size: 128M..240M >> 399: return !CodeCache::is_non_nmethod(addr); > > Is it possible to further refine this to also catch calls from C1 to C1 and C2 to C2 which obviously wouldn't need a far call as well? I believe they should be our next steps to guarantee we don't generate redundant code for such cases. 
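The far-branch question discussed in this thread comes down to a range check on the branch offset. The following is only a sketch of that check, not HotSpot's implementation: the names `kDirectBranchRange` and `reachable_by_direct_branch` are hypothetical. It relies on the AArch64 `B`/`BL` encoding, which carries a signed 26-bit immediate scaled by 4 and therefore reaches from -128 MiB up to 128 MiB minus 4 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Reach of a single AArch64 B/BL instruction: signed 26-bit immediate * 4.
constexpr int64_t kDirectBranchRange = int64_t{1} << 27;  // 128 MiB

// Hypothetical helper: true if a branch located at `from` can reach `to`
// with one B/BL instruction, i.e. without a multi-instruction far branch.
inline bool reachable_by_direct_branch(uint64_t from, uint64_t to) {
  int64_t offset = static_cast<int64_t>(to - from);
  assert((offset & 3) == 0);  // AArch64 instructions are 4-byte aligned
  return offset >= -kDirectBranchRange && offset < kDirectBranchRange;
}
```

With the non-nmethod segment placed between the two nmethod segments, more (caller, stub) pairs pass this check for a given code cache size, which is where the reported code size reduction comes from.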
------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From tschatzl at openjdk.java.net Fri Mar 18 14:07:58 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 18 Mar 2022 14:07:58 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code/improve push pop stuff for x86 Message-ID: Hi all, can I have reviews for this change that adds an API to save/restore caller-saved registers for VM upcalls for C1/interpreter. Currently, for x86, this is done in a very ad-hoc (copy&pasty) way, which starts to fall apart. Additionally this fixes some problems with wrong stack alignment. There is some cleanup to do separately to remove that copy&paste code in another CR. At the moment the API (`push_call_clobbered_registers/pop_call_clobbered_registers`) is only used for g1. It's based on `RegSet` from AArch64. Testing: tier1-5, tier1 testing with x64 and x86, with some at this point obscure combinations (like x86 UseSSE=0/1). Thanks, Thomas ------------- Commit messages: - Single stack change - Reserve space on stack only for necessary parts of xmm registers - More refactoring - Store floats with usesse==1 too - Reverse pop - add reverseregiterator - minor refactoring - More fixes - build fixes - Clenaups - ... 
and 1 more: https://git.openjdk.java.net/jdk/compare/69e4e338...1be201b1 Changes: https://git.openjdk.java.net/jdk/pull/7867/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7867&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283327 Stats: 543 lines in 9 files changed: 396 ins; 126 del; 21 mod Patch: https://git.openjdk.java.net/jdk/pull/7867.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7867/head:pull/7867 PR: https://git.openjdk.java.net/jdk/pull/7867 From eosterlund at openjdk.java.net Fri Mar 18 15:56:29 2022 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 18 Mar 2022 15:56:29 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code/improve push pop stuff for x86 In-Reply-To: References: Message-ID: On Fri, 18 Mar 2022 14:00:34 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that adds an API to save/restore caller-saved registers for VM upcalls for C1/interpreter. > > Currently, for x86, this is done in a very ad-hoc (copy&pasty) way, which starts to fall apart. Additionally this fixes some problems with wrong stack alignment. > > There is some cleanup to do separately to remove that copy&paste code in another CR. At the moment the API (`push_call_clobbered_registers/pop_call_clobbered_registers`) is only used for g1. > > It's based on `RegSet` from AArch64. > > Testing: tier1-5, tier1 testing with x64 and x86, with some at this point obscure combinations (like x86 UseSSE=0/1). > > Thanks, > Thomas src/hotspot/cpu/x86/macroAssembler_x86.cpp line 3599: > 3597: #if defined(WINDOWS) && defined(_LP64) > 3598: XMMRegSet result = XMMRegSet::range(xmm0, xmm5); > 3599: if (FrameMap::get_num_caller_save_xmms() > 16) { Why is this part needed? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7867 From tschatzl at openjdk.java.net Fri Mar 18 17:34:28 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 18 Mar 2022 17:34:28 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code/improve push pop stuff for x86 In-Reply-To: References: Message-ID: <8wMqvvjXY2J0gSzXqmPDCAeWHw98dFTBGkIcAd7HWZE=.8958f42d-8590-4c46-a408-b3daead4872d@github.com> On Fri, 18 Mar 2022 15:52:52 GMT, Erik Österlund wrote: >> Hi all, >> >> can I have reviews for this change that adds an API to save/restore caller-saved registers for VM upcalls for C1/interpreter. >> >> Currently, for x86, this is done in a very ad-hoc (copy&pasty) way, which starts to fall apart. Additionally this fixes some problems with wrong stack alignment. >> >> There is some cleanup to do separately to remove that copy&paste code in another CR. At the moment the API (`push_call_clobbered_registers/pop_call_clobbered_registers`) is only used for g1. >> >> It's based on `RegSet` from AArch64. >> >> Testing: tier1-5, tier1 testing with x64 and x86, with some at this point obscure combinations (like x86 UseSSE=0/1). >> >> Thanks, >> Thomas > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 3599: > >> 3597: #if defined(WINDOWS) && defined(_LP64) >> 3598: XMMRegSet result = XMMRegSet::range(xmm0, xmm5); >> 3599: if (FrameMap::get_num_caller_save_xmms() > 16) { > > Why is this part needed? In the x64 Windows ABI xmm0-5 and xmm16-31 are volatile, i.e. caller save. That's why, unlike in the SysV ABI, xmm6-15 are not in the set, and hence not saved by the caller. The code says: if we have more than 16 xmm registers (AVX), then add xmm16 to 31 to the set of caller saved registers. If we added xmm16 to 31 unconditionally, the code generator would complain that xmm16 to 31 are not available with e.g. SSE2 only.
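The register selection explained above can be sketched as a plain bitmask computation. This is an illustration only: `Abi`, `xmm_range_mask` and `caller_saved_xmm_mask` are hypothetical names and do not correspond to HotSpot's `XMMRegSet` API. The `num_xmm` parameter stands in for the number of XMM registers available to the code generator and is assumed to be 16 or 32.

```cpp
#include <cassert>
#include <cstdint>

enum class Abi { SysV, Win64 };

// Bit i set means "xmm<i> is in the set". Covers xmm<first>..xmm<last>.
inline uint32_t xmm_range_mask(int first, int last) {
  assert(0 <= first && first <= last && last < 32);
  uint32_t mask = 0;
  for (int i = first; i <= last; i++) {
    mask |= uint32_t{1} << i;
  }
  return mask;
}

// Caller-saved (volatile) XMM registers for the given ABI.
inline uint32_t caller_saved_xmm_mask(Abi abi, int num_xmm) {
  assert(num_xmm == 16 || num_xmm == 32);  // assumption made by this sketch
  if (abi == Abi::Win64) {
    // xmm0-5 are volatile; xmm6-15 are callee-saved and stay out of the set.
    uint32_t mask = xmm_range_mask(0, 5);
    if (num_xmm > 16) {
      // xmm16-31 are volatile too, but only exist on capable hardware.
      mask |= xmm_range_mask(16, num_xmm - 1);
    }
    return mask;
  }
  // SysV: every available XMM register is caller-saved.
  return xmm_range_mask(0, num_xmm - 1);
}
```

Guarding the upper range on `num_xmm` mirrors the point made in the mail: adding xmm16 to 31 unconditionally would name registers that do not exist on SSE2-only hardware.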
Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/7867 From tschatzl at openjdk.java.net Fri Mar 18 17:38:27 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 18 Mar 2022 17:38:27 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code/improve push pop stuff for x86 In-Reply-To: <8wMqvvjXY2J0gSzXqmPDCAeWHw98dFTBGkIcAd7HWZE=.8958f42d-8590-4c46-a408-b3daead4872d@github.com> References: <8wMqvvjXY2J0gSzXqmPDCAeWHw98dFTBGkIcAd7HWZE=.8958f42d-8590-4c46-a408-b3daead4872d@github.com> Message-ID: On Fri, 18 Mar 2022 17:31:36 GMT, Thomas Schatzl wrote: >> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 3599: >> >>> 3597: #if defined(WINDOWS) && defined(_LP64) >>> 3598: XMMRegSet result = XMMRegSet::range(xmm0, xmm5); >>> 3599: if (FrameMap::get_num_caller_save_xmms() > 16) { >> >> Why is this part needed? > > In the x64 Windows ABI xmm0-5 and xmm16-31 are volatile, i.e. caller save. That's why, unlike in the SysV ABI, xmm6-15 are not in the set, and hence not saved by the caller. > > The code says: if we have more than 16 xmm registers (AVX), then add xmm16 to 31 to the set of caller saved registers. > If we added xmm16 to 31 unconditionally, the code generator would complain that xmm16 to 31 are not available with e.g. SSE2 only. > > Here's the relevant part of the spec: > >> The x64 ABI considers the registers RAX, RCX, RDX, R8, R9, R10, R11, and XMM0-XMM5 volatile. When present, the upper portions of YMM0-YMM15 and ZMM0-ZMM15 are also volatile. On AVX512VL, the ZMM, YMM, and XMM registers 16-31 are also volatile. Consider volatile registers destroyed on function calls unless otherwise safety-provable by analysis such as whole program optimization. 
> > > Thomas If you look below for the SysV ABI, in the '#else' part, the code unconditionally adds all xmm registers that are available: XMMRegSet::range(xmm0, as_XMMRegister(FrameMap::get_num_caller_save_xmms() - 1)); If you want we can do the same for Win64 if you think this is better. ------------- PR: https://git.openjdk.java.net/jdk/pull/7867 From tschatzl at openjdk.java.net Fri Mar 18 17:42:34 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 18 Mar 2022 17:42:34 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code/improve push pop stuff for x86 In-Reply-To: References: <8wMqvvjXY2J0gSzXqmPDCAeWHw98dFTBGkIcAd7HWZE=.8958f42d-8590-4c46-a408-b3daead4872d@github.com> Message-ID: <6L3she5soe794dYxzMulGCZTbBUYzerkbjZWYH1VkY8=.9cbdfa6a-e1a6-4b1f-9fcf-0da3fb9bd65d@github.com> On Fri, 18 Mar 2022 17:35:17 GMT, Thomas Schatzl wrote: >> In the x64 Windows ABI xmm0-5 and xmm16-31 are volatile, i.e. caller save. That's why, unlike in the SysV ABI, xmm6-15 are not in the set, and hence not saved by the caller. >> >> The code says: if we have more than 16 xmm registers (AVX), then add xmm16 to 31 to the set of caller saved registers. >> If we added xmm16 to 31 unconditionally, the code generator would complain that xmm16 to 31 are not available with e.g. SSE2 only. >> >> Here's the relevant part of the spec: >> >>> The x64 ABI considers the registers RAX, RCX, RDX, R8, R9, R10, R11, and XMM0-XMM5 volatile. When present, the upper portions of YMM0-YMM15 and ZMM0-ZMM15 are also volatile. On AVX512VL, the ZMM, YMM, and XMM registers 16-31 are also volatile. Consider volatile registers destroyed on function calls unless otherwise safety-provable by analysis such as whole program optimization. 
>> >> >> Thomas > > If you look below for the SysV ABI, in the '#else' part, the code unconditionally adds all xmm registers that are available: > > > XMMRegSet::range(xmm0, as_XMMRegister(FrameMap::get_num_caller_save_xmms() - 1)); > > If you want we can do the same for Win64 if you think this is better. Of course, if C1 never uses xmm >= 8, then I can limit the number of register saving. ------------- PR: https://git.openjdk.java.net/jdk/pull/7867 From eosterlund at openjdk.java.net Fri Mar 18 18:41:31 2022 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 18 Mar 2022 18:41:31 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code/improve push pop stuff for x86 In-Reply-To: References: Message-ID: <3QVAkJBk6t3nSzYL77_8JfYZm_uWoS-A8higAOCJKFA=.c5c19d83-6e0f-4f6a-8262-137995c74681@github.com> On Fri, 18 Mar 2022 14:00:34 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that adds an API to save/restore caller-saved registers for VM upcalls for C1/interpreter. > > Currently, for x86, this is done in a very ad-hoc (copy&pasty) way, which starts to fall apart. Additionally this fixes some problems with wrong stack alignment. > > There is some cleanup to do separately to remove that copy&paste code in another CR. At the moment the API (`push_call_clobbered_registers/pop_call_clobbered_registers`) is only used for g1. > > It's based on `RegSet` from AArch64. > > Testing: tier1-5, tier1 testing with x64 and x86, with some at this point obscure combinations (like x86 UseSSE=0/1). > > Thanks, > Thomas Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7867 From eosterlund at openjdk.java.net Fri Mar 18 18:41:32 2022 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 18 Mar 2022 18:41:32 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code/improve push pop stuff for x86 In-Reply-To: <6L3she5soe794dYxzMulGCZTbBUYzerkbjZWYH1VkY8=.9cbdfa6a-e1a6-4b1f-9fcf-0da3fb9bd65d@github.com> References: <8wMqvvjXY2J0gSzXqmPDCAeWHw98dFTBGkIcAd7HWZE=.8958f42d-8590-4c46-a408-b3daead4872d@github.com> <6L3she5soe794dYxzMulGCZTbBUYzerkbjZWYH1VkY8=.9cbdfa6a-e1a6-4b1f-9fcf-0da3fb9bd65d@github.com> Message-ID: On Fri, 18 Mar 2022 17:38:57 GMT, Thomas Schatzl wrote: >> If you look below for the SysV ABI, in the '#else' part, the code unconditionally adds all xmm registers that are available: >> >> >> XMMRegSet::range(xmm0, as_XMMRegister(FrameMap::get_num_caller_save_xmms() - 1)); >> >> If you want we can do the same for Win64 if you think this is better. > > Of course, if C1 never uses xmm >= 8, then I can limit the number of register saving. Pretty sure C1 will use more than that. ------------- PR: https://git.openjdk.java.net/jdk/pull/7867 From jbhateja at openjdk.java.net Fri Mar 18 20:19:08 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 18 Mar 2022 20:19:08 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v18] In-Reply-To: References: Message-ID: > Summary of changes: > - Intrinsify Math.round(float) and Math.round(double) APIs. > - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. > - Test creation using new IR testing framework. 
> > Following are the performance number of a JMH micro included with the patch > > Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) > > > Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio > -- | -- | -- | -- | -- | -- | -- | -- > FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 > FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 > FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: - 8279508: Using an explicit scratch register since rscratch1 is bound to r10 and its usage is transparent to compiler. - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 - 8279508: Windows build failure fix. - 8279508: Styling comments resolved. - 8279508: Creating separate test for round double under feature check. - 8279508: Reducing the invocation count and compile thresholds for RoundTests.java. - 8279508: Review comments resolution. - 8279508: Preventing domain switch-over penalty for Math.round(float) and constraining unrolling to prevent code bloating. - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 - 8279508: Removing +LogCompilation flag. - ... 
and 12 more: https://git.openjdk.java.net/jdk/compare/ff0b0927...c17440cf ------------- Changes: https://git.openjdk.java.net/jdk/pull/7094/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=17 Stats: 800 lines in 25 files changed: 707 ins; 30 del; 63 mod Patch: https://git.openjdk.java.net/jdk/pull/7094.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094 PR: https://git.openjdk.java.net/jdk/pull/7094 From jbhateja at openjdk.java.net Fri Mar 18 20:19:10 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 18 Mar 2022 20:19:10 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v17] In-Reply-To: <1J7RFTiEF7VfaEg4EF29Hwd9UUU0D1MM1xh6waG3ulY=.251d7fd9-0d1d-4288-9a55-6feca4b0ec6a@github.com> References: <1J7RFTiEF7VfaEg4EF29Hwd9UUU0D1MM1xh6waG3ulY=.251d7fd9-0d1d-4288-9a55-6feca4b0ec6a@github.com> Message-ID: <9OJ2oXsQoXjxikba14rEmZMby_rRgq5yzeiwMkk0AMk=.07f34037-6510-453a-bb63-1f5ab162a530@github.com> On Mon, 14 Mar 2022 10:35:58 GMT, Tobias Hartmann wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> 8279508: Windows build failure fix. 
> > `compiler/c2/cr6340864/TestFloatVect.java` and `TestDoubleVect.java` fail on Windows: > > > # A fatal error has been detected by the Java Runtime Environment: > # > # EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x000001971b940123, pid=56524, tid=57368 > # > # JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build 19-internal-2022-03-14-0834080.tobias.hartmann.jdk2) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 19-internal-2022-03-14-0834080.tobias.hartmann.jdk2, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64) > # Problematic frame: > # J 205 c2 compiler.c2.cr6340864.TestFloatVect.test_round([I[F)V (24 bytes) @ 0x000001971b940123 [0x000001971b93ffe0+0x0000000000000143] Hi @TobiHartmann , Can you kindly regress latest changes through your test infrastructure Hi @theRealAph , Your suggestions incorporated. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From dlong at openjdk.java.net Fri Mar 18 22:56:29 2022 From: dlong at openjdk.java.net (Dean Long) Date: Fri, 18 Mar 2022 22:56:29 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code In-Reply-To: References: Message-ID: On Fri, 18 Mar 2022 14:00:34 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that adds an API to save/restore caller-saved registers for VM upcalls for C1/interpreter. > > Currently, for x86, this is done in a very ad-hoc (copy&pasty) way, which starts to fall apart. Additionally this fixes some problems with wrong stack alignment. > > There is some cleanup to do separately to remove that copy&paste code in another CR. At the moment the API (`push_call_clobbered_registers/pop_call_clobbered_registers`) is only used for g1. > > It's based on `RegSet` from AArch64. > > Testing: tier1-5, tier1 testing with x64 and x86, with some at this point obscure combinations (like x86 UseSSE=0/1). 
> > Thanks, > Thomas src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 325: > 323: __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_field_post_entry), card_addr, r15_thread); > 324: #else > 325: __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_field_post_entry), card_addr, thread); It looks like this #else part will work for the _LP64 case, so we don't need the #ifdef anymore. ------------- PR: https://git.openjdk.java.net/jdk/pull/7867 From dlong at openjdk.java.net Fri Mar 18 23:17:30 2022 From: dlong at openjdk.java.net (Dean Long) Date: Fri, 18 Mar 2022 23:17:30 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code In-Reply-To: References: Message-ID: <9tYjxw6pHU2D24S8ahHy6c8KFhn4kEnLOkNUaHzmwtk=.81b4fcce-20a9-4e03-a241-38d2b436bbbe@github.com> On Fri, 18 Mar 2022 14:00:34 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that adds an API to save/restore caller-saved registers for VM upcalls for C1/interpreter. > > Currently, for x86, this is done in a very ad-hoc (copy&pasty) way, which starts to fall apart. Additionally this fixes some problems with wrong stack alignment. > > There is some cleanup to do separately to remove that copy&paste code in another CR. At the moment the API (`push_call_clobbered_registers/pop_call_clobbered_registers`) is only used for g1. > > It's based on `RegSet` from AArch64. > > Testing: tier1-5, tier1 testing with x64 and x86, with some at this point obscure combinations (like x86 UseSSE=0/1). > > Thanks, > Thomas The interpreter probably doesn't need to preserve more than one floating-point register. Do we care about optimizing for that case? ------------- Marked as reviewed by dlong (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7867 From dlong at openjdk.java.net Sat Mar 19 00:31:34 2022 From: dlong at openjdk.java.net (Dean Long) Date: Sat, 19 Mar 2022 00:31:34 GMT Subject: RFR: 8283298: Make CodeCacheSegmentSize a product flag In-Reply-To: References: Message-ID: On Thu, 17 Mar 2022 08:27:25 GMT, Jie Fu wrote: > Hi all, > > As discussed in https://github.com/openjdk/jdk/pull/7830, this patch makes `CodeCacheSegmentSize` a product flag. > It also fixes two bugs when testing the release VM with CodeEntryAlignment={512, 1024}. > Please review it. > > Thanks. > Best regards, > Jie Looks good to me. ------------- Marked as reviewed by dlong (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7851 From jiefu at openjdk.java.net Sat Mar 19 02:49:28 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Sat, 19 Mar 2022 02:49:28 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 06:29:22 GMT, Xiaohong Gong wrote: > The current vector `"NEG"` is implemented by subtracting the vector from zero in case the architecture does not support the negation instruction. And to fit the predicate feature for architectures that support it, the masked vector `"NEG" ` is implemented with pattern `"v.not(m).add(1, m)"`. They both can be optimized to a single negation instruction for ARM SVE. > The same holds for the non-masked "NEG" on NEON. Besides, implementing the masked "NEG" with subtraction for architectures that support neither negation instruction nor predicate feature can also save several instructions compared with the current pattern. > > To optimize the VectorAPI negation, this patch moves the implementation from the Java side to HotSpot. The compiler will generate different nodes according to the architecture: > - Generate the (predicated) negation node if architecture supports it, otherwise, generate "`zero.sub(v)`" pattern for non-masked operation.
> - Generate `"zero.sub(v, m)"` for masked operation if the architecture does not have predicate feature, otherwise generate the original pattern `"v.xor(-1, m).add(1, m)"`. > > So with this patch, the following transformations are applied: > > For non-masked negation with NEON: > > movi v16.4s, #0x0 > sub v17.4s, v16.4s, v17.4s ==> neg v17.4s, v17.4s > > and with SVE: > > mov z16.s, #0 > sub z18.s, z16.s, z17.s ==> neg z16.s, p7/m, z16.s > > For masked negation with NEON: > > movi v17.4s, #0x1 > mvn v19.16b, v18.16b > mov v20.16b, v16.16b ==> neg v18.4s, v17.4s > bsl v20.16b, v19.16b, v18.16b bsl v19.16b, v18.16b, v17.16b > add v19.4s, v20.4s, v17.4s > mov v18.16b, v16.16b > bsl v18.16b, v19.16b, v20.16b > > and with SVE: > > mov z16.s, #-1 > mov z17.s, #1 ==> neg z16.s, p0/m, z16.s > eor z18.s, p0/m, z18.s, z16.s > add z18.s, p0/m, z18.s, z17.s > > Here are the performance gains for benchmarks (see [1][2]) on ARM and x86 machines(note that the non-masked negation benchmarks do not have any improvement on X86 since no instructions are changed): > > NEON: > Benchmark Gain > Byte128Vector.NEG 1.029 > Byte128Vector.NEGMasked 1.757 > Short128Vector.NEG 1.041 > Short128Vector.NEGMasked 1.659 > Int128Vector.NEG 1.005 > Int128Vector.NEGMasked 1.513 > Long128Vector.NEG 1.003 > Long128Vector.NEGMasked 1.878 > > SVE with 512-bits: > Benchmark Gain > ByteMaxVector.NEG 1.10 > ByteMaxVector.NEGMasked 1.165 > ShortMaxVector.NEG 1.056 > ShortMaxVector.NEGMasked 1.195 > IntMaxVector.NEG 1.002 > IntMaxVector.NEGMasked 1.239 > LongMaxVector.NEG 1.031 > LongMaxVector.NEGMasked 1.191 > > X86 (non AVX-512): > Benchmark Gain > ByteMaxVector.NEGMasked 1.254 > ShortMaxVector.NEGMasked 1.359 > IntMaxVector.NEGMasked 1.431 > LongMaxVector.NEGMasked 1.989 > > [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1881 > [2] 
https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1896 src/hotspot/share/opto/vectorIntrinsics.cpp line 209: > 207: #ifndef PRODUCT > 208: if (C->print_intrinsics()) { > 209: tty->print_cr(" ** Rejected vector op (%s,%s,%d) because architecture does not support variable vector negate", "variable vector negate" seems a bit strange to me. How about removing "variable"? src/hotspot/share/opto/vectorIntrinsics.cpp line 291: > 289: if ((mask_use_type & VecMaskUsePred) != 0) { > 290: if (!Matcher::has_predicated_vectors()) { > 291: return false; If we return here, we would miss the intrinsic failing msg "Rejected vector mask predicate using ...", right? src/hotspot/share/opto/vectornode.cpp line 141: > 139: case T_BYTE: > 140: case T_SHORT: > 141: case T_INT: return Op_NegVI; Why not add `Op_NegVB` for `BYTE` and `Op_NegVS` for `SHORT`? Is there any performance drop for byte/short negation operation if both of them are handled as a NegVI vector? src/hotspot/share/opto/vectornode.cpp line 1635: > 1633: } > 1634: > 1635: Node* NegVINode::Ideal(PhaseGVN* phase, bool can_reshape) { Much duplication in `NegVINode::Ideal` and `NegVLNode::Ideal`. Is it possible to refactor the implementation? ------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From jiefu at openjdk.java.net Sat Mar 19 03:14:26 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Sat, 19 Mar 2022 03:14:26 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API In-Reply-To: References: <-E5E_NBci6gsGyOV5nWuTUNKLVnjiw2IiWjjgv2vFz0=.ebe7c447-ede9-4437-815c-a2004f9d6ce1@github.com> Message-ID: On Tue, 15 Mar 2022 02:47:20 GMT, Xiaohong Gong wrote: > Note that in terms of Java semantics, negation of floating point values needs to be implemented as subtraction from negative zero rather than positive zero: > > double negate(double arg) {return -0.0 - arg; } > > This is to handle signed zeros correctly. 
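The signed-zero remark quoted above can be made concrete with a small sketch. It is written in C++ rather than Java, on the assumption that `double` follows IEEE 754 and the code is compiled without fast-math style flags; the function names are hypothetical. Subtracting from positive zero maps `0.0` to `+0.0`, which is not a correct negation, while subtracting from negative zero yields `-0.0` as required.

```cpp
#include <cassert>
#include <cmath>

// Negation written as "zero minus x", with the two possible zeros.
inline double negate_from_positive_zero(double x) { return 0.0 - x; }
inline double negate_from_negative_zero(double x) { return -0.0 - x; }

// True only for the value -0.0.
inline bool is_negative_zero(double x) { return x == 0.0 && std::signbit(x); }

// Self-check showing where the two variants differ.
inline void check_signed_zero_negation() {
  // 0.0 - 0.0 is +0.0, so the sign of a zero input is lost.
  assert(!is_negative_zero(negate_from_positive_zero(0.0)));
  // -0.0 - 0.0 is -0.0, which is the correct negation of +0.0.
  assert(is_negative_zero(negate_from_negative_zero(0.0)));
  // For non-zero inputs both variants agree.
  assert(negate_from_positive_zero(3.5) == negate_from_negative_zero(3.5));
}
```

Any transformation that rewrites floating-point negation as `0.0 - x` therefore changes results for zero inputs, which is why the integer `zero.sub(v)` pattern cannot be reused unchanged for floating-point lanes.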
This seems easy to be broken by an opt enhancement. Just wondering do we have a jtreg test for this point? @jddarcy Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From tanksherman27 at gmail.com Sat Mar 19 09:16:35 2022 From: tanksherman27 at gmail.com (Julian Waters) Date: Sat, 19 Mar 2022 17:16:35 +0800 Subject: Review of JEP draft In-Reply-To: References: Message-ID: Hi David, I was under the impression that the reason the standard for introducing new opcodes is so stringent was partially due to the constraints imposed by the current single byte scheme. Is that not the case in practice? In any case, I'll keep this JEP shelved in case it may be of use in the future. best regards, Julian On Fri, Mar 18, 2022 at 8:30 AM David Holmes wrote: > Hi Julian, > > On 17/03/2022 10:19 pm, Julian Waters wrote: > > Hi everyone, > > > > If you don't mind a little reading, can I get a review of the following > JEP > > draft at > > > https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8283291?filter=allissues&orderby=created+DESC%2C+priority+DESC%2C+updated+DESC > ? > > Apologies if this is not the proper way to submit a JEP, I'm a little new > > to this. > > The bar is set very, very high, for introducing new bytecodes and > consequently running out of them has not been a problem in practice. The > reason the bar has been set so high is because the impact of a new > bytecode on the whole Java ecosystem is enormous. Numerous new features > have considered the possibility of adding a new bytecode, but very few > have actually done so, instead flexible mechanisms like invokeDynamic, > were introduced, that could then be used to implement a range of other > features. > > If we had almost no spare bytecodes left, and we regularly added new > bytecodes, then this would be a problem that needs solving. But as it > stands I don't see a real problem that needs solving here. > > YMMV. > > Cheers, > David > > > Have a great day! 
> > > > best regards, > > Julian > From aturbanov at openjdk.java.net Sun Mar 20 13:50:42 2022 From: aturbanov at openjdk.java.net (Andrey Turbanov) Date: Sun, 20 Mar 2022 13:50:42 GMT Subject: RFR: 8283426: Fix 'exeption' typo Message-ID: Fix repeated type `exeption` ------------- Commit messages: - [PATCH] Typo 'Exeption' instead of 'Exception' Changes: https://git.openjdk.java.net/jdk/pull/7879/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7879&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283426 Stats: 24 lines in 10 files changed: 0 ins; 2 del; 22 mod Patch: https://git.openjdk.java.net/jdk/pull/7879.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7879/head:pull/7879 PR: https://git.openjdk.java.net/jdk/pull/7879 From xuelei at openjdk.java.net Sun Mar 20 14:41:37 2022 From: xuelei at openjdk.java.net (Xue-Lei Andrew Fan) Date: Sun, 20 Mar 2022 14:41:37 GMT Subject: RFR: 8283426: Fix 'exeption' typo In-Reply-To: References: Message-ID: <_PAkHa5ReaDzrQiBnAEskX-GEe8viZdHkXfcUgMUBro=.f8aa01f8-13cf-4847-8f51-00e3b3e10516@github.com> On Sun, 20 Mar 2022 13:30:01 GMT, Andrey Turbanov wrote: > Fix repeated type `exeption` Looks good to me. Thanks! ------------- Marked as reviewed by xuelei (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7879 From iris at openjdk.java.net Sun Mar 20 14:57:22 2022 From: iris at openjdk.java.net (Iris Clark) Date: Sun, 20 Mar 2022 14:57:22 GMT Subject: RFR: 8283426: Fix 'exeption' typo In-Reply-To: References: Message-ID: On Sun, 20 Mar 2022 13:30:01 GMT, Andrey Turbanov wrote: > Fix repeated type `exeption` Marked as reviewed by iris (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/7879 From david.holmes at oracle.com Sun Mar 20 22:24:30 2022 From: david.holmes at oracle.com (David Holmes) Date: Mon, 21 Mar 2022 08:24:30 +1000 Subject: Review of JEP draft In-Reply-To: References: Message-ID: <8db5aae2-3e12-1516-c97f-02a7cda899cd@oracle.com> On 19/03/2022 7:16 pm, Julian Waters wrote: > Hi David, > > I was under the impression that the reason the standard for > introducing new opcodes is so stringent was partially due to the > constraints imposed by the current single byte scheme. Is that not the > case in practice? I think even without limited bytecodes the costs of adding new bytecodes are high enough for it to only be considered in very worthy cases. I was trying to find a good, old, article that talks about the pros and cons of growing the bytecode but alas I could not. But some presentations from John Rose do touch on some aspects of this: http://cr.openjdk.java.net/~jrose/pres/ Cheers, David > In any case, I'll keep this JEP shelved in case it may > be of use in the future. > > best regards, > Julian > > On Fri, Mar 18, 2022 at 8:30 AM David Holmes > wrote: > > Hi Julian, > > On 17/03/2022 10:19 pm, Julian Waters wrote: > > Hi everyone, > > > > If you don't mind a little reading, can I get a review of the > following JEP > > draft at > > > https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8283291?filter=allissues&orderby=created+DESC%2C+priority+DESC%2C+updated+DESC > ? > > Apologies if this is not the proper way to submit a JEP, I'm a > little new > > to this. > > The bar is set very, very high, for introducing new bytecodes and > consequently running out of them has not been a problem in practice. > The > reason the bar has been set so high is because the impact of a new > bytecode on the whole Java ecosystem is enormous.
Numerous new features > have considered the possibility of adding a new bytecode, but very few > have actually done so, instead flexible mechanisms like invokeDynamic, > were introduced, that could then be used to implement a range of other > features. > > If we had almost no spare bytecodes left, and we regularly added new > bytecodes, then this would be a problem that needs solving. But as it > stands I don't see a real problem that needs solving here. > > YMMV. > > Cheers, > David > > > Have a great day! > > > > best regards, > > Julian > From david.holmes at oracle.com Sun Mar 20 22:41:17 2022 From: david.holmes at oracle.com (David Holmes) Date: Mon, 21 Mar 2022 08:41:17 +1000 Subject: Proposal of a new version of AsyncGetCallTrace In-Reply-To: References: Message-ID: Hi Johannes, On 18/03/2022 7:43 pm, Bechberger, Johannes wrote: > Hi, > > I would like to propose to > > 1. Replace duplicated stack walking code with unified API > 2. Create a new version of AsyncGetCallTrace, tentatively called "AsyncGetCallTrace2", with more information on more frames using the unified API > > A demo (as well as this text) is available at https://github.com/parttimenerd/asgct2-demo > if you want to see a prototype of this proposal in action. > > Unify Stack Walking > ================ > > There are currently multiple implementations of stack walking in JFR and for AsyncGetCallTrace. > They each implement their own extension of vframeStream but with comparable features > and check for problematic frames. > > My proposal is, therefore, to replace the stack walking code with a unified API that > includes all error checking and vframeStream extensions in a single place. > The proposed new class is called StackWalker and could be part of > `jfr/recorder/stacktrace` [1]. So we already have the StackWalker API provided at the Java level and with the implementation in the VM in src/hotspot/share/prims/stackwalk.cpp. How does that fit in with what you propose?
Cheers, David ----- > This class also supports getting information on C frames so it can be potentially > used for walking stacks in VMError (used to create hs_err files), further > reducing the amount of different stack walking code. > > AsyncGetCallTrace2 > ================ > > The AsyncGetCallTrace call has seen increasing use in recent years > in profilers like async-profiler. > But it is not really an API (not exported in any header) and > the information on frames it returns is pretty limited > (only the method and bci for Java frames) which makes implementing > profilers and other tooling harder. Tools like async-profiler > have to resort to complicated code to partially obtain the information > that the JVM already has. > Information that is currently hidden and impossible to obtain is > > - whether a compiled frame is inlined (currently only obtainable for the topmost compiled frames) > - although this can be obtained using JFR > - C frames that are not at the top of the stack > - compilation level (C1 or C2 compiled) > > This information is helpful when profiling and tuning the VM for > a given application and also for profiling code that uses > JNI heavily. > > Using the proposed StackWalker class, implementing a new API > that returns more information on frames is possible > as a thin wrapper over the StackWalker API [2]. > This also improves the maintainability as the code used > in this API is used in multiple places and is therefore > also better tested than the previous implementation, see > [1] for the implementation. > > The following describes the proposed API: > > ```cpp > void AsyncGetCallTrace2(asgct2::CallTrace *trace, jint depth, void* ucontext); > ``` > > The structure of `CallTrace` is the same as the original > `ASGCT_CallTrace` with the same error codes encoded in <= 0 > values of `num_frames`. 
> > ```cpp > typedef struct { > JNIEnv *env_id; // Env where trace was recorded > jint num_frames; // number of frames in this trace > CallFrame *frames; // frames > void* frame_info; // more information on frames > } CallTrace; > ``` > > The only difference is that the `frames` array also contains > information on C frames and the field `frame_info`. > The `frame_info` is currently null and can later be used > for extended information on each frame, being an array with > an element for each frame. But the type of the > elements in this array is implementation specific. > This is akin to the `compile_info` field in JVMTI's CompiledMethodLoad > [3] and used for extending the information returned by the > API later. > > Prototype > ------------ > > Currently `CallFrame` is implemented in the prototype [4] as > > ```cpp > typedef struct { > void *machine_pc; // program counter, for C and native frames (frames of native methods) > uint8_t type; // frame type (single byte) > uint8_t comp_level; // highest compilation level of a method related to a Java frame > // information from original CallFrame > jint bci; // bci for Java frames > jmethodID method_id; // method ID for Java frames > } CallFrame; > ``` > > The `FrameTypeId` is based on the frame type in JFRStackFrame: > > ```cpp > enum FrameTypeId { > FRAME_INTERPRETED = 0, > FRAME_JIT = 1, // JIT compiled > FRAME_INLINE = 2, // inlined JITed methods > FRAME_NATIVE = 3, // native wrapper to call C methods from Java > FRAME_CPP = 4 // c/c++/... frames, stub frames have CompLevel_all > }; > ``` > > The `comp_level` states the compilation level of the method related to the frame > with higher numbers representing "more" compilation. `0` is defined as > interpreted. It is modeled after the `CompLevel` enum in `compiler/compilerDefinitions`: > > ```cpp > // Enumeration to distinguish tiers of compilation > enum CompLevel { > // ...
> CompLevel_none = 0, // Interpreter > CompLevel_simple = 1, // C1 > CompLevel_limited_profile = 2, // C1, invocation & backedge counters > CompLevel_full_profile = 3, // C1, invocation & backedge counters + mdo > CompLevel_full_optimization = 4 // C2 or JVMCI > }; > ``` > > The traces produced by this prototype are fairly large > (each frame requires 24 instead of 16 bytes on 64 bit systems) and some data is > duplicated. > The reason for this is that it simplified the extension of async-profiler > for the prototype, as it only extends the data structures of > the original AsyncGetCallTrace API without changing the original fields. > > Proposal > ------------ > > But packing the information and reducing duplication is of course possible > if we step away from the former constraint: > > ```cpp > enum FrameTypeId { > FRAME_JAVA = 1, // JIT compiled and interpreted > FRAME_JAVA_INLINED = 2, // inlined JIT compiled > FRAME_NATIVE = 3, // native wrapper to call C methods from Java > FRAME_STUB = 4, // VM generated stubs > FRAME_CPP = 5 // C/C++/... frames > }; > > typedef struct { > uint8_t type; // frame type > uint8_t comp_level; > uint16_t bci; // 0 < bci < 65536 > jmethodID method_id; > } JavaFrame; // used for FRAME_JAVA and FRAME_JAVA_INLINED > > typedef struct { > FrameTypeId type; // single byte type > void *machine_pc; > } NonJavaFrame; // used for FRAME_NATIVE, FRAME_STUB and FRAME_CPP > > typedef union { > FrameTypeId type; // to distinguish between JavaFrame and NonJavaFrame > JavaFrame java_frame; > NonJavaFrame non_java_frame; > } CallFrame; > ``` > > This uses the same amount of space per frame (16 bytes) as the original but encodes far more information.
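The size claims in the proposal above (24 bytes per frame for the prototype, 16 bytes for the packed union) can be checked with a small sketch. This is only an illustration under stated assumptions, not the code from the referenced branch: `MethodId` is a hypothetical stand-in for `jmethodID` so the block compiles without `jni.h`, `PrototypeCallFrame` and the `make_*`/`frame_type`/`is_java_frame` helpers are names invented here, and `FrameTypeId` is given a fixed `uint8_t` underlying type so that the tag really occupies a single byte (a plain `enum` is usually `int`-sized). Both structs use `FrameTypeId` as their first member so the tag can be read through either union member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for jmethodID (an opaque pointer in jni.h).
using MethodId = void*;

// Assumption: fixed one-byte underlying type so the tag is a single byte.
enum FrameTypeId : uint8_t {
  FRAME_JAVA         = 1,  // JIT compiled and interpreted
  FRAME_JAVA_INLINED = 2,  // inlined JIT compiled
  FRAME_NATIVE       = 3,  // native wrapper to call C methods from Java
  FRAME_STUB         = 4,  // VM generated stubs
  FRAME_CPP          = 5   // C/C++/... frames
};

// Layout of the prototype frame as described in the mail (bci is a jint).
struct PrototypeCallFrame {
  void*    machine_pc;
  uint8_t  type;
  uint8_t  comp_level;
  int32_t  bci;
  MethodId method_id;
};

struct JavaFrame {       // FRAME_JAVA and FRAME_JAVA_INLINED
  FrameTypeId type;
  uint8_t     comp_level;
  uint16_t    bci;
  MethodId    method_id;
};

struct NonJavaFrame {    // FRAME_NATIVE, FRAME_STUB and FRAME_CPP
  FrameTypeId type;
  void*       machine_pc;
};

union CallFrame {
  JavaFrame    java_frame;
  NonJavaFrame non_java_frame;
};

// The tag sits at offset 0 in both variants.
static_assert(offsetof(JavaFrame, type) == 0 && offsetof(NonJavaFrame, type) == 0,
              "frame type tag must be the first member");
// Size checks only apply where pointers are 8 bytes wide.
static_assert(sizeof(void*) != 8 || sizeof(PrototypeCallFrame) == 24,
              "prototype frame: 24 bytes on 64-bit");
static_assert(sizeof(void*) != 8 || sizeof(CallFrame) == 16,
              "packed frame: 16 bytes on 64-bit");

// Both structs start with the same FrameTypeId member, so the tag may be
// read through java_frame regardless of which variant was stored.
inline FrameTypeId frame_type(const CallFrame& f) { return f.java_frame.type; }

inline bool is_java_frame(const CallFrame& f) {
  const FrameTypeId t = frame_type(f);
  return t == FRAME_JAVA || t == FRAME_JAVA_INLINED;
}

inline CallFrame make_java_frame(FrameTypeId type, uint8_t comp_level,
                                 uint16_t bci, MethodId method) {
  CallFrame f;
  f.java_frame = JavaFrame{type, comp_level, bci, method};
  return f;
}

inline CallFrame make_non_java_frame(FrameTypeId type, void* pc) {
  CallFrame f;
  f.non_java_frame = NonJavaFrame{type, pc};
  return f;
}
```

A consumer would first call `is_java_frame(frame)` and then read either `frame.java_frame` (method, bci, compilation level) or `frame.non_java_frame.machine_pc`. Whether the real API would fix the enum's underlying type this way is an open design question, not something the proposal states.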
> > Best regards > Johannes > > [1] https://github.com/parttimenerd/jdk/blob/parttimenerd_asgct2/src/hotspot/share/jfr/recorder/stacktrace/stackWalker.hpp > > [2] https://github.com/parttimenerd/jdk/blob/parttimenerd_asgct2/src/hotspot/share/prims/asgct2.cpp > > [3] https://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#CompiledMethodLoad > > [4] https://github.com/parttimenerd/jdk/blob/parttimenerd_asgct2/src/hotspot/share/prims/asgct2.hpp > From david.holmes at oracle.com Sun Mar 20 22:48:00 2022 From: david.holmes at oracle.com (David Holmes) Date: Mon, 21 Mar 2022 08:48:00 +1000 Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On 18/03/2022 5:21 pm, Johannes Bechberger wrote: > On Fri, 11 Mar 2022 07:52:16 GMT, Johannes Bechberger wrote: > >>> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >>> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts. >> >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove two unnecessary lines > > This is not the point: It comes down to API design. If we use SafeFetch in os::is_first_C_frame (and thereby in frame::link_or_null) and not just in ASGCT, then it depends on when the other methods can be called. These methods are e.g.
used whenever an error happens and a hs_err file is generated. We cannot guarantee that a JavaThread is always present there. My comment was specifically in response to your statement: > I see AsyncGetCallTrace to assume the JavaThread very soon But AGCT is only intended to ever be called on JavaThreads. David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/7727 From dholmes at openjdk.java.net Mon Mar 21 00:56:30 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 21 Mar 2022 00:56:30 GMT Subject: RFR: 8283426: Fix 'exeption' typo In-Reply-To: References: Message-ID: On Sun, 20 Mar 2022 13:30:01 GMT, Andrey Turbanov wrote: > Fix repeated typo `exeption` Looks good. Thanks for cleaning this up. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7879 From xgong at openjdk.java.net Mon Mar 21 01:24:40 2022 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Mon, 21 Mar 2022 01:24:40 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API In-Reply-To: References: Message-ID: On Sat, 19 Mar 2022 02:34:55 GMT, Jie Fu wrote: >> The current vector `"NEG"` is implemented by subtracting the vector from zero in case the architecture does not support the negation instruction. And to fit the predicate feature for architectures that support it, the masked vector `"NEG"` is implemented with pattern `"v.not(m).add(1, m)"`. They both can be optimized to a single negation instruction for ARM SVE. >> The same holds for the non-masked "NEG" for NEON. Besides, implementing the masked "NEG" with subtraction for architectures that support neither negation instruction nor predicate feature can also save several instructions compared with the current pattern. >> >> To optimize the VectorAPI negation, this patch moves the implementation from Java side to hotspot.
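The two masked negation patterns named in the description above, `v.not(m).add(1, m)` and `zero.sub(v, m)`, compute the same lanes because of the two's-complement identity `-x == ~x + 1`. The following is a scalar sketch of that equivalence only, not HotSpot or Vector API code: `Lanes`, `Mask`, `kLanes` and both function names are invented for illustration, and it assumes that lanes with an unset mask bit keep the value of the input vector.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 4;          // hypothetical 128-bit int vector
using Lanes = std::array<int32_t, kLanes>;
using Mask  = std::array<bool, kLanes>;

// Emulates v.not(m).add(1, m): bitwise NOT then +1, only in set lanes.
inline Lanes neg_masked_not_add(const Lanes& v, const Mask& m) {
  Lanes r = v;  // unset lanes keep the input value (assumption)
  for (std::size_t i = 0; i < kLanes; ++i) {
    if (m[i]) {
      // Work on unsigned bits so that wrap-around is well defined.
      const uint32_t u = static_cast<uint32_t>(v[i]);
      r[i] = static_cast<int32_t>(~u + 1u);
    }
  }
  return r;
}

// Emulates zero.sub(v, m): 0 - v, only in set lanes.
inline Lanes neg_masked_zero_sub(const Lanes& v, const Mask& m) {
  Lanes r = v;
  for (std::size_t i = 0; i < kLanes; ++i) {
    if (m[i]) {
      const uint32_t u = static_cast<uint32_t>(v[i]);
      r[i] = static_cast<int32_t>(0u - u);
    }
  }
  return r;
}
```

Both helpers wrap on the minimum value (negating `INT32_MIN` yields `INT32_MIN`), which is what a lane-wise hardware `neg` does as well, so a single negation instruction can stand in for either pattern.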
The compiler will generate different nodes according to the architecture:
>> - Generate the (predicated) negation node if the architecture supports it; otherwise, generate the "`zero.sub(v)`" pattern for the non-masked operation.
>> - Generate `"zero.sub(v, m)"` for the masked operation if the architecture does not have the predicate feature; otherwise generate the original pattern `"v.xor(-1, m).add(1, m)"`.
>>
>> So with this patch, the following transformations are applied:
>>
>> For non-masked negation with NEON:
>>
>>     movi    v16.4s, #0x0
>>     sub     v17.4s, v16.4s, v17.4s   ==>   neg     v17.4s, v17.4s
>>
>> and with SVE:
>>
>>     mov     z16.s, #0
>>     sub     z18.s, z16.s, z17.s      ==>   neg     z16.s, p7/m, z16.s
>>
>> For masked negation with NEON:
>>
>>     movi    v17.4s, #0x1
>>     mvn     v19.16b, v18.16b
>>     mov     v20.16b, v16.16b              ==>   neg     v18.4s, v17.4s
>>     bsl     v20.16b, v19.16b, v18.16b           bsl     v19.16b, v18.16b, v17.16b
>>     add     v19.4s, v20.4s, v17.4s
>>     mov     v18.16b, v16.16b
>>     bsl     v18.16b, v19.16b, v20.16b
>>
>> and with SVE:
>>
>>     mov     z16.s, #-1
>>     mov     z17.s, #1                     ==>   neg     z16.s, p0/m, z16.s
>>     eor     z18.s, p0/m, z18.s, z16.s
>>     add     z18.s, p0/m, z18.s, z17.s
>>
>> Here are the performance gains for benchmarks (see [1][2]) on ARM and x86 machines (note that the non-masked negation benchmarks do not have any improvement on X86 since no instructions are changed):
>>
>> NEON:
>> Benchmark                  Gain
>> Byte128Vector.NEG          1.029
>> Byte128Vector.NEGMasked    1.757
>> Short128Vector.NEG         1.041
>> Short128Vector.NEGMasked   1.659
>> Int128Vector.NEG           1.005
>> Int128Vector.NEGMasked     1.513
>> Long128Vector.NEG          1.003
>> Long128Vector.NEGMasked    1.878
>>
>> SVE with 512-bits:
>> Benchmark                  Gain
>> ByteMaxVector.NEG          1.10
>> ByteMaxVector.NEGMasked    1.165
>> ShortMaxVector.NEG         1.056
>> ShortMaxVector.NEGMasked   1.195
>> IntMaxVector.NEG           1.002
>> IntMaxVector.NEGMasked     1.239
>> LongMaxVector.NEG          1.031
>> LongMaxVector.NEGMasked    1.191
>>
>> X86 (non AVX-512):
>> Benchmark                  Gain
>> ByteMaxVector.NEGMasked    1.254
>> ShortMaxVector.NEGMasked   1.359
>> IntMaxVector.NEGMasked     1.431
>> LongMaxVector.NEGMasked    1.989
>>
>> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1881
>> [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1896
>
> src/hotspot/share/opto/vectornode.cpp line 141:
>
>> 139:       case T_BYTE:
>> 140:       case T_SHORT:
>> 141:       case T_INT: return Op_NegVI;
>
> Why not add `Op_NegVB` for `BYTE` and `Op_NegVS` for `SHORT`?
> Is there any performance drop for byte/short negation operation if both of them are handled as a NegVI vector?

The compiler can get the real type info from `Op_NegVI`, which can also handle the `BYTE` and `SHORT` basic types. I just don't want to add more new IRs, which would also need more match rules in the ad files.

> Is there any performance drop for byte/short negation operation if both of them are handled as a NegVI vector?

From the benchmark results I showed in the commit message, I didn't see any performance drop for byte/short. Thanks!

> src/hotspot/share/opto/vectornode.cpp line 1635:
>
>> 1633: }
>> 1634:
>> 1635: Node* NegVINode::Ideal(PhaseGVN* phase, bool can_reshape) {
>
> Much duplication in `NegVINode::Ideal` and `NegVLNode::Ideal`.
> Is it possible to refactor the implementation?

Yeah, maybe we need a superclass for `NegVINode` and `NegVLNode`. Thanks!

-------------

PR: https://git.openjdk.java.net/jdk/pull/7782

From wetmore at openjdk.java.net Mon Mar 21 04:49:33 2022
From: wetmore at openjdk.java.net (Bradford Wetmore)
Date: Mon, 21 Mar 2022 04:49:33 GMT
Subject: RFR: 8283426: Fix 'exeption' typo
In-Reply-To:
References:
Message-ID:

On Sun, 20 Mar 2022 13:30:01 GMT, Andrey Turbanov wrote:

> Fix repeated typo `exeption`

Good grief! I wouldn't have expected it to be so widespread. Thanks for noticing and fixing.

-------------

Marked as reviewed by wetmore (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7879 From aivanov at openjdk.java.net Mon Mar 21 07:26:47 2022 From: aivanov at openjdk.java.net (Alexey Ivanov) Date: Mon, 21 Mar 2022 07:26:47 GMT Subject: RFR: 8283426: Fix 'exeption' typo In-Reply-To: References: Message-ID: On Sun, 20 Mar 2022 13:30:01 GMT, Andrey Turbanov wrote: > Fix repeated typo `exeption` Marked as reviewed by aivanov (Reviewer). src/java.desktop/macosx/classes/sun/lwawt/macosx/CPlatformEmbeddedFrame.java line 201: > 199: /* > 200: * The method could not be implemented due to CALayer restrictions. > 201: * The exception enforce clients not to use it. Suggestion: * The exception enforces clients not to use it. ------------- PR: https://git.openjdk.java.net/jdk/pull/7879 From fyang at openjdk.java.net Mon Mar 21 08:08:10 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Mon, 21 Mar 2022 08:08:10 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port Message-ID: This PR implements JEP 422: Linux/RISC-V Port [1]. The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. 
[1] https://openjdk.java.net/jeps/422 ------------- Commit messages: - 8276799: Implementation of JEP 422: Linux/RISC-V Port Changes: https://git.openjdk.java.net/jdk/pull/6294/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6294&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8276799 Stats: 59153 lines in 188 files changed: 59002 ins; 54 del; 97 mod Patch: https://git.openjdk.java.net/jdk/pull/6294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6294/head:pull/6294 PR: https://git.openjdk.java.net/jdk/pull/6294 From fyang at openjdk.java.net Mon Mar 21 08:08:11 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Mon, 21 Mar 2022 08:08:11 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port In-Reply-To: References: Message-ID: On Mon, 8 Nov 2021 11:17:47 GMT, Fei Yang wrote: > This PR implements JEP 422: Linux/RISC-V Port [1]. > The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. > > This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. > > [1] https://openjdk.java.net/jeps/422 Rebased to master keep alive ------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From tschatzl at openjdk.java.net Mon Mar 21 08:29:34 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 21 Mar 2022 08:29:34 GMT Subject: RFR: 8283186: Explicitly pass a third temp register to MacroAssembler::store_heap_oop In-Reply-To: References: Message-ID: On Tue, 15 Mar 2022 15:32:41 GMT, Erik Österlund wrote: >> Hi all, >> >> can I have reviews for this change that explicitly passes a third temp parameter to `MacroAssembler::store_heap_oop` so that `G1BarrierSetAssembler::oop_store_at` (and the equivalent Shenandoah code) does not need to invent some out of thin air? This makes the code much less surprising.
>> >> The interesting part of this change is probably the first hunk in `src/hotspot/cpu/x86/templateTable_x86.cpp`, the rest is just passing on that additional parameter. >> >> Testing: gha >> >> Thanks, >> Thomas > > Looks awesome. Thanks @fisk for your review ------------- PR: https://git.openjdk.java.net/jdk/pull/7820 From tschatzl at openjdk.java.net Mon Mar 21 08:29:34 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 21 Mar 2022 08:29:34 GMT Subject: Integrated: 8283186: Explicitly pass a third temp register to MacroAssembler::store_heap_oop In-Reply-To: References: Message-ID: On Tue, 15 Mar 2022 15:20:02 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that explicitly passes a third temp parameter to `MacroAssembler::store_heap_oop` so that `G1BarrierSetAssembler::oop_store_at` (and the equivalent Shenandoah code) does not need to invent some out of thin air? This makes the code much less surprising. > > The interesting part of this change is probably the first hunk in `src/hotspot/cpu/x86/templateTable_x86.cpp`, the rest is just passing on that additional parameter. > > Testing: gha > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: e709cb05 Author: Thomas Schatzl URL: https://git.openjdk.java.net/jdk/commit/e709cb05dcf67462f266c1f3dae30976b562676d Stats: 58 lines in 16 files changed: 2 ins; 2 del; 54 mod 8283186: Explicitly pass a third temp register to MacroAssembler::store_heap_oop Reviewed-by: eosterlund ------------- PR: https://git.openjdk.java.net/jdk/pull/7820 From tschatzl at openjdk.java.net Mon Mar 21 08:42:25 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 21 Mar 2022 08:42:25 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code [v2] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this change that adds an API to save/restore caller-saved registers for VM upcalls for C1/interpreter. > > Currently, for x86, this is done in a very ad-hoc (copy&pasty) way, which starts to fall apart. Additionally this fixes some problems with wrong stack alignment. > > There is some cleanup to do separately to remove that copy&paste code in another CR. At the moment the API (`push_call_clobbered_registers/pop_call_clobbered_registers`) is only used for g1. > > It's based on `RegSet` from AArch64. > > Testing: tier1-5, tier1 testing with x64 and x86, with some at this point obscure combinations (like x86 UseSSE=0/1). 
> > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: dean comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7867/files - new: https://git.openjdk.java.net/jdk/pull/7867/files/1be201b1..2d04da02 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7867&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7867&range=00-01 Stats: 4 lines in 1 file changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7867.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7867/head:pull/7867 PR: https://git.openjdk.java.net/jdk/pull/7867 From aturbanov at openjdk.java.net Mon Mar 21 09:02:17 2022 From: aturbanov at openjdk.java.net (Andrey Turbanov) Date: Mon, 21 Mar 2022 09:02:17 GMT Subject: RFR: 8283426: Fix 'exeption' typo [v2] In-Reply-To: References: Message-ID: <1aOenacSYTb23hOLJFJBdM_k6vXL8gl5qxTn0RrocoA=.a0d88289-1e12-45f9-88ec-27981f9a0f74@github.com> > Fix repeated typo `exeption` Andrey Turbanov has updated the pull request incrementally with one additional commit since the last revision: 8283426: Fix 'exeption' typo Apply suggestion Co-authored-by: Alexey Ivanov <70774172+aivanov-jdk at users.noreply.github.com> ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7879/files - new: https://git.openjdk.java.net/jdk/pull/7879/files/d93dde25..4c1e68ed Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7879&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7879&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7879.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7879/head:pull/7879 PR: https://git.openjdk.java.net/jdk/pull/7879 From tschatzl at openjdk.java.net Mon Mar 21 09:25:34 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 21 Mar 2022 09:25:34 GMT Subject: RFR: 8283327: Add methods to 
save/restore registers when calling into the VM from C1/interpreter barrier code [v2] In-Reply-To: <9tYjxw6pHU2D24S8ahHy6c8KFhn4kEnLOkNUaHzmwtk=.81b4fcce-20a9-4e03-a241-38d2b436bbbe@github.com> References: <9tYjxw6pHU2D24S8ahHy6c8KFhn4kEnLOkNUaHzmwtk=.81b4fcce-20a9-4e03-a241-38d2b436bbbe@github.com> Message-ID: On Fri, 18 Mar 2022 23:14:01 GMT, Dean Long wrote: > The interpreter probably doesn't need to preserve more than 1 floating pointer register. Do we care about optimizing for that case? I can look into this as an extra CR. Not sure if it will help a lot. ------------- PR: https://git.openjdk.java.net/jdk/pull/7867 From jiefu at openjdk.java.net Mon Mar 21 09:42:34 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 21 Mar 2022 09:42:34 GMT Subject: RFR: 8283298: Make CodeCacheSegmentSize a product flag In-Reply-To: References: Message-ID: <4h_01oggjDJ3GwuylxPOIutd42dCjUGM2Ln-lczkctQ=.d983ab80-ff3c-4b19-abb9-dd5058c54acb@github.com> On Sat, 19 Mar 2022 00:28:42 GMT, Dean Long wrote: > Looks good to me. Thanks @dean-long . May I get a second review for this change? Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7851 From mdoerr at openjdk.java.net Mon Mar 21 10:35:27 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 21 Mar 2022 10:35:27 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v14] In-Reply-To: References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> Message-ID: <3EpEVlUVew6MfQf1GV_f2mWWlpXpPCNiTps13EtR1hI=.efe7c511-8a67-495f-9cb5-18c4d51075e4@github.com> On Tue, 15 Mar 2022 07:54:23 GMT, Johannes Bechberger wrote: >> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. 
It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method >> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Add workaround comment Looks good except minor nits. src/hotspot/share/runtime/os.cpp line 1183: > 1181: // Looks like all platforms can use the same function to check if C > 1182: // stack is walkable beyond current frame. > 1183: // Returns false if this is the cas Rest of comment missing. test/hotspot/gtest/runtime/test_os.cpp line 871: > 869: > 870: TEST_VM(os, is_first_C_frame) { > 871: #ifndef _WIN32 Spaces before `#` should ideally be avoided: https://stackoverflow.com/questions/4721978/should-preprocessor-instructions-be-on-the-beginning-of-a-line (I guess not really a problem for the compilers we use.) ------------- Changes requested by mdoerr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7591 From tschatzl at openjdk.java.net Mon Mar 21 10:46:35 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 21 Mar 2022 10:46:35 GMT Subject: Integrated: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code In-Reply-To: References: Message-ID: On Fri, 18 Mar 2022 14:00:34 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that adds an API to save/restore caller-saved registers for VM upcalls for C1/interpreter. > > Currently, for x86, this is done in a very ad-hoc (copy&pasty) way, which starts to fall apart. Additionally this fixes some problems with wrong stack alignment. > > There is some cleanup to do separately to remove that copy&paste code in another CR. At the moment the API (`push_call_clobbered_registers/pop_call_clobbered_registers`) is only used for g1. 
> > It's based on `RegSet` from AArch64. > > Testing: tier1-5, tier1 testing with x64 and x86, with some at this point obscure combinations (like x86 UseSSE=0/1). > > Thanks, > Thomas This pull request has now been integrated. Changeset: eb4849e5 Author: Thomas Schatzl URL: https://git.openjdk.java.net/jdk/commit/eb4849e5615dd307a5abc435a0204a6d26610fcb Stats: 546 lines in 9 files changed: 395 ins; 129 del; 22 mod 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code Reviewed-by: eosterlund, dlong ------------- PR: https://git.openjdk.java.net/jdk/pull/7867 From tschatzl at openjdk.java.net Mon Mar 21 10:46:34 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 21 Mar 2022 10:46:34 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code [v2] In-Reply-To: <9tYjxw6pHU2D24S8ahHy6c8KFhn4kEnLOkNUaHzmwtk=.81b4fcce-20a9-4e03-a241-38d2b436bbbe@github.com> References: <9tYjxw6pHU2D24S8ahHy6c8KFhn4kEnLOkNUaHzmwtk=.81b4fcce-20a9-4e03-a241-38d2b436bbbe@github.com> Message-ID: On Fri, 18 Mar 2022 23:14:01 GMT, Dean Long wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> dean comments > > The interpreter probably doesn't need to preserve more than 1 floating pointer register. Do we care about optimizing for that case? Thanks @dean-long @fisk for your reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/7867 From johannes.bechberger at sap.com Mon Mar 21 11:42:10 2022 From: johannes.bechberger at sap.com (Bechberger, Johannes) Date: Mon, 21 Mar 2022 11:42:10 +0000 Subject: Proposal of a new version of AsyncGetCallTrace In-Reply-To: References: Message-ID: Hi David, the src/hotspot/share/prims/stackwalk.cpp has been made for another purpose: It is used in java.lang.StackWalker to obtain the Java frames and allocates memory on the heap. 
It is not used in the places where my proposed StackWalker could be used: in profiling and error stack traces (hs_err file), where memory allocation might cause problems. The class in prims also lacks the ability to obtain the native frames. Best regards Johannes From: David Holmes Date: Sunday, 20. March 2022 at 23:41 To: Bechberger, Johannes , hotspot-dev at openjdk.java.net , hotspot-jfr-dev at openjdk.java.net , serviceability-dev at openjdk.java.net Subject: Re: Proposal of a new version of AsyncGetCallTrace Hi Johannes, On 18/03/2022 7:43 pm, Bechberger, Johannes wrote: > Hi, > > I would like to propose to > > 1. Replace duplicated stack walking code with unified API > 2. Create a new version of AsyncGetCallTrace, tentatively called "AsyncGetCallTrace2", with more information on more frames using the unified API > > A demo (as well as this text) is available at https://github.com/parttimenerd/asgct2-demo > if you want to see a prototype of this proposal in action. > > Unify Stack Walking > ================ > > There are currently multiple implementations of stack walking in JFR and for AsyncGetCallTrace. > They each implement their own extension of vframeStream but with comparable features > and check for problematic frames. > > My proposal is, therefore, to replace the stack walking code with a unified API that > includes all error checking and vframeStream extensions in a single place. > The proposed new class is called StackWalker and could be part of > `jfr/recorder/stacktrace` [1].
So we already have the StackWalker API provided at the Java level and with the implementation in the VM in src/hotspot/share/prims/stackwalk.cpp. How does that fit in with what you propose? Cheers, David ----- > This class also supports getting information on C frames so it can be potentially > used for walking stacks in VMError (used to create hs_err files), further > reducing the amount of different stack walking code. > > AsyncGetCallTrace2 > ================ > > The AsyncGetCallTrace call has seen increasing use in recent years > in profilers like async-profiler. > But it is not really an API (not exported in any header) and > the information on frames it returns is pretty limited > (only the method and bci for Java frames) which makes implementing > profilers and other tooling harder. Tools like async-profiler > have to resort to complicated code to partially obtain the information > that the JVM already has. > Information that is currently hidden and impossible to obtain is > > - whether a compiled frame is inlined (currently only obtainable for the topmost compiled frames) > - although this can be obtained using JFR > - C frames that are not at the top of the stack > - compilation level (C1 or C2 compiled) > > This information is helpful when profiling and tuning the VM for > a given application and also for profiling code that uses > JNI heavily. > > Using the proposed StackWalker class, implementing a new API > that returns more information on frames is possible > as a thin wrapper over the StackWalker API [2]. > This also improves the maintainability as the code used > in this API is used in multiple places and is therefore > also better tested than the previous implementation, see > [1] for the implementation. 
> > The following describes the proposed API: > > ```cpp > void AsyncGetCallTrace2(asgct2::CallTrace *trace, jint depth, void* ucontext); > ``` > > The structure of `CallTrace` is the same as the original > `ASGCT_CallTrace` with the same error codes encoded in <= 0 > values of `num_frames`. > > ```cpp > typedef struct { > JNIEnv *env_id; // Env where trace was recorded > jint num_frames; // number of frames in this trace > CallFrame *frames; // frames > void* frame_info; // more information on frames > } CallTrace; > ``` > > The only difference is that the `frames` array also contains > information on C frames and the field `frame_info`. > The `frame_info` is currently null and can later be used > for extended information on each frame, being an array with > an element for each frame. But the type of the > elements in this array is implementation specific. > This is akin to the `compile_info` field in JVMTI's CompiledMethodLoad > [3] and is used for extending the information returned by the > API later. > > Prototype > ------------ > > Currently `CallFrame` is implemented in the prototype [4] as > > ```cpp > typedef struct { > void *machine_pc; // program counter, for C and native frames (frames of native methods) > uint8_t type; // frame type (single byte) > uint8_t comp_level; // highest compilation level of a method related to a Java frame > // information from original CallFrame > jint bci; // bci for Java frames > jmethodID method_id; // method ID for Java frames > } CallFrame; > ``` > > The `FrameTypeId` is based on the frame type in JFRStackFrame: > > ```cpp > enum FrameTypeId { > FRAME_INTERPRETED = 0, > FRAME_JIT = 1, // JIT compiled > FRAME_INLINE = 2, // inlined JITed methods > FRAME_NATIVE = 3, // native wrapper to call C methods from Java > FRAME_CPP = 4 // c/c++/... frames, stub frames have CompLevel_all > }; > ``` > > The `comp_level` states the compilation level of the method related to the frame > with higher numbers representing "more" compilation.
`0` is defined as > interpreted. It is modeled after the `CompLevel` enum in `compiler/compilerDefinitions`: > > ```cpp > // Enumeration to distinguish tiers of compilation > enum CompLevel { > // ... > CompLevel_none = 0, // Interpreter > CompLevel_simple = 1, // C1 > CompLevel_limited_profile = 2, // C1, invocation & backedge counters > CompLevel_full_profile = 3, // C1, invocation & backedge counters + mdo > CompLevel_full_optimization = 4 // C2 or JVMCI > }; > ``` > > The traces produced by this prototype are fairly large > (each frame requires 24 instead of 16 bytes on 64-bit systems) and some data is > duplicated. > The reason for this is that it simplified the extension of async-profiler > for the prototype, as it only extends the data structures of > the original AsyncGetCallTrace API without changing the original fields. > > Proposal > ------------ > > But packing the information and reducing duplication is of course possible > if we step away from the former constraint: > > ```cpp > enum FrameTypeId { > FRAME_JAVA = 1, // JIT compiled and interpreted > FRAME_JAVA_INLINED = 2, // inlined JIT compiled > FRAME_NATIVE = 3, // native wrapper to call C methods from Java > FRAME_STUB = 4, // VM generated stubs > FRAME_CPP = 5 // C/C++/... frames > }; > > typedef struct { > uint8_t type; // frame type > uint8_t comp_level; > uint16_t bci; // 0 < bci < 65536 > jmethodID method_id; > } JavaFrame; // used for FRAME_JAVA and FRAME_JAVA_INLINED > > typedef struct { > FrameTypeId type; // single byte type > void *machine_pc; > } NonJavaFrame; // used for FRAME_NATIVE, FRAME_STUB and FRAME_CPP > > typedef union { > FrameTypeId type; // to distinguish between JavaFrame and NonJavaFrame > JavaFrame java_frame; > NonJavaFrame non_java_frame; > } CallFrame; > ``` > > This uses the same amount of space per frame (16 bytes) as the original but encodes far more information.
> > Best regards > Johannes > > [1] https://github.com/parttimenerd/jdk/blob/parttimenerd_asgct2/src/hotspot/share/jfr/recorder/stacktrace/stackWalker.hpp > > [2] https://github.com/parttimenerd/jdk/blob/parttimenerd_asgct2/src/hotspot/share/prims/asgct2.cpp > > [3] https://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#CompiledMethodLoad > > [4] https://github.com/parttimenerd/jdk/blob/parttimenerd_asgct2/src/hotspot/share/prims/asgct2.hpp >
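The packed `CallFrame` layout from the proposal above can be tried out in isolation. The following is a minimal sketch, not the prototype's code: `jmethodID` is a stand-in so that `<jni.h>` is not needed, the enum is given a one-byte underlying type (an assumption, so that the tag really occupies a single byte in both structs), and the helpers `frame_type`, `is_java_frame`, `make_java_frame` and `make_native_frame` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the JNI type so the sketch compiles without <jni.h> (assumption).
struct _jmethodID;
typedef _jmethodID* jmethodID;

// Assumption: fixed one-byte underlying type, so the tag is a single byte.
enum FrameTypeId : uint8_t {
  FRAME_JAVA         = 1,  // JIT compiled and interpreted
  FRAME_JAVA_INLINED = 2,  // inlined JIT compiled
  FRAME_NATIVE       = 3,  // native wrapper to call C methods from Java
  FRAME_STUB         = 4,  // VM generated stubs
  FRAME_CPP          = 5   // C/C++/... frames
};

struct JavaFrame {          // used for FRAME_JAVA and FRAME_JAVA_INLINED
  FrameTypeId type;
  uint8_t     comp_level;
  uint16_t    bci;
  jmethodID   method_id;
};

struct NonJavaFrame {       // used for FRAME_NATIVE, FRAME_STUB and FRAME_CPP
  FrameTypeId type;
  void*       machine_pc;
};

union CallFrame {
  JavaFrame    java_frame;
  NonJavaFrame non_java_frame;
};

// Both structs begin with the same FrameTypeId member, so the tag may be read
// through either union member (common initial sequence of standard-layout structs).
inline FrameTypeId frame_type(const CallFrame& f) { return f.java_frame.type; }

inline bool is_java_frame(const CallFrame& f) {
  FrameTypeId t = frame_type(f);
  return t == FRAME_JAVA || t == FRAME_JAVA_INLINED;
}

// Hypothetical constructors, for illustration only.
inline CallFrame make_java_frame(jmethodID m, uint16_t bci, uint8_t comp_level, bool inlined) {
  CallFrame f{};
  f.java_frame = JavaFrame{inlined ? FRAME_JAVA_INLINED : FRAME_JAVA, comp_level, bci, m};
  return f;
}

inline CallFrame make_native_frame(FrameTypeId type, void* pc) {
  CallFrame f{};
  f.non_java_frame = NonJavaFrame{type, pc};
  return f;
}

// The tag is one byte and a frame is two pointers wide (16 bytes on 64-bit).
static_assert(sizeof(FrameTypeId) == 1, "tag must be a single byte");
static_assert(sizeof(CallFrame) == 2 * sizeof(void*), "frame must be two pointers wide");
```

A consumer would check `is_java_frame(frame)` first and then read either `frame.java_frame` or `frame.non_java_frame`; the `static_assert`s document the 16-byte claim for 64-bit targets.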
From duke at openjdk.java.net Mon Mar 21 12:13:14 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 21 Mar 2022 12:13:14 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v15] In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> Message-ID: > This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method > and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes. Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Fix minor style issues ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7591/files - new: https://git.openjdk.java.net/jdk/pull/7591/files/7e8aff34..5cb56687 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=14 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=13-14 Stats: 4 lines in 2 files changed: 1 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/7591.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591 PR: https://git.openjdk.java.net/jdk/pull/7591 From akozlov at openjdk.java.net Mon Mar 21 12:29:47 2022 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 21 Mar 2022 12:29:47 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> Message-ID: On Sun, 20 Mar 2022 22:49:49 GMT, David Holmes wrote: > My comment was specifically in response to 
your statement: > > I see AsyncGetCallTrace to assume the JavaThread very soon > > But AGCT is only intended to ever be called on JavaThreads. Sorry, it was my question. It looked that way to me as well (and that AGCT will return shortly if called on a non-Java thread; AFAICS SafeFetch is not involved), and I wanted to confirm. The AGCT on a non-Java thread was declared to be one of the two major reasons for this patch. I would support this patch to move W^X management out from Thread to OS-specific code, after the problem with the assert "is initialized" is fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From stuefe at openjdk.java.net Mon Mar 21 13:21:40 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 21 Mar 2022 13:21:40 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Fri, 11 Mar 2022 07:52:16 GMT, Johannes Bechberger wrote: >> The WXMode for the current thread (on MacOS aarch64) is currently stored in the thread class which is unnecessary as the WXMode is bound to the current OS thread, not the current instance of the thread class. >> This pull request moves the storage of the current WXMode into a thread local global variable in `os` and changes all related code. SafeFetch depended on the existence of a thread object only because of the WXMode. This pull request therefore removes the dependency, making SafeFetch usable in more contexts.
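The idea described in the quoted PR text, keeping the W^X mode per OS thread rather than in a `Thread` object, can be sketched roughly as follows. This is not the code from the pull request: `platform_set_wx` is a hypothetical no-op standing in for the real platform call, `get_wx_mode`, `set_wx_mode` and `ScopedWXMode` are illustrative names, and the initial mode of `WXExec` is an assumption.

```cpp
#include <cassert>

enum WXMode { WXWrite, WXExec };

// Hypothetical stand-in for the platform call that flips the JIT memory
// protection of the calling OS thread; a no-op in this sketch.
inline void platform_set_wx(WXMode) {}

// The mode lives in thread-local storage, so no Thread object is required.
// Assumption: every thread starts in exec mode.
thread_local WXMode current_wx_mode = WXExec;

inline WXMode get_wx_mode() { return current_wx_mode; }

// Switches the calling thread's mode and returns the previous one.
inline WXMode set_wx_mode(WXMode new_mode) {
  WXMode old_mode = current_wx_mode;
  if (old_mode != new_mode) {
    platform_set_wx(new_mode);
    current_wx_mode = new_mode;
  }
  return old_mode;
}

// RAII helper: enters a mode for the current scope and restores the old one.
class ScopedWXMode {
  WXMode _old;
 public:
  explicit ScopedWXMode(WXMode mode) : _old(set_wx_mode(mode)) {}
  ~ScopedWXMode() { set_wx_mode(_old); }
  ScopedWXMode(const ScopedWXMode&) = delete;
  ScopedWXMode& operator=(const ScopedWXMode&) = delete;
};
```

Because nothing here touches a VM thread structure, a helper like this could in principle be used from a thread that was never attached to the VM, which is the property the discussion is about.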
> > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Remove two unnecessary lines I'm currently implementing Andrews proposal for a static safefetch (https://github.com/openjdk/jdk/pull/7865, still in draft, but almost done). That will be more generic solution since we don't have to deal with thread wx state at all. That's why we closed this PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From dholmes at openjdk.java.net Mon Mar 21 13:21:40 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 21 Mar 2022 13:21:40 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Mon, 21 Mar 2022 12:44:58 GMT, Thomas Stuefe wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove two unnecessary lines > > I'm currently implementing Andrews proposal for a static safefetch (https://github.com/openjdk/jdk/pull/7865, still in draft, but almost done). That will be more generic solution since we don't have to deal with thread wx state at all. That's why we closed this PR. The conversation here is some what hard to follow. I do see that "foreign threads" was mentioned by @tstuefe in the context of AGCT but I have to assume he misspoke there (assuming a foreign thread is one not attached to the VM) as AGCT only works for attached JavaThreads. The signal handler that will call AGCT has to be prepared to find any kind of thread in any state, but AGCT should only be called on the right kinds of thread in the right state. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From stuefe at openjdk.java.net Mon Mar 21 13:21:40 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 21 Mar 2022 13:21:40 GMT Subject: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v7] In-Reply-To: References: <_veS70i6iCqTIbTDDfwKCS_R-7u6Ri6vgqtqvBfD00k=.071c7a06-74d0-44da-95a6-ceba8a2037aa@github.com> <7cc_n9FTme_L52e9GrtEJyUHemM5GH5LdMSRcwgTGws=.bd6bb1c4-ca8b-4fdc-8ce4-7a61ec315ec3@github.com> Message-ID: On Mon, 21 Mar 2022 12:44:58 GMT, Thomas Stuefe wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove two unnecessary lines > > I'm currently implementing Andrews proposal for a static safefetch (https://github.com/openjdk/jdk/pull/7865, still in draft, but almost done). That will be more generic solution since we don't have to deal with thread wx state at all. That's why we closed this PR. > The conversation here is some what hard to follow. I do see that "foreign threads" was mentioned by @tstuefe in the context of AGCT but I have to assume he misspoke there (assuming a foreign thread is one not attached to the VM) as AGCT only works for attached JavaThreads. The signal handler that will call AGCT has to be prepared to find any kind of thread in any state, but AGCT should only be called on the right kinds of thread in the right state. Sure, AGCT can be limited to VM threads - or maybe already is. But tracking non-VM threads could be a valid use case. We have downstream in the SapMachine a facility where we track callstacks from malloc sites - independently from NMT or the VM. With the explicit purpose of catching mallocs from non-VM threads too. For collecting the stack trace, we use some VM utilities, SafeFetch among them. That is a very useful facility. I could argue a similar case for the Async Profiler: why should profiling be limited to Java threads? 
In the end, if it eats performance, it hurts, regardless whether its a java thread or a non-VM-attached thread. Could be a concurrent native thread burning CPU, why would that not be interesting. Our concern was with SafeFetch, and AGCT is only one example. SafeFetch should be as safe as possible. Error reporting alone is a sufficient reason. ------------- PR: https://git.openjdk.java.net/jdk/pull/7727 From aivanov at openjdk.java.net Mon Mar 21 13:23:30 2022 From: aivanov at openjdk.java.net (Alexey Ivanov) Date: Mon, 21 Mar 2022 13:23:30 GMT Subject: RFR: 8283426: Fix 'exeption' typo [v2] In-Reply-To: <1aOenacSYTb23hOLJFJBdM_k6vXL8gl5qxTn0RrocoA=.a0d88289-1e12-45f9-88ec-27981f9a0f74@github.com> References: <1aOenacSYTb23hOLJFJBdM_k6vXL8gl5qxTn0RrocoA=.a0d88289-1e12-45f9-88ec-27981f9a0f74@github.com> Message-ID: On Mon, 21 Mar 2022 09:02:17 GMT, Andrey Turbanov wrote: >> Fix repeated typo `exeption` > > Andrey Turbanov has updated the pull request incrementally with one additional commit since the last revision: > > 8283426: Fix 'exeption' typo > > Apply suggestion > > Co-authored-by: Alexey Ivanov <70774172+aivanov-jdk at users.noreply.github.com> Marked as reviewed by aivanov (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7879 From mdoerr at openjdk.java.net Mon Mar 21 13:39:34 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 21 Mar 2022 13:39:34 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v15] In-Reply-To: References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> Message-ID: On Mon, 21 Mar 2022 12:13:14 GMT, Johannes Bechberger wrote: >> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. 
It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method >> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix minor style issues LGTM. Thanks! ------------- Marked as reviewed by mdoerr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7591 From zgu at openjdk.java.net Mon Mar 21 13:56:01 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 21 Mar 2022 13:56:01 GMT Subject: RFR: 8283456: Make CompiledICHolder::live_count/live_not_claimed_count debug only Message-ID: Please review this trivial patch to make `CompiledICHolder::live_count/live_not_claimed_count` debug only, since they are only updated/used in debug only code. ------------- Commit messages: - 8283456: Make CompiledICHolder::live_count/live_not_claimed_count debug only Changes: https://git.openjdk.java.net/jdk/pull/7890/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7890&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283456 Stats: 8 lines in 2 files changed: 5 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/7890.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7890/head:pull/7890 PR: https://git.openjdk.java.net/jdk/pull/7890 From ihse at openjdk.java.net Mon Mar 21 14:43:33 2022 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Mon, 21 Mar 2022 14:43:33 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port In-Reply-To: References: Message-ID: On Mon, 8 Nov 2021 11:17:47 GMT, Fei Yang wrote: > This PR implements JEP 422: Linux/RISC-V Port [1]. > The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. 
> > This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. > > [1] https://openjdk.java.net/jeps/422 Build changes look good. I can't say anything about the rest of the code. ------------- Marked as reviewed by ihse (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6294 From jiefu at openjdk.java.net Mon Mar 21 14:48:37 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 21 Mar 2022 14:48:37 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code [v2] In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 08:42:25 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change that adds an API to save/restore caller-saved registers for VM upcalls for C1/interpreter. >> >> Currently, for x86, this is done in a very ad-hoc (copy&pasty) way, which starts to fall apart. Additionally this fixes some problems with wrong stack alignment. >> >> There is some cleanup to do separately to remove that copy&paste code in another CR. At the moment the API (`push_call_clobbered_registers/pop_call_clobbered_registers`) is only used for g1. >> >> It's based on `RegSet` from AArch64. >> >> Testing: tier1-5, tier1 testing with x64 and x86, with some at this point obscure combinations (like x86 UseSSE=0/1). >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > dean comments src/hotspot/cpu/x86/macroAssembler_x86.cpp line 29: > 27: #include "asm/assembler.hpp" > 28: #include "asm/assembler.inline.hpp" > 29: #include "c1/c1_FrameMap.hpp" Hi @tschatzl , this line breaks the build of VM with `--with-jvm-features=-compiler1`. Please have a look. Thanks. 
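For readers unfamiliar with the `RegSet` idea mentioned in the quoted description, here is a much simplified sketch of a register set as a bit mask, together with an aligned save-area computation. It is not the HotSpot class: `Reg`, `contains`, `size` and `save_area_bytes` are hypothetical names, registers are reduced to an encoding below 32, and the 8-byte slot size and 16-byte alignment are assumptions chosen for a 64-bit target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register handle: only an encoding in [0, 32).
struct Reg {
  unsigned encoding;
};

// Simplified register set stored as a 32-bit mask, one bit per register.
class RegSet {
  uint32_t _bits;
  explicit RegSet(uint32_t bits) : _bits(bits) {}
 public:
  RegSet() : _bits(0) {}
  static RegSet of(Reg r) { return RegSet(uint32_t(1) << r.encoding); }
  RegSet operator+(RegSet other) const { return RegSet(_bits | other._bits); }
  RegSet operator-(RegSet other) const { return RegSet(_bits & ~other._bits); }
  bool contains(Reg r) const { return ((_bits >> r.encoding) & 1u) != 0; }
  uint32_t bits() const { return _bits; }
  // Number of registers in the set (clears the lowest set bit each round).
  unsigned size() const {
    unsigned n = 0;
    for (uint32_t b = _bits; b != 0; b &= b - 1) ++n;
    return n;
  }
};

// Stack bytes needed to save every register in the set, rounded up to a
// 16-byte boundary so the stack stays aligned across the call.
inline unsigned save_area_bytes(RegSet set, unsigned slot_size = 8) {
  unsigned raw = set.size() * slot_size;
  return (raw + 15u) & ~15u;
}
```

Computing the save area from the set, instead of hand-counting pushes at each call site, is one way to avoid the kind of stack misalignment the description refers to.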
------------- PR: https://git.openjdk.java.net/jdk/pull/7867 From tschatzl at openjdk.java.net Mon Mar 21 15:34:39 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 21 Mar 2022 15:34:39 GMT Subject: RFR: 8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code [v2] In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 14:44:53 GMT, Jie Fu wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> dean comments > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 29: > >> 27: #include "asm/assembler.hpp" >> 28: #include "asm/assembler.inline.hpp" >> 29: #include "c1/c1_FrameMap.hpp" > > Hi @tschatzl , this line breaks the build of VM with `--with-jvm-features=-compiler1`. > Please have a look. > Thanks. will do. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/7867 From duke at openjdk.java.net Mon Mar 21 15:53:40 2022 From: duke at openjdk.java.net (Johannes Bechberger) Date: Mon, 21 Mar 2022 15:53:40 GMT Subject: Integrated: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> Message-ID: On Wed, 23 Feb 2022 14:59:49 GMT, Johannes Bechberger wrote: > This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method > and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes. This pull request has now been integrated. 
Changeset: 999da9bf Author: Johannes Bechberger Committer: Thomas Stuefe URL: https://git.openjdk.java.net/jdk/commit/999da9bfc5be703141cdc07af455b4b6b2cc1aae Stats: 80 lines in 11 files changed: 60 ins; 14 del; 6 mod 8282306: os::is_first_C_frame(frame*) crashes on invalid link access Reviewed-by: stuefe, mdoerr ------------- PR: https://git.openjdk.java.net/jdk/pull/7591 From mcimadamore at openjdk.java.net Mon Mar 21 16:36:42 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Mon, 21 Mar 2022 16:36:42 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) Message-ID: This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. [1] - https://openjdk.java.net/jeps/424 ------------- Commit messages: - Fix TestLayouts on 32-bits - Update copyright - Drop unused imports in Reflection.java - Fix writeOversized for booleans - Revert changes to scopedMemoryAccess.cpp - Add TestUpcallStack to problem list - Revert changes to problem list - Revert to previous handshake logic - Initial push Changes: https://git.openjdk.java.net/jdk/pull/7888/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282191 Stats: 66428 lines in 364 files changed: 44012 ins; 19911 del; 2505 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Mon Mar 21 16:36:42 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Mon, 21 Mar 2022 16:36:42 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 10:45:27 GMT, Maurizio Cimadamore wrote: > This PR contains the API and implementation 
changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/424 Here is a list of the main changes in this iteration. #### java.lang.foreign The API is now a **preview** API in `java.lang.foreign`. As such, to use the API, users must pass the `--enable-preview` flag to both `javac` and `java`. Since the API now lives in `java.base`, we dropped the `MemoryHandles` class and moved all its var handle combinator methods under `MethodHandles`. We have also dropped the `MemorySegment::map` factory and replaced it with a new overload in `FileChannel`, which plays much better with custom file systems. #### ResourceScope The `ResourceScope` abstraction has been renamed to `MemorySession`. Aside from the naming difference (which is also reflected in some of the factories associated with `MemorySession`), another difference is that `MemorySession` now implements `SegmentAllocator` and can be used straight away to allocate segments. Finally, the fact that some sessions are not closeable is now reflected in the API (see `MemorySession::isCloseable`), and a method has been added to create a non-closeable *view* of the memory session. #### Restricted methods Addressing the feedback we have received during incubation, the mechanism to control access to restricted methods is now more permissive. Users can still use the `--enable-native-access` flag, to get a strict, opt-in behavior, in case they want to control which modules can access restricted methods in the foreign API. But if no flag is specified, access is allowed, with a runtime warning. Supporting this required some changes in `ModuleBootstrap` so that we could record the fact that we have seen an `--enable-native-access` flag (so that all checks in `Reflection.java` become constant).
#### Deterministic library loading/unloading We have enhanced the `SymbolLookup` abstraction to provide a new symbol lookup, called *library lookup*. A library lookup is a symbol lookup built around a specific native library which is loaded/unloaded using dlopen/LoadLibrary. Library lookups are associated with memory sessions, so the library can be unloaded deterministically when the session is closed. #### Memory layouts All memory layout constants feature the expected alignment constraints. For instance, `JAVA_CHAR` is 2 byte aligned, `JAVA_INT` is 4 byte aligned, and so on. #### Removed functionalities As we moved the API into `java.base` we have dropped a number of API points which did not seem to add much value, based on the feedback received: * `SequenceLayout`s now always have a bounded size. As a result, `MemoryLayout::byteSize` no longer returns an optional. A zero-length sequence can be used instead; * `NativeSymbol` has been dropped. At the end of the day, its role is indistinguishable from that of a memory segment with zero-length, so the API (and users) should use zero-length segments instead; * `MemorySession::keepAlive` - has been removed; in its place we have a simple, higher-order method which executes an action (a `Runnable`) while keeping the session alive (`MemorySession::whileAlive`); * `MemoryLayout::map` only provides limited capabilities when rewriting layouts (e.g. it can only replace one layout at a time); as such we removed this API, and we might add something better at a later point. #### Hotspot changes While the Panama foreign repo contains several Hotspot changes which simplify and regularize the foreign function support, these changes are not included here, as we have discovered some intermittent instability in macosx-aarch64. We might attempt to integrate the hotspot changes at a later date.
Javadoc: http://cr.openjdk.java.net/~mcimadamore/8282191/v1/javadoc/java.base/module-summary.html Specdiff: http://cr.openjdk.java.net/~mcimadamore/8282191/v1/specdiff_out/overview-summary.html ------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From jboes at openjdk.java.net Mon Mar 21 16:36:45 2022 From: jboes at openjdk.java.net (Julia Boes) Date: Mon, 21 Mar 2022 16:36:45 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: <1c6iI0BzRJEWKr5IDBEaZfw6CyXoJw1N5G54lDjpGYc=.52e24356-0cf0-4939-8a7f-776800511b51@github.com> On Mon, 21 Mar 2022 10:45:27 GMT, Maurizio Cimadamore wrote: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/424 src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 600: > 598: * @param elementLayout the source element layout. If the byte order associated with the layout is > 599: * different from the native order, a byte swap operation will be performed on each array element. > 600: * @return a fresh short array copy of this memory segment. Maybe use "new" instead of "fresh" here and in the other MemorySegment::toArray methods? src/java.base/share/classes/java/lang/foreign/package-info.java line 149: > 147: * provided: > 148: * > 149: * {@snippet lang=java : missing leading space in comment on line 150 and 162 src/java.base/share/classes/java/lang/invoke/MethodHandles.java line 7986: > 7984: * > 7985: * When calling e.g. {@link VarHandle#get(Object...)} on the resulting var handle, the incoming coordinate values > 7986: * starting at position {@code pos} (of type {@code C1, C2 ... Cn}, where {@code C1, C2 ... Cn} are the return type ... are the return *types* ... src/java.base/share/classes/java/lang/invoke/MethodHandles.java line 8035: > 8033: * @param pos the position of the first coordinate to be inserted > 8034: * @param values the series of bound coordinates to insert > 8035: * @return an adapter var handle which inserts an additional coordinates, ... which inserts additional coordinates, ... src/java.base/share/classes/java/lang/invoke/MethodHandles.java line 8151: > 8149: * > 8150: * @param target the var handle to invoke after the dummy coordinates are dropped > 8151: * @param pos position of first coordinate to drop (zero for the leftmost) ... of *the* first coordinate to drop ...
------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From erikj at openjdk.java.net Mon Mar 21 17:24:35 2022 From: erikj at openjdk.java.net (Erik Joelsson) Date: Mon, 21 Mar 2022 17:24:35 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 10:45:27 GMT, Maurizio Cimadamore wrote: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/424 Build changes look ok. make/modules/java.base/Lib.gmk line 217: > 215: CXXFLAGS := $(CXXFLAGS_JDKLIB), \ > 216: LDFLAGS := $(LDFLAGS_JDKLIB) -Wl$(COMMA)--no-as-needed, \ > 217: LIBS := $(LIBCXX) -lc -lm -ldl, \ Instead of repeating this whole macro call for both Linux and non Linux, you can use parameters of the form LDFLAGS_linux and LIBS_linux to add the Linux specific flags. Something like this: LDFLAGS := $(LDFLAGS_JDKLIB), \ LDFLAGS_linux := -Wl$(COMMA)--no-as-needed, \ For the NAME field, there is no such support, so the way we usually do that is through a variable and conditionals before the macro call. What's the reason to have a different lib name on Windows? If they were the same, and the source file in windows/native/... had the same name, it would just automatically override in the build. I realize now that this is just moved code from jdk.incubator.foreign, and this patch is probably big enough as it is so no need to refactor the build logic at the same time. ------------- Marked as reviewed by erikj (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Mon Mar 21 17:40:35 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Mon, 21 Mar 2022 17:40:35 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 17:16:49 GMT, Erik Joelsson wrote: >> This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. >> >> [1] - https://openjdk.java.net/jeps/424 > > make/modules/java.base/Lib.gmk line 217: > >> 215: CXXFLAGS := $(CXXFLAGS_JDKLIB), \ >> 216: LDFLAGS := $(LDFLAGS_JDKLIB) -Wl$(COMMA)--no-as-needed, \ >> 217: LIBS := $(LIBCXX) -lc -lm -ldl, \ > > Instead of repeating this whole macro call for both Linux and non Linux, you can use parameters of the form LDFLAGS_linux and LIBS_linux to add the Linux specific flags. Something like this: > > > LDFLAGS := $(LDFLAGS_JDKLIB), \ > LDFLAGS_linux := -Wl$(COMMA)--no-as-needed, \ > > > For the NAME field, there is no such support, so the way we usually do that is through a variable and conditionals before the macro call. What's the reason to have a different lib name on Windows? If they were the same, and the source file in windows/native/... had the same name, it would just automatically override in the build. > > I realize now that this is just moved code from jdk.incubator.foreign, and this patch is probably big enough as it is so no need to refactor the build logic at the same time. Good points - there is really no need AFAIK for the lib name to be different. I'll do few experiments. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From duke at openjdk.java.net Mon Mar 21 17:59:44 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Mon, 21 Mar 2022 17:59:44 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v15] In-Reply-To: References: Message-ID: On Sun, 13 Mar 2022 04:27:44 GMT, Jatin Bhateja wrote: >> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4178: >> >>> 4176: movl(scratch, 1056964608); >>> 4177: movq(xtmp1, scratch); >>> 4178: vbroadcastss(xtmp1, xtmp1, vec_enc); >> >> You could put the constant in the constant table and use `vbroadcastss` here also. >> >> Thank you very much. > > constant and register to register moves are never issued to execution ports, rematerializing value rather than reading from memory will give better performance. I have come across this a little bit. While `movl r, i` may not consume execution ports, `movq x, r` and `vbroadcastss x, x` surely do. This leads to 3 retired and 2 executed uops. Furthermore, both `movq x, r` and `vbroadcastss x, x` can only run on port 5, limit the throughput of the operation. On the contrary, a `vbroadcastss x, m` only results in 1 retired and 1 executed uop, reducing pressure on the decoder and the backend. A `vbroadcastss x, m` can run on both port 2 and port 3, offering a much better throughput. Latency is not much of a concern in this circumstance since the operation does not have any input dependency. > register to register moves are never issued to execution ports I believe you misremembered this part, a register to register move is only elided when the registers are of the same kind, `vmovq x, r` would result in 1 uop being executed on port 5. What do you think? Thank you very much. 
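For readers following the quoted `movl(scratch, 1056964608)`: on platforms with IEEE-754 single-precision floats, that immediate is the bit pattern of `0.5f` (0x3F000000), which is why it can be moved into a vector register and broadcast as a float. The sketch below only checks that fact; `float_bits` is a hypothetical helper written for illustration and is not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration; not part of the patch under review.
// Copies the object representation of a float into a 32-bit integer.
uint32_t float_bits(float f) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);  // well-defined alternative to a pointer cast
  return bits;
}
```

Assuming IEEE-754 floats, `float_bits(0.5f)` yields 1056964608, the value materialized by the `movl` in the quoted code.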
------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From jbhateja at openjdk.java.net Mon Mar 21 18:28:34 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Mon, 21 Mar 2022 18:28:34 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v15] In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 17:56:22 GMT, Quan Anh Mai wrote: >> constant and register to register moves are never issued to execution ports, rematerializing value rather than reading from memory will give better performance. > > I have come across this a little bit. While `movl r, i` may not consume execution ports, `movq x, r` and `vbroadcastss x, x` surely do. This leads to 3 retired and 2 executed uops. Furthermore, both `movq x, r` and `vbroadcastss x, x` can only run on port 5, limit the throughput of the operation. On the contrary, a `vbroadcastss x, m` only results in 1 retired and 1 executed uop, reducing pressure on the decoder and the backend. A `vbroadcastss x, m` can run on both port 2 and port 3, offering a much better throughput. Latency is not much of a concern in this circumstance since the operation does not have any input dependency. > >> register to register moves are never issued to execution ports > > I believe you misremembered this part, a register to register move is only elided when the registers are of the same kind, `vmovq x, r` would result in 1 uop being executed on port 5. > > What do you think? Thank you very much. A read from constant table will incur minimum of L1I access penalty to access code blob or at worst even more if data is not present in first level cache. Change was done for replace vpbroadcastd with vbroadcastss because of two reasons. 1) vbroadcastss works at AVX=1 level where as vpbroadcastd need AVX2 feature. 2) We can avoid extra cycle penalty due to two domain switchovers (FP -> INT and then from INT-> FP). 
------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From vladimir.kozlov at oracle.com Mon Mar 21 18:46:23 2022 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 21 Mar 2022 11:46:23 -0700 Subject: CFV: New HotSpot Group Member: Dean Long In-Reply-To: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> References: <26917f30-c564-8840-abc2-3222e9cae56d@oracle.com> Message-ID: <5f024020-91a2-e76d-0ba4-73959fd5dc1a@oracle.com> Vote: yes Thanks, Vladimir K On 3/16/22 1:29 AM, Tobias Hartmann wrote: > Hi, > > I hereby nominate Dean Long (dlong) to Membership in the HotSpot Group. > > Dean is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 130 changes to the JDK project [1]. After significant contributions to > Ahead-of-Time Compilation (JEP 295), including work on JVMCI and Graal, Dean recently worked on > improving compilation replay (JDK-8254106). Dean is also part of our triaging team, making sure that > all incoming compiler bugs are properly handled. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Dean+Long%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From vladimir.kozlov at oracle.com Mon Mar 21 18:47:05 2022 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 21 Mar 2022 11:47:05 -0700 Subject: CFV: New HotSpot Group Member: Ivan Walulya In-Reply-To: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> References: <6F9E0D75-8035-464F-BCF7-73CD8DC8F6CE@oracle.com> Message-ID: <36fb5da7-d715-4b16-2260-a87c07f0793c@oracle.com> Vote: yes Thanks, Vladimir K On 3/16/22 5:46 AM, Kim Barrett wrote: > hotspot-dev at openjdk.java.net > CFV: New HotSpot Group Member: Ivan Walulya > > I hereby nominate Ivan Walulya to Membership in the HotSpot Group. > > Ivan is a JDK Reviewer and a member of the Oracle GC team, primarily working > on G1. He has made many substantial contributions [1] including co-authoring a > major rewrite of G1's remembered sets. He is also a frequent and thorough > reviewer (as I well know). > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Ivan+Walulya%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From vladimir.kozlov at oracle.com Mon Mar 21 18:47:39 2022 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 21 Mar 2022 11:47:39 -0700 Subject: CFV: New HotSpot Group Member: Leo Korinth In-Reply-To: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> References: <9CD2C35A-F134-4372-B110-05E4878796A7@oracle.com> Message-ID: <8ed9c4d9-94be-756d-1755-fc158439ecde@oracle.com> Vote: yes Thanks, Vladimir K On 3/16/22 5:47 AM, Kim Barrett wrote: > I hereby nominate Leo Korinth to Membership in the HotSpot Group. > > Leo is a JDK Reviewer and a member of the Oracle GC team, primarily working on > G1. He has made many substantial contributions [1] including several > refactorings in ParallelGC to bring it in-line with other collectors. He also > dealt with the main removal of CMS and a number of related cleanups; CMS > tendrils extended far and deep. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Leo+Korinth%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From vladimir.kozlov at oracle.com Mon Mar 21 18:48:12 2022 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 21 Mar 2022 11:48:12 -0700 Subject: CFV: New HotSpot Group Member: Sangheon Kim In-Reply-To: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> References: <88B9E297-97E8-4501-889E-077FA8A2233B@oracle.com> Message-ID: <9a22a88f-3419-007d-5156-6a15c596e9af@oracle.com> Vote: yes Thanks, Vladimir K On 3/16/22 5:40 AM, Kim Barrett wrote: > I hereby nominate Sangheon Kim to Membership in the HotSpot Group. > > Sangheon has been a JDK Reviewer and member of the Oracle GC team for > many years, primarily working on G1. He has made many substantial > contributions [1] including to NUMA support and improving GC thread > configuration. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. > > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Sangheon+Kim%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From vladimir.kozlov at oracle.com Mon Mar 21 18:48:46 2022 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 21 Mar 2022 11:48:46 -0700 Subject: CFV: New HotSpot Group Member: Vladimir Ivanov In-Reply-To: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> References: <8202373d-2e45-c8e4-e5e0-f8f002cd189a@oracle.com> Message-ID: Vote: yes Thanks, Vladimir K On 3/16/22 1:29 AM, Tobias Hartmann wrote: > Hi, > > I hereby nominate Vladimir Ivanov (vlivanov) to Membership in the HotSpot Group. 
> > Vladimir is a long standing member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. Since > 2012, he contributed over 360 changes to the JDK project [1]. Vladimir worked on some of our most > challenging projects including VM support for Project Lambda, JSR-292 and LambdaForm reduction and > caching. He is currently deeply involved in Project Panama, working on the Foreign Function > Interface and the Vector API. > > Votes are due by Wednesday, 30 March 2022 at 09:00 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. > > Best regards, > Tobias > > [1] > https://github.com/search?o=desc&p=37&q=committer-name%3A%22Vladimir+Ivanov%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&s=committer-date&type=Commits > [2] https://openjdk.java.net/census#hotspot > [3] https://openjdk.java.net/groups/#member-vote From jvernee at openjdk.java.net Mon Mar 21 19:29:31 2022 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Mon, 21 Mar 2022 19:29:31 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 10:45:27 GMT, Maurizio Cimadamore wrote: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/424 src/java.base/share/classes/java/lang/foreign/CLinker.java line 176: > 174: * @param symbol the address of the target function. > 175: * @param function the function descriptor of the target function. > 176: * @return a new downcall method handle. The method handle type is inferred Maybe drop the "new" from this. Since we want to do caching in the future. Suggestion: * @return a downcall method handle. 
The method handle type is inferred src/java.base/share/classes/java/lang/foreign/CLinker.java line 199: > 197: * > 198: * @param function the function descriptor of the target function. > 199: * @return a new downcall method handle. The method handle type is inferred Suggestion: * @return a downcall method handle. The method handle type is inferred src/java.base/share/classes/java/lang/foreign/MemoryAddress.java line 159: > 157: * Creates a memory address from the given long value. > 158: * @param value the long value representing a raw address. > 159: * @return a new memory address instance. Similar here. A new address is not always returned. Suggestion: * @return a memory address instance. src/java.base/share/classes/java/lang/foreign/package-info.java line 230: > 228: * where {@code M1}, {@code M2}, {@code ... Mn} are module names (for the unnamed module, the special value {@code ALL-UNNAMED} > 229: * can be used). If this option is specified, access to restricted methods is only granted to the modules listed by that > 230: * option. If this option is not specified, access to restricted method is enabled for all modules, but Suggestion: * option. If this option is not specified, access to restricted methods is enabled for all modules, but src/java.base/share/classes/jdk/internal/foreign/abi/aarch64/CallArranger.java line 53: > (failed to retrieve contents of file, check the PR for context) Keeping this static import would seem more readable here, instead of prefixing everything with `AArch64Architecture.`. (especially in the ABI definition below) src/java.base/share/classes/jdk/internal/foreign/abi/x64/sysv/CallArranger.java line 53: > (failed to retrieve contents of file, check the PR for context) Same here, I think keeping the static import for this would make things more readable. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Mon Mar 21 21:25:31 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Mon, 21 Mar 2022 21:25:31 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 14:17:21 GMT, Jorn Vernee wrote: >> This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. >> >> [1] - https://openjdk.java.net/jeps/424 > > src/java.base/share/classes/jdk/internal/foreign/abi/x64/sysv/CallArranger.java line 53: > >> (failed to retrieve contents of file, check the PR for context) > Same here, I think keeping the static import for this would make things more readable. Good catch. I think the IDE did that when I moved files across; I've fixed few of these, but there's more it seems. ------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From kvn at openjdk.java.net Tue Mar 22 00:35:28 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 22 Mar 2022 00:35:28 GMT Subject: RFR: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local [v4] In-Reply-To: References: Message-ID: On Fri, 11 Mar 2022 06:35:31 GMT, David Holmes wrote: >> Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. >> >> This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. 
>> >> Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Additional tweaks requested by @kbarrett Approved. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7720 From dholmes at openjdk.java.net Tue Mar 22 01:16:31 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Mar 2022 01:16:31 GMT Subject: Integrated: 8282721: HotSpot Style Guide should allow considered use of C++ thread_local In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 06:34:20 GMT, David Holmes wrote: > Style guide changes to support JDK-8282469 (PR https://github.com/openjdk/jdk/pull/7719). We no longer prohibit use of C++ `thread_local`, but allow it when there is an essential, and considered, need. > > This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. This pull request has now been integrated. 
Changeset: f3dc0c88 Author: David Holmes URL: https://git.openjdk.java.net/jdk/commit/f3dc0c88ea00a3745f5f105404e0788a0f616407 Stats: 24 lines in 2 files changed: 8 ins; 0 del; 16 mod 8282721: HotSpot Style Guide should allow considered use of C++ thread_local Reviewed-by: kbarrett, jrose, dcubed, stuefe, mdoerr, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/7720 From dholmes at openjdk.java.net Tue Mar 22 01:24:42 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Mar 2022 01:24:42 GMT Subject: RFR: 8282469: Allow considered use of C++ thread_local in Hotspot In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 06:12:03 GMT, David Holmes wrote: > This patch provides a means for using C++ `thread_local` when it is essential - see JBS for more details. > > There are three parts: > > 1. Add the new #define for `thread_local` > 2. Remove `operator_new.cpp` as use of C++ `thread_local` with non-trivial cleanup actions requires use of global operators new/delete. These are still excluded for hotspot use via a link-time check. > 3. Remove the prohibition on using `thread_local` from the hotspot style guide > > Due to the way hotspot style guide changes must be done, part 3 is being done under a sub-task in PR https://github.com/openjdk/jdk/pull/7720 and the two PRs will integrate at the same time. > > Testing: > - manual testing of the Panama use case as referenced in the JBS issue > - Tiers 1-3 > > Thanks, > David Style guide update has been approved and integrated. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7719 From dholmes at openjdk.java.net Tue Mar 22 01:24:42 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Mar 2022 01:24:42 GMT Subject: Integrated: 8282469: Allow considered use of C++ thread_local in Hotspot In-Reply-To: References: Message-ID: On Mon, 7 Mar 2022 06:12:03 GMT, David Holmes wrote: > This patch provides a means for using C++ `thread_local` when it is essential - see JBS for more details. > > There are three parts: > > 1. Add the new #define for `thread_local` > 2. Remove `operator_new.cpp` as use of C++ `thread_local` with non-trivial cleanup actions requires use of global operators new/delete. These are still excluded for hotspot use via a link-time check. > 3. Remove the prohibition on using `thread_local` from the hotspot style guide > > Due to the way hotspot style guide changes must be done, part 3 is being done under a sub-task in PR https://github.com/openjdk/jdk/pull/7720 and the two PRs will integrate at the same time. > > Testing: > - manual testing of the Panama use case as referenced in the JBS issue > - Tiers 1-3 > > Thanks, > David This pull request has now been integrated. 
Changeset: 81d63734 Author: David Holmes URL: https://git.openjdk.java.net/jdk/commit/81d63734bc2e2a18063cb6afbc53f8813a0ba880 Stats: 104 lines in 2 files changed: 4 ins; 100 del; 0 mod 8282469: Allow considered use of C++ thread_local in Hotspot Reviewed-by: kbarrett, dcubed ------------- PR: https://git.openjdk.java.net/jdk/pull/7719 From duke at openjdk.java.net Tue Mar 22 01:58:40 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Tue, 22 Mar 2022 01:58:40 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v15] In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 18:25:36 GMT, Jatin Bhateja wrote: > A read from constant table will incur minimum of L1I access penalty to access code blob or at worst even more if data is not present in first level cache But your approach comes at a cost of frontend bandwidth and port contention, which imo are more important than latency in this case since a constant load does not prolong dependency chains. A load has very good throughput so it is often performant unless the load depends on its input (the memory location or the registers used for address calculation). Thanks ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From dholmes at openjdk.java.net Tue Mar 22 01:58:44 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Mar 2022 01:58:44 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v15] In-Reply-To: References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> Message-ID: <_FBJRWSOz1KOMVZYU0IvnWguPtDf0Gf5WSOrPjTUj-g=.af525521-8e54-4660-b2e6-fbd57a183375@github.com> On Mon, 21 Mar 2022 12:13:14 GMT, Johannes Bechberger wrote: >> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. 
It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method >> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix minor style issues src/hotspot/share/runtime/os.cpp line 1183: > 1181: // Looks like all platforms can use the same function to check if C > 1182: // stack is walkable beyond current frame. > 1183: // Returns true if this is not the case, i.e. the frame is possibly This comment sounds wrong. Surely we return true if it is the case that the given frame is the first C frame on the stack? ------------- PR: https://git.openjdk.java.net/jdk/pull/7591 From dholmes at openjdk.java.net Tue Mar 22 02:09:33 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Mar 2022 02:09:33 GMT Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v15] In-Reply-To: <_FBJRWSOz1KOMVZYU0IvnWguPtDf0Gf5WSOrPjTUj-g=.af525521-8e54-4660-b2e6-fbd57a183375@github.com> References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com> <_FBJRWSOz1KOMVZYU0IvnWguPtDf0Gf5WSOrPjTUj-g=.af525521-8e54-4660-b2e6-fbd57a183375@github.com> Message-ID: On Tue, 22 Mar 2022 01:55:22 GMT, David Holmes wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix minor style issues > > src/hotspot/share/runtime/os.cpp line 1183: > >> 1181: // Looks like all platforms can use the same function to check if C >> 1182: // stack is walkable beyond current frame. >> 1183: // Returns true if this is not the case, i.e. the frame is possibly > > This comment sounds wrong. 
Surely we return true if it is the case that the given frame is the first C frame on the stack? Never mind I see the full context now. Would have been better to not start this comment on a new line in the original code. ------------- PR: https://git.openjdk.java.net/jdk/pull/7591 From jbhateja at openjdk.java.net Tue Mar 22 02:55:32 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Tue, 22 Mar 2022 02:55:32 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v15] In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 01:55:38 GMT, Quan Anh Mai wrote: >> A read from constant table will incur minimum of L1I access penalty to access code blob or at worst even more if data is not present in first level cache. Change was done for replace vpbroadcastd with vbroadcastss because of two reasons. >> 1) vbroadcastss works at AVX=1 level where as vpbroadcastd need AVX2 feature. >> 2) We can avoid extra cycle penalty due to two domain switchovers (FP -> INT and then from INT-> FP). > >> A read from constant table will incur minimum of L1I access penalty to access code blob or at worst even more if data is not present in first level cache > > But your approach comes at a cost of frontend bandwidth and port contention, which imo are more important than latency in this case since a constant load does not prolong dependency chains. A load has very good throughput so it is often performant unless the load depends on its input (the memory location or the registers used for address calculation). Thanks Thanks for going into details, multicycle memory load will also defer dispatch of dependent instructions to execution port, port congestion becomes bottleneck when multiple ready instructions cannot be issued due to lack of execution resource or throughput constraints imposed by instruction, but a single cycle dependency chain may still win over latency due to pending memory operations. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From dholmes at openjdk.java.net Tue Mar 22 02:58:38 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Mar 2022 02:58:38 GMT Subject: RFR: 8283456: Make CompiledICHolder::live_count/live_not_claimed_count debug only In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 13:47:59 GMT, Zhengyu Gu wrote: > Please review this trivial patch to make `CompiledICHolder::live_count/live_not_claimed_count` debug only, since they are only updated/used in debug only code. Seems fine and trivial. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7890 From duke at openjdk.java.net Tue Mar 22 03:17:32 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Tue, 22 Mar 2022 03:17:32 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v15] In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 02:52:07 GMT, Jatin Bhateja wrote: >>> A read from constant table will incur minimum of L1I access penalty to access code blob or at worst even more if data is not present in first level cache >> >> But your approach comes at a cost of frontend bandwidth and port contention, which imo are more important than latency in this case since a constant load does not prolong dependency chains. A load has very good throughput so it is often performant unless the load depends on its input (the memory location or the registers used for address calculation). Thanks > > Thanks for going into details, multicycle memory load will also defer dispatch of dependent instructions to execution port, port congestion becomes bottleneck when multiple ready instructions cannot be issued due to lack of execution resource or throughput constraints imposed by instruction, but a single cycle dependency chain may still win over latency due to pending memory operations. I think I get it now, thanks a lot for your detailed explanation. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From fyang at openjdk.java.net Tue Mar 22 03:31:16 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Tue, 22 Mar 2022 03:31:16 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v2] In-Reply-To: References: Message-ID: > This PR implements JEP 422: Linux/RISC-V Port [1]. > The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. > > This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. > > [1] https://openjdk.java.net/jeps/422 Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into JDK-8276799 - 8276799: Implementation of JEP 422: Linux/RISC-V Port ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6294/files - new: https://git.openjdk.java.net/jdk/pull/6294/files/33402035..a144cd20 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6294&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6294&range=00-01 Stats: 2517 lines in 698 files changed: 1371 ins; 865 del; 281 mod Patch: https://git.openjdk.java.net/jdk/pull/6294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6294/head:pull/6294 PR: https://git.openjdk.java.net/jdk/pull/6294 From dholmes at openjdk.java.net Tue Mar 22 05:15:35 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Mar 2022 05:15:35 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v2] In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 03:31:16 GMT, Fei Yang wrote: >> This PR implements JEP 422: Linux/RISC-V Port [1]. 
>> The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. >> >> This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. >> >> [1] https://openjdk.java.net/jeps/422 > > Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into JDK-8276799 > - 8276799: Implementation of JEP 422: Linux/RISC-V Port Hi, I've looked at everything that is not a RISC-V specific file, except for the C1 changes as the compiler folk will need to approve those. Some copyrights will need updating to 2022 on the Oracle copyright line please. I have flagged one issue in regard to C++ atomics - see below. Thanks, David make/autoconf/libraries.m4 line 152: > 150: fi > 151: > 152: # Programs which use C11 or C++11 atomics, like #include <atomic>, Use of C++ atomics is not allowed in hotspot code base. See the style guide: https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md That said, I don't see any actual use of C++ atomics. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6294 From ddong at openjdk.java.net Tue Mar 22 07:29:54 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Tue, 22 Mar 2022 07:29:54 GMT Subject: RFR: 8283488: AArch64: Improve stack trace accuracy in hs log Message-ID: Hi team, Could I have a review of this patch? The native stack trace in hs log is sometimes not accurate since we cannot get the accurate `sender sp`, and `sp` is the key to walking the stack for compiled frames.
frame os::get_sender_for_C_frame(frame* fr) { return frame(fr->link(), fr->link(), fr->sender_pc()); }

JDK-8277948[1] solved the problem but the premise is that PreserveFramePointer needs to be enabled. On x86, we can get the `sender sp` as `fp + 2`, but this does not hold on AArch64. In the "Procedure Call Standard for the Arm® 64-bit Architecture (AArch64)" [2], section "6.2.3 The Frame Pointer" states that the location of the frame record within a stack frame is not specified. Hence, I cannot get the `sender sp` by adding a constant to `fp`. By the way, I found that in the executable I compiled on Mac M1, like x86, the frame record is always at the bottom of the stack, but I didn't find a standard specification to prove it. If we can guarantee that this is the case, we can simplify the solution on the Mac.

This patch deduces the `sender sp` by decoding the native instructions. I think this solution is applicable to both Mac and Linux. At present, I found that there are mainly three patterns as follows:

a)
stp x29, x30, [sp, #-N]!
mov x29, sp
=> sender sp = fp + N

b)
sub sp, sp, #N1
stp x29, x30, [sp, #N2]
add x29, sp, #N2
=> sender sp = fp + (N1 - N2)

c)
stp Xt1, Xt2, [sp, #-N1]! ; Xt1 is not x29, Xt2 is not x30
stp x29, x30, [sp, #N2]
add x29, sp, #N2
=> sender sp = fp + (N1 - N2)

In addition, special treatment is required for two cases; you can refer to the comments in the code. To reduce the impact, the `sender sp` is deduced only when a VM error is reported. I'm not sure if this solution is acceptable as it is a bit tricky, any input is appreciated. Worth mentioning, the stack trace may still not be accurate sometimes even if this patch is applied. One of the reasons is that `os::is_first_C_frame` will check the `sender fp`.
Since `fp` is used as a general register in JIT code (when PreserveFramePointer is disabled), it usually does not hold a reasonable `fp` value in the case of `jit code -> c code`. We may consider modifying the implementation of `os::is_first_C_frame` to handle this case. [1] https://bugs.openjdk.java.net/browse/JDK-8277948 [2] https://github.com/ARM-software/abi-aa/blob/320a56971fdcba282b7001cf4b84abb4fd993131/aapcs64/aapcs64.rst#the-frame-pointer Thanks, Denghui ------------- Commit messages: - 8283488: AArch64: Improve stack trace accuracy in hs log Changes: https://git.openjdk.java.net/jdk/pull/7900/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7900&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283488 Stats: 136 lines in 4 files changed: 134 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7900.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7900/head:pull/7900 PR: https://git.openjdk.java.net/jdk/pull/7900 From thartmann at openjdk.java.net Tue Mar 22 07:41:37 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 22 Mar 2022 07:41:37 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v18] In-Reply-To: References: Message-ID: On Fri, 18 Mar 2022 20:19:08 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. >> - Test creation using new IR testing framework.
>> >> Following are the performance number of a JMH micro included with the patch >> >> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) >> >> >> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio >> -- | -- | -- | -- | -- | -- | -- | -- >> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 >> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 >> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 >> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 >> >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - 8279508: Using an explicit scratch register since rscratch1 is bound to r10 and its usage is transparent to compiler. > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 > - 8279508: Windows build failure fix. > - 8279508: Styling comments resolved. > - 8279508: Creating separate test for round double under feature check. > - 8279508: Reducing the invocation count and compile thresholds for RoundTests.java. > - 8279508: Review comments resolution. > - 8279508: Preventing domain switch-over penalty for Math.round(float) and constraining unrolling to prevent code bloating. > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 > - 8279508: Removing +LogCompilation flag. > - ... and 12 more: https://git.openjdk.java.net/jdk/compare/ff0b0927...c17440cf Sure, I'll re-run testing and report back. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From thartmann at openjdk.java.net Tue Mar 22 08:02:51 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 22 Mar 2022 08:02:51 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v6] In-Reply-To: <4f4q-PLj6psH50mRQCVLAX8bMjzF4XWzAornt_t4PNE=.4331b19d-469b-4e0d-8a1f-d1eeb5aaf9ed@github.com> References: <4f4q-PLj6psH50mRQCVLAX8bMjzF4XWzAornt_t4PNE=.4331b19d-469b-4e0d-8a1f-d1eeb5aaf9ed@github.com> Message-ID: On Mon, 14 Mar 2022 06:13:30 GMT, Pengfei Li wrote: >> ### Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop to a 1-iteration vectorized loop with vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. However, it was disabled soon after an issue >> was found. Due to insufficient maintenance in these years, multiple bugs >> have been accumulated inside. But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable >> post loop vectorization. >> >> ### Changes in this patch >> >> This patch reworks post loop vectorization. The most significant change >> is removing vector mask support in C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. >> >> Previous implementation hard-codes x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions. 
To fix incorrect codegen >> and add vector mask support for more platforms, we turn to add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This >> IR form is exactly the same to those which are used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. >> >> **I) C2 crashes with segmentation fault in strip-mined loops** >> >> Previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. >> In C2's strip mined loops, post loop is not the sibling of the main loop >> in ideal loop tree. Instead, it's the sibling of the main loop's parent. >> This patch fixed a SIGSEGV issue caused by NULL pointer when locating >> post loop from strip-mined main loop. >> >> **II) Incorrect result issues with post loop vectorization** >> >> We have also fixed five incorrect vectorization issues. Some of them are >> hidden deep and can only be reproduced with corner cases. These issues >> have a common cause that it assumes the post loop can be vectorized if >> the vectorization in corresponding main loop is successful. But in many >> cases this assumption is wrong. Below are details. >> >> - **[Issue-1] Incorrect vectorization for partial vectorizable loops** >> >> This issue can be reproduced by below loop where only some operations in >> the loop body are vectorizable. 
>> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword can work well if parts of the operations in >> loop body are not vectorizable since those parts can be unrolled only. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we are doing scalars to vectors >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of cases, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to post loop and compared with post loop >> pack count. Vectorization will be bailed out in post loop if it creates >> more vector packs than in the main loop. >> >> - **[Issue-2] Incorrect result in loops with growing-down vectors** >> >> This issue appears with growing-down vectors, that is, vectors that grow >> to smaller memory address as the loop iterates. It can be reproduced by >> below counting-up loop with negative scale value in array index. >> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> - **[Issue-3] Incorrect result in manually unrolled loops** >> >> This issue can be reproduced by below manually unrolled loop. 
>> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> - **[Issue-4] Incorrect result in loops with mixed vector element sizes** >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. >> >> - **[Issue-5] Incorrect result in loops with potential data dependence** >> >> This issue can be reproduced by below corner case on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. 
But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop has the same array index expression. >> >> ### Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's forced compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantages of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> need to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." 
and "-XX:+WhiteBoxAPI". >> >> ### Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: > > Update a few comments I tested this with and without `-XX:+PostLoopMultiversioning`. All tests passed. The changes look good to me but someone with a better understanding of the old implementation should look at this as well (@vnkozlov, @sviswanathan ?). ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6828 From aph at openjdk.java.net Tue Mar 22 09:21:36 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 22 Mar 2022 09:21:36 GMT Subject: RFR: 8283488: AArch64: Improve stack trace accuracy in hs log In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 07:23:34 GMT, Denghui Dong wrote: > I'm not sure if this solution is acceptable as it is a bit tricky, any input is appreciated. Not really, no. It's too hacky and fragile for inclusion in mainline. But there is a correct way to handle this: use libunwind, and walk the stack in a precise way. That would be best for Linux, and perhaps for some other operating systems too.
------------- PR: https://git.openjdk.java.net/jdk/pull/7900 From stuefe at openjdk.java.net Tue Mar 22 09:29:00 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 22 Mar 2022 09:29:00 GMT Subject: RFR: JDK-8283497: [windows] print TMP and TEMP in hs_err and VM.info Message-ID: <3cxtPJEHOkU4HYu3BMDCEO3rlzgcDN5nmcaTLzJ-Sw8=.e5163e03-9634-4382-bc69-85e9a7a0d9c3@github.com> Trivial change to add TMP and TEMP - important e.g. to analyze problems with jdk.attach - to the list of environment variables we print into hs-err files and jcmd VM.info. ------------- Commit messages: - add TMP and TEMP to list of environment variables Changes: https://git.openjdk.java.net/jdk/pull/7901/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7901&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283497 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7901.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7901/head:pull/7901 PR: https://git.openjdk.java.net/jdk/pull/7901 From xgong at openjdk.java.net Tue Mar 22 09:58:23 2022 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 22 Mar 2022 09:58:23 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API [v2] In-Reply-To: References: Message-ID: > The current vector `"NEG"` is implemented by subtracting the vector from zero in case the architecture does not support the negation instruction. And to fit the predicate feature for architectures that support it, the masked vector `"NEG"` is implemented with pattern `"v.not(m).add(1, m)"`. They both can be optimized to a single negation instruction for ARM SVE. > The same holds for the non-masked "NEG" on NEON. Besides, implementing the masked "NEG" with subtraction for architectures that support neither negation instruction nor predicate feature also saves several instructions compared with the current pattern.
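The equivalence behind the two patterns mentioned above (bitwise NOT plus one, and subtraction from zero) can be sketched on plain `int` lanes. This is only an illustrative sketch under stated assumptions: the class `NegationSketch` and its method names are hypothetical and are not part of the patch or of the Vector API, and a `boolean[]` stands in for a vector mask.

```java
// Hypothetical illustration only; none of these names come from the patch.
final class NegationSketch {
    // Two's-complement negation written as "flip all bits, then add one".
    static int negateViaNotAdd(int x) {
        return ~x + 1;
    }

    // The same negation written as subtraction from zero.
    static int negateViaZeroSub(int x) {
        return 0 - x;
    }

    // Lane-wise masked negation: lanes whose mask entry is false keep their value.
    static int[] negateMasked(int[] v, boolean[] m) {
        int[] r = v.clone();
        for (int i = 0; i < v.length; i++) {
            if (m[i]) {
                r[i] = 0 - v[i];
            }
        }
        return r;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 42, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            // Both forms must agree with the built-in unary minus, including on overflow.
            if (negateViaNotAdd(x) != -x || negateViaZeroSub(x) != -x) {
                throw new AssertionError("mismatch for " + x);
            }
        }
        System.out.println("all forms agree");
    }
}
```

Both forms wrap the same way for `Integer.MIN_VALUE`, which is why one can stand in for the other on integer lanes. Floating-point lanes are a separate matter because of signed zeros.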
> > To optimize the VectorAPI negation, this patch moves the implementation from Java side to hotspot. The compiler will generate different nodes according to the architecture: > - Generate the (predicated) negation node if architecture supports it, otherwise, generate "`zero.sub(v)`" pattern for non-masked operation. > - Generate `"zero.sub(v, m)"` for masked operation if the architecture does not have predicate feature, otherwise generate the original pattern `"v.xor(-1, m).add(1, m)"`. > > So with this patch, the following transformations are applied: > > For non-masked negation with NEON: > > movi v16.4s, #0x0 > sub v17.4s, v16.4s, v17.4s ==> neg v17.4s, v17.4s > > and with SVE: > > mov z16.s, #0 > sub z18.s, z16.s, z17.s ==> neg z16.s, p7/m, z16.s > > For masked negation with NEON: > > movi v17.4s, #0x1 > mvn v19.16b, v18.16b > mov v20.16b, v16.16b ==> neg v18.4s, v17.4s > bsl v20.16b, v19.16b, v18.16b bsl v19.16b, v18.16b, v17.16b > add v19.4s, v20.4s, v17.4s > mov v18.16b, v16.16b > bsl v18.16b, v19.16b, v20.16b > > and with SVE: > > mov z16.s, #-1 > mov z17.s, #1 ==> neg z16.s, p0/m, z16.s > eor z18.s, p0/m, z18.s, z16.s > add z18.s, p0/m, z18.s, z17.s > > Here are the performance gains for benchmarks (see [1][2]) on ARM and x86 machines(note that the non-masked negation benchmarks do not have any improvement on X86 since no instructions are changed): > > NEON: > Benchmark Gain > Byte128Vector.NEG 1.029 > Byte128Vector.NEGMasked 1.757 > Short128Vector.NEG 1.041 > Short128Vector.NEGMasked 1.659 > Int128Vector.NEG 1.005 > Int128Vector.NEGMasked 1.513 > Long128Vector.NEG 1.003 > Long128Vector.NEGMasked 1.878 > > SVE with 512-bits: > Benchmark Gain > ByteMaxVector.NEG 1.10 > ByteMaxVector.NEGMasked 1.165 > ShortMaxVector.NEG 1.056 > ShortMaxVector.NEGMasked 1.195 > IntMaxVector.NEG 1.002 > IntMaxVector.NEGMasked 1.239 > LongMaxVector.NEG 1.031 > LongMaxVector.NEGMasked 1.191 > > X86 (non AVX-512): > Benchmark Gain > ByteMaxVector.NEGMasked 1.254 > 
ShortMaxVector.NEGMasked 1.359 > IntMaxVector.NEGMasked 1.431 > LongMaxVector.NEGMasked 1.989 > > [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1881 > [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1896 Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Add a superclass for vector negation ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7782/files - new: https://git.openjdk.java.net/jdk/pull/7782/files/828866f8..97c8119a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7782&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7782&range=00-01 Stats: 64 lines in 4 files changed: 16 ins; 13 del; 35 mod Patch: https://git.openjdk.java.net/jdk/pull/7782.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7782/head:pull/7782 PR: https://git.openjdk.java.net/jdk/pull/7782 From xgong at openjdk.java.net Tue Mar 22 09:58:23 2022 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 22 Mar 2022 09:58:23 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API In-Reply-To: References: <-E5E_NBci6gsGyOV5nWuTUNKLVnjiw2IiWjjgv2vFz0=.ebe7c447-ede9-4437-815c-a2004f9d6ce1@github.com> Message-ID: On Sat, 19 Mar 2022 03:11:12 GMT, Jie Fu wrote: >>> Note that in terms of Java semantics, negation of floating point values needs to be implemented as subtraction from negative zero rather than positive zero: >>> >>> double negate(double arg) {return -0.0 - arg; } >>> >>> This is to handle signed zeros correctly. >> >> Hi @jddarcy ,thanks for looking at this PR and thanks for the notes on the floating point negation! Yeah, this really makes sense to me. Kindly note that this patch didn't touch the negation of the floating point values. 
For Vector API, the vector floating point negation has been intrinsified to `NegVF/D` node by compiler that we directly generate the negation instructions for them. Thanks! > >> Note that in terms of Java semantics, negation of floating point values needs to be implemented as subtraction from negative zero rather than positive zero: >> >> double negate(double arg) {return -0.0 - arg; } >> >> This is to handle signed zeros correctly. > > This seems easy to be broken by an opt enhancement. > Just wondering do we have a jtreg test for this point? @jddarcy > Thanks. Hi @DamonFool , thanks for your review! All the comments have been addressed. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From mcimadamore at openjdk.java.net Tue Mar 22 10:11:47 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 22 Mar 2022 10:11:47 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. 
> > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Address review comments * Use `new` instead of `fresh` * Drop use of `new` where caching might be used * Remove unused imports * Add static imports to make code more succinct * Fix other typos ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/8e6017dc..6bb1b5c9 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=00-01 Stats: 83 lines in 10 files changed: 2 ins; 7 del; 74 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From ddong at openjdk.java.net Tue Mar 22 11:13:35 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Tue, 22 Mar 2022 11:13:35 GMT Subject: RFR: 8283488: AArch64: Improve stack trace accuracy in hs log In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 09:17:48 GMT, Andrew Haley wrote: > > I'm not sure if this solution is acceptable as it is a bit tricky, any input is appreciated. > > Not really, no. It's too hacky and fragile for inclusion in mainline. But there is a correct way to handle this: use libunwind, and walk the stack in a precise way. That would be best for Linux, and perhaps for some other operating systems too. Thanks, I will try to use libunwind.
------------- PR: https://git.openjdk.java.net/jdk/pull/7900 From fyang at openjdk.java.net Tue Mar 22 11:50:13 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Tue, 22 Mar 2022 11:50:13 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v3] In-Reply-To: References: Message-ID: <-oQETK4V8ppli_-iya4r39Y3KnIlgZLQblw4kn5rPBQ=.7a89b832-8b1a-4135-bd4a-a2474d52966e@github.com> > This PR implements JEP 422: Linux/RISC-V Port [1]. > The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. > > This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. > > [1] https://openjdk.java.net/jeps/422 Fei Yang has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6294/files - new: https://git.openjdk.java.net/jdk/pull/6294/files/a144cd20..b7a31729 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6294&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6294&range=01-02 Stats: 44 lines in 41 files changed: 0 ins; 1 del; 43 mod Patch: https://git.openjdk.java.net/jdk/pull/6294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6294/head:pull/6294 PR: https://git.openjdk.java.net/jdk/pull/6294 From fyang at openjdk.java.net Tue Mar 22 11:50:17 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Tue, 22 Mar 2022 11:50:17 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v2] In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 03:31:16 GMT, Fei Yang wrote: >> This PR implements JEP 422: Linux/RISC-V Port [1]. >> The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. 
>> >> This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. >> >> [1] https://openjdk.java.net/jeps/422 > > Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into JDK-8276799 > - 8276799: Implementation of JEP 422: Linux/RISC-V Port > Build changes look good. I can't say anything about the rest of the code. > > /reviewers 3 Thanks again for looking at the build changes :-) ------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From fyang at openjdk.java.net Tue Mar 22 11:53:34 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Tue, 22 Mar 2022 11:53:34 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v2] In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 05:12:46 GMT, David Holmes wrote: > Hi, > > I've looked at everything that is not a RISC-V specific file, except for the C1 changes as the compiler folk will need to approve those. > > Some copyrights will need updating to 2022 on the Oracle copyright line please. Hi David, I have pushed one more commit updating the Oracle copyright line in the existing files touched. Thanks for looking at this. ------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From fyang at openjdk.java.net Tue Mar 22 12:11:36 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Tue, 22 Mar 2022 12:11:36 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v2] In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 04:13:17 GMT, David Holmes wrote: >> Fei Yang has updated the pull request with a new target base due to a merge or a rebase.
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream/master' into JDK-8276799 >> - 8276799: Implementation of JEP 422: Linux/RISC-V Port > > make/autoconf/libraries.m4 line 152: > >> 150: fi >> 151: >> 152: # Programs which use C11 or C++11 atomics, like #include <atomic>, > > Use of C++ atomics is not allowed in hotspot code base. See the style guide: > https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md > > That said, I don't see any actual use of C++ atomics. I think the old code comment here is a bit too general. It does not mean we introduce any use of C++ atomics here. Because RISC-V only has word-sized atomics, it requires libatomic where other common architectures do not [1]. So atomic support would require explicit linking against -latomic on RISC-V. Otherwise we get build errors like: ------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From dholmes at openjdk.java.net Tue Mar 22 12:58:32 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Mar 2022 12:58:32 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v3] In-Reply-To: <-oQETK4V8ppli_-iya4r39Y3KnIlgZLQblw4kn5rPBQ=.7a89b832-8b1a-4135-bd4a-a2474d52966e@github.com> References: <-oQETK4V8ppli_-iya4r39Y3KnIlgZLQblw4kn5rPBQ=.7a89b832-8b1a-4135-bd4a-a2474d52966e@github.com> Message-ID: On Tue, 22 Mar 2022 11:50:13 GMT, Fei Yang wrote: >> This PR implements JEP 422: Linux/RISC-V Port [1]. >> The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. >> >> This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs.
>> >> [1] https://openjdk.java.net/jeps/422 > > Fei Yang has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From dholmes at openjdk.java.net Tue Mar 22 12:58:32 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Mar 2022 12:58:32 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v2] In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 12:08:01 GMT, Fei Yang wrote: >> make/autoconf/libraries.m4 line 152: >> >>> 150: fi >>> 151: >>> 152: # Programs which use C11 or C++11 atomics, like #include , >> >> Use of C++ atomics is not allowed in hotspot code base. See the style guide: >> https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md >> >> That said, I don't see any actual use of C++ atomics. ?? > > I think the old code comment here is a bit too general. It does not mean we introduce any use of C++ atomics here. > The fact is that RISC-V only has word-sized atomics, it requires libatomic where other common architectures do not [1]. > So atomic support would require explicit linking against -latomic on RISC-V. Otherwise we got build errors like: New comment looks good - thanks for clarifying. ------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From zgu at openjdk.java.net Tue Mar 22 13:31:33 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 22 Mar 2022 13:31:33 GMT Subject: RFR: 8283456: Make CompiledICHolder::live_count/live_not_claimed_count debug only In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 02:55:23 GMT, David Holmes wrote: >> Please review this trivial patch to make `CompiledICHolder::live_count/live_not_claimed_count` debug only, since they are only updated/used in debug only code. > > Seems fine and trivial.
> > Thanks, > David Thanks, @dholmes-ora ------------- PR: https://git.openjdk.java.net/jdk/pull/7890 From zgu at openjdk.java.net Tue Mar 22 13:34:38 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 22 Mar 2022 13:34:38 GMT Subject: Integrated: 8283456: Make CompiledICHolder::live_count/live_not_claimed_count debug only In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 13:47:59 GMT, Zhengyu Gu wrote: > Please review this trivial patch to make `CompiledICHolder::live_count/live_not_claimed_count` debug only, since they are only updated/used in debug only code. This pull request has now been integrated. Changeset: c0f984e5 Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/c0f984e5fbba7b44fa7b0a4309896ef9ccb4e666 Stats: 8 lines in 2 files changed: 5 ins; 0 del; 3 mod 8283456: Make CompiledICHolder::live_count/live_not_claimed_count debug only Reviewed-by: dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7890 From mcimadamore at openjdk.java.net Tue Mar 22 14:04:07 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 22 Mar 2022 14:04:07 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. 
> > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with three additional commits since the last revision: - rename syslookup lib on Windows - Add missing LIBS flag - Simplify syslookup build changes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/6bb1b5c9..4b2760d3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=01-02 Stats: 28 lines in 3 files changed: 1 ins; 23 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Tue Mar 22 14:04:10 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 22 Mar 2022 14:04:10 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 17:36:53 GMT, Maurizio Cimadamore wrote: >> make/modules/java.base/Lib.gmk line 217: >> >>> 215: CXXFLAGS := $(CXXFLAGS_JDKLIB), \ >>> 216: LDFLAGS := $(LDFLAGS_JDKLIB) -Wl$(COMMA)--no-as-needed, \ >>> 217: LIBS := $(LIBCXX) -lc -lm -ldl, \ >> >> Instead of repeating this whole macro call for both Linux and non Linux, you can use parameters of the form LDFLAGS_linux and LIBS_linux to add the Linux specific flags. Something like this: >> >> >> LDFLAGS := $(LDFLAGS_JDKLIB), \ >> LDFLAGS_linux := -Wl$(COMMA)--no-as-needed, \ >> >> >> For the NAME field, there is no such support, so the way we usually do that is through a variable and conditionals before the macro call. What's the reason to have a different lib name on Windows? If they were the same, and the source file in windows/native/... had the same name, it would just automatically override in the build. 
>> >> I realize now that this is just moved code from jdk.incubator.foreign, and this patch is probably big enough as it is so no need to refactor the build logic at the same time. > > Good points - there is really no need AFAIK for the lib name to be different. I'll do few experiments. I've fixed the makefile as you suggested - I agree the result is much simpler. I've tested the changes on mac/linux/win and everything looks good. ------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From rriggs at openjdk.java.net Tue Mar 22 14:04:34 2022 From: rriggs at openjdk.java.net (Roger Riggs) Date: Tue, 22 Mar 2022 14:04:34 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v3] In-Reply-To: <-oQETK4V8ppli_-iya4r39Y3KnIlgZLQblw4kn5rPBQ=.7a89b832-8b1a-4135-bd4a-a2474d52966e@github.com> References: <-oQETK4V8ppli_-iya4r39Y3KnIlgZLQblw4kn5rPBQ=.7a89b832-8b1a-4135-bd4a-a2474d52966e@github.com> Message-ID: On Tue, 22 Mar 2022 11:50:13 GMT, Fei Yang wrote: >> This PR implements JEP 422: Linux/RISC-V Port [1]. >> The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. >> >> This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. >> >> [1] https://openjdk.java.net/jeps/422 > > Fei Yang has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments The test/jdk files look ok. (I didn't look at the rest) ------------- Marked as reviewed by rriggs (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6294 From shade at openjdk.java.net Tue Mar 22 14:41:32 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 22 Mar 2022 14:41:32 GMT Subject: RFR: 8283257: x86: Clean up invocation/branch counter updates code In-Reply-To: References: Message-ID: On Wed, 16 Mar 2022 12:07:55 GMT, Aleksey Shipilev wrote: > I looked briefly at optimizing `InterpreterMacroAssembler::increment_mask_and_jump` a bit, but it looks that current code is the best we can do. This improvement does a few related cleanups without semantic changes. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` > - [x] Eyeballing interpreter generated code Any reviews? :) ------------- PR: https://git.openjdk.java.net/jdk/pull/7838 From redestad at openjdk.java.net Tue Mar 22 15:00:35 2022 From: redestad at openjdk.java.net (Claes Redestad) Date: Tue, 22 Mar 2022 15:00:35 GMT Subject: RFR: 8283257: x86: Clean up invocation/branch counter updates code In-Reply-To: References: Message-ID: On Wed, 16 Mar 2022 12:07:55 GMT, Aleksey Shipilev wrote: > I looked briefly at optimizing `InterpreterMacroAssembler::increment_mask_and_jump` a bit, but it looks that current code is the best we can do. This improvement does a few related cleanups without semantic changes. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` > - [x] Eyeballing interpreter generated code (I thought I reviewed this, but my comments got lost to the void.) Code changes look good IMO. There is a known issue that there can be heavy contention on some profiling counters, especially in synthetic, heavily multi-threaded benchmarks (SPECjvm2008), see [JDK-8134940](https://bugs.openjdk.java.net/browse/JDK-8134940). I recall @veresov did some experiments several years ago to attempt to address that, but AFAIK nothing ever came of that. This is a tricky area where I think it'd be good if failed experiments were also documented in detail.
------------- Marked as reviewed by redestad (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7838 From erikj at openjdk.java.net Tue Mar 22 16:58:35 2022 From: erikj at openjdk.java.net (Erik Joelsson) Date: Tue, 22 Mar 2022 16:58:35 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 14:04:07 GMT, Maurizio Cimadamore wrote: >> This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. >> >> [1] - https://openjdk.java.net/jeps/424 > > Maurizio Cimadamore has updated the pull request incrementally with three additional commits since the last revision: > > - rename syslookup lib on Windows > - Add missing LIBS flag > - Simplify syslookup build changes make/modules/java.base/Lib.gmk line 217: > 215: LDFLAGS_linux := -Wl$(COMMA)--no-as-needed, \ > 216: LIBS := $(LIBCXX), \ > 217: LIBS_linux := -lc -lm -ldl, \ This looks much better, thanks! Now if you could just fix the indentation of the parameters to 4 spaces, it would be perfect. ------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From kvn at openjdk.java.net Tue Mar 22 17:07:38 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 22 Mar 2022 17:07:38 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v3] In-Reply-To: <-oQETK4V8ppli_-iya4r39Y3KnIlgZLQblw4kn5rPBQ=.7a89b832-8b1a-4135-bd4a-a2474d52966e@github.com> References: <-oQETK4V8ppli_-iya4r39Y3KnIlgZLQblw4kn5rPBQ=.7a89b832-8b1a-4135-bd4a-a2474d52966e@github.com> Message-ID: On Tue, 22 Mar 2022 11:50:13 GMT, Fei Yang wrote: >> This PR implements JEP 422: Linux/RISC-V Port [1]. >> The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. >> >> This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. 
Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. >> >> [1] https://openjdk.java.net/jeps/422 > > Fei Yang has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments I looked on C1/C2 changes and compiler tests. Seems reasonable. But before approval I would need to run changes through our testing. ------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From kvn at openjdk.java.net Tue Mar 22 17:37:36 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 22 Mar 2022 17:37:36 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v3] In-Reply-To: <-oQETK4V8ppli_-iya4r39Y3KnIlgZLQblw4kn5rPBQ=.7a89b832-8b1a-4135-bd4a-a2474d52966e@github.com> References: <-oQETK4V8ppli_-iya4r39Y3KnIlgZLQblw4kn5rPBQ=.7a89b832-8b1a-4135-bd4a-a2474d52966e@github.com> Message-ID: On Tue, 22 Mar 2022 11:50:13 GMT, Fei Yang wrote: >> This PR implements JEP 422: Linux/RISC-V Port [1]. >> The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. >> >> This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. >> >> [1] https://openjdk.java.net/jeps/422 > > Fei Yang has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments src/hotspot/cpu/riscv/disassembler_riscv.hpp line 18: > 16: * > 17: * You should have received a copy of the GNU General Public License version > 18: * 2 along with this work; if not, write to the Free Software Foundation, * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. These 2 lines merged into 1 accidentally causing failure in copyright headers verification. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From mcimadamore at openjdk.java.net Tue Mar 22 19:07:12 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 22 Mar 2022 19:07:12 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v4] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Fix indentation in Lib.gmk ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/4b2760d3..7ec71f73 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=02-03 Stats: 7 lines in 1 file changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From kvn at openjdk.java.net Tue Mar 22 19:12:33 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 22 Mar 2022 19:12:33 GMT Subject: RFR: 8283298: Make CodeCacheSegmentSize a product flag In-Reply-To: References: Message-ID: On Thu, 17 Mar 2022 08:27:25 GMT, Jie Fu wrote: > Hi all, > > As discussed in https://github.com/openjdk/jdk/pull/7830, this patch makes `CodeCacheSegmentSize` a product flag. > It also fixes two bugs when testing the release VM with CodeEntryAlignment={512, 1024}. > Please review it. > > Thanks. 
> Best regards, > Jie src/hotspot/share/interpreter/templateInterpreter.cpp line 51: > 49: NOT_PRODUCT(code_size *= 4;) // debug uses extra interpreter code space > 50: int max_aligned_codelets = 280; > 51: int max_aligned_bytes = max_aligned_codelets * CodeEntryAlignment * 2; Please explain in a comment where these numbers (280, *2) are coming from and why you need additional size. src/hotspot/share/prims/methodHandles.cpp line 93: > 91: TraceTime timer("MethodHandles adapters generation", TRACETIME_LOG(Info, startuptime)); > 92: int adapter_num = (int)Interpreter::method_handle_invoke_LAST - (int)Interpreter::method_handle_invoke_FIRST + 1; > 93: int max_aligned_bytes = adapter_num * CodeEntryAlignment; Add a comment that we need additional bytes due to alignment. ------------- PR: https://git.openjdk.java.net/jdk/pull/7851 From aph-open at littlepinkcloud.com Tue Mar 22 19:57:57 2022 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Tue, 22 Mar 2022 19:57:57 +0000 Subject: RFR: 8283488: AArch64: Improve stack trace accuracy in hs log In-Reply-To: References: Message-ID: On 3/22/22 11:13, Denghui Dong wrote: >> Not really, no. It's too hacky and fragile for inclusion in mainline. But there is a correct way to handle this: use libunwind, and walk the stack in a precise way. That would be best for Linux, and perhaps for some other operating systems too. > Thanks, I will try to use libunwind. > > ------------- > > PR:https://git.openjdk.java.net/jdk/pull/7900 I'm pretty sure that libunwind will work from a technical point of view, but there may be complex issues to do with libunwind availability, licensing, and so on. Please bring this to discussion early, maybe even before you have a complete working solution.
From kvn at openjdk.java.net Tue Mar 22 20:19:34 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 22 Mar 2022 20:19:34 GMT Subject: RFR: 8283257: x86: Clean up invocation/branch counter updates code In-Reply-To: References: Message-ID: <5FiKAkwji5FB5DKP6kzwZXtBxs_rL06lXIZREnuCy84=.3d5e6015-e2f9-48b6-9687-3b80cd0b0cb8@github.com> On Wed, 16 Mar 2022 12:07:55 GMT, Aleksey Shipilev wrote: > I looked briefly at optimizing `InterpreterMacroAssembler::increment_mask_and_jump` a bit, but it looks that current code is the best we can do. This improvement does a few related cleanups without semantic changes. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` > - [x] Eyeballing interpreter generated code Good refactoring. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7838 From erikj at openjdk.java.net Tue Mar 22 20:46:31 2022 From: erikj at openjdk.java.net (Erik Joelsson) Date: Tue, 22 Mar 2022 20:46:31 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v4] In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 19:07:12 GMT, Maurizio Cimadamore wrote: >> This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. >> >> [1] - https://openjdk.java.net/jeps/424 > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Fix indentation in Lib.gmk Marked as reviewed by erikj (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From ysuenaga at openjdk.java.net Wed Mar 23 01:01:32 2022 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 23 Mar 2022 01:01:32 GMT Subject: RFR: JDK-8283497: [windows] print TMP and TEMP in hs_err and VM.info In-Reply-To: <3cxtPJEHOkU4HYu3BMDCEO3rlzgcDN5nmcaTLzJ-Sw8=.e5163e03-9634-4382-bc69-85e9a7a0d9c3@github.com> References: <3cxtPJEHOkU4HYu3BMDCEO3rlzgcDN5nmcaTLzJ-Sw8=.e5163e03-9634-4382-bc69-85e9a7a0d9c3@github.com> Message-ID: On Tue, 22 Mar 2022 09:17:18 GMT, Thomas Stuefe wrote: > Trivial change to add TMP and TEMP - important e.g. to analyze problems with jdk.attach - to the list of environment variables we print into hs-err files and jcmd VM.info. Looks good. I think this change is trivial. ------------- Marked as reviewed by ysuenaga (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7901 From fyang at openjdk.java.net Wed Mar 23 02:03:26 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Wed, 23 Mar 2022 02:03:26 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v4] In-Reply-To: References: Message-ID: > This PR implements JEP 422: Linux/RISC-V Port [1]. > The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. > > This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. 
> > [1] https://openjdk.java.net/jeps/422 Fei Yang has updated the pull request incrementally with one additional commit since the last revision: Fix copyright header ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6294/files - new: https://git.openjdk.java.net/jdk/pull/6294/files/b7a31729..d8bef7fa Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6294&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6294&range=02-03 Stats: 28 lines in 13 files changed: 14 ins; 0 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/6294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6294/head:pull/6294 PR: https://git.openjdk.java.net/jdk/pull/6294 From fyang at openjdk.java.net Wed Mar 23 02:03:27 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Wed, 23 Mar 2022 02:03:27 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v3] In-Reply-To: References: <-oQETK4V8ppli_-iya4r39Y3KnIlgZLQblw4kn5rPBQ=.7a89b832-8b1a-4135-bd4a-a2474d52966e@github.com> Message-ID: On Tue, 22 Mar 2022 14:01:28 GMT, Roger Riggs wrote: > The test/jdk files look ok. (I didn't look at the rest) Thank you for looking at that part. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From fyang at openjdk.java.net Wed Mar 23 02:03:27 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Wed, 23 Mar 2022 02:03:27 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v3] In-Reply-To: References: <-oQETK4V8ppli_-iya4r39Y3KnIlgZLQblw4kn5rPBQ=.7a89b832-8b1a-4135-bd4a-a2474d52966e@github.com> Message-ID: On Tue, 22 Mar 2022 17:34:18 GMT, Vladimir Kozlov wrote: >> Fei Yang has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comments > > src/hotspot/cpu/riscv/disassembler_riscv.hpp line 18: > >> 16: * >> 17: * You should have received a copy of the GNU General Public License version >> 18: * 2 along with this work; if not, write to the Free Software Foundation, * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. > > These 2 lines merged into 1 accidentally causing failure in copyright headers verification. > I looked on C1/C2 changes and compiler tests. Seems reasonable. But before approval I would need to run changes through our testing. That's great to hear :-) Thanks for the efforts. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From fyang at openjdk.java.net Wed Mar 23 02:03:28 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Wed, 23 Mar 2022 02:03:28 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v3] In-Reply-To: References: <-oQETK4V8ppli_-iya4r39Y3KnIlgZLQblw4kn5rPBQ=.7a89b832-8b1a-4135-bd4a-a2474d52966e@github.com> Message-ID: <3ImDU5mN1i84E9dzISTXAcGY6JrlzjHUjP7iTaaDgoo=.e1500563-8449-42ac-aa38-964f2995bc3d@github.com> On Wed, 23 Mar 2022 01:57:25 GMT, Fei Yang wrote: >> src/hotspot/cpu/riscv/disassembler_riscv.hpp line 18: >> >>> 16: * >>> 17: * You should have received a copy of the GNU General Public License version >>> 18: * 2 along with this work; if not, write to the Free Software Foundation, * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. >> >> These 2 lines merged into 1 accidentally causing failure in copyright headers verification. > >> I looked on C1/C2 changes and compiler tests. Seems reasonable. But before approval I would need to run changes through our testing. > > That's great to hear :-) Thanks for the efforts. I have fixed the copyright headers verification problem. Please take another look. ------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From jiefu at openjdk.java.net Wed Mar 23 02:08:01 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 23 Mar 2022 02:08:01 GMT Subject: RFR: 8283298: Make CodeCacheSegmentSize a product flag [v2] In-Reply-To: References: Message-ID: > Hi all, > > As discussed in https://github.com/openjdk/jdk/pull/7830, this patch makes `CodeCacheSegmentSize` a product flag. > It also fixes two bugs when testing the release VM with CodeEntryAlignment={512, 1024}. > Please review it. > > Thanks. 
> Best regards, > Jie Jie Fu has updated the pull request incrementally with one additional commit since the last revision: Add comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7851/files - new: https://git.openjdk.java.net/jdk/pull/7851/files/c84c659c..af65cbaa Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7851&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7851&range=00-01 Stats: 4 lines in 2 files changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7851.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7851/head:pull/7851 PR: https://git.openjdk.java.net/jdk/pull/7851 From jiefu at openjdk.java.net Wed Mar 23 02:08:03 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 23 Mar 2022 02:08:03 GMT Subject: RFR: 8283298: Make CodeCacheSegmentSize a product flag In-Reply-To: References: Message-ID: On Thu, 17 Mar 2022 08:27:25 GMT, Jie Fu wrote: > Hi all, > > As discussed in https://github.com/openjdk/jdk/pull/7830, this patch makes `CodeCacheSegmentSize` a product flag. > It also fixes two bugs when testing the release VM with CodeEntryAlignment={512, 1024}. > Please review it. > > Thanks. > Best regards, > Jie Thanks @vnkozlov for the review. The comments had been added in the code. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7851 From kvn at openjdk.java.net Wed Mar 23 02:20:28 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 23 Mar 2022 02:20:28 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v4] In-Reply-To: References: Message-ID: On Wed, 23 Mar 2022 02:03:26 GMT, Fei Yang wrote: >> This PR implements JEP 422: Linux/RISC-V Port [1]. >> The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. >> >> This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. 
So it should be good enough to run most Java programs. >> >> [1] https://openjdk.java.net/jeps/422 > > Fei Yang has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright header Update looks good. Testing results are also good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6294 From dholmes at openjdk.java.net Wed Mar 23 02:33:38 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 23 Mar 2022 02:33:38 GMT Subject: RFR: JDK-8283497: [windows] print TMP and TEMP in hs_err and VM.info In-Reply-To: <3cxtPJEHOkU4HYu3BMDCEO3rlzgcDN5nmcaTLzJ-Sw8=.e5163e03-9634-4382-bc69-85e9a7a0d9c3@github.com> References: <3cxtPJEHOkU4HYu3BMDCEO3rlzgcDN5nmcaTLzJ-Sw8=.e5163e03-9634-4382-bc69-85e9a7a0d9c3@github.com> Message-ID: On Tue, 22 Mar 2022 09:17:18 GMT, Thomas Stuefe wrote: > Trivial change to add TMP and TEMP - important e.g. to analyze problems with jdk.attach - to the list of environment variables we print into hs-err files and jcmd VM.info. Seems reasonable. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7901 From ddong at openjdk.java.net Wed Mar 23 02:52:33 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Wed, 23 Mar 2022 02:52:33 GMT Subject: RFR: 8283488: AArch64: Improve stack trace accuracy in hs log In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 19:59:48 GMT, Andrew Haley wrote: > Please bring this to discussion early, maybe even before you have a complete working solution. Sure. I found that there was a previous discussion on a similar issue that also mentioned libunwind, but it seems to have made no further progress.
https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-April/041315.html ------------- PR: https://git.openjdk.java.net/jdk/pull/7900 From kvn at openjdk.java.net Wed Mar 23 03:17:33 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 23 Mar 2022 03:17:33 GMT Subject: RFR: 8283298: Make CodeCacheSegmentSize a product flag [v2] In-Reply-To: References: Message-ID: <0sM4bO2czxZu80M_Ks_NKJ3qCBKJp7uiXeaOro6D-hU=.710e3a46-e37b-4a1b-a7db-c6afb7f12e50@github.com> On Wed, 23 Mar 2022 02:08:01 GMT, Jie Fu wrote: >> Hi all, >> >> As discussed in https://github.com/openjdk/jdk/pull/7830, this patch makes `CodeCacheSegmentSize` a product flag. >> It also fixes two bugs when testing the release VM with CodeEntryAlignment={512, 1024}. >> Please review it. >> >> Thanks. >> Best regards, >> Jie > > Jie Fu has updated the pull request incrementally with one additional commit since the last revision: > > Add comments Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7851 From jiefu at openjdk.java.net Wed Mar 23 05:11:03 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 23 Mar 2022 05:11:03 GMT Subject: RFR: 8283298: Make CodeCacheSegmentSize a product flag [v3] In-Reply-To: References: Message-ID: <0aNuE9jmQfF_CFKPGuzXD3ffxqdheX26U75JW9nE9aI=.e56eb4ce-b953-4fb2-8f4b-c70703feb1f7@github.com> > Hi all, > > As discussed in https://github.com/openjdk/jdk/pull/7830, this patch makes `CodeCacheSegmentSize` a product flag. > It also fixes two bugs when testing the release VM with CodeEntryAlignment={512, 1024}. > Please review it. > > Thanks. 
> Best regards, > Jie Jie Fu has updated the pull request incrementally with one additional commit since the last revision: Fix the comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7851/files - new: https://git.openjdk.java.net/jdk/pull/7851/files/af65cbaa..4d6c11e6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7851&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7851&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7851.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7851/head:pull/7851 PR: https://git.openjdk.java.net/jdk/pull/7851 From jiefu at openjdk.java.net Wed Mar 23 05:11:04 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 23 Mar 2022 05:11:04 GMT Subject: RFR: 8283298: Make CodeCacheSegmentSize a product flag [v2] In-Reply-To: <0sM4bO2czxZu80M_Ks_NKJ3qCBKJp7uiXeaOro6D-hU=.710e3a46-e37b-4a1b-a7db-c6afb7f12e50@github.com> References: <0sM4bO2czxZu80M_Ks_NKJ3qCBKJp7uiXeaOro6D-hU=.710e3a46-e37b-4a1b-a7db-c6afb7f12e50@github.com> Message-ID: On Wed, 23 Mar 2022 03:14:27 GMT, Vladimir Kozlov wrote: > Good. Thanks @vnkozlov . I pushed one more commit to fix a comment typo. ------------- PR: https://git.openjdk.java.net/jdk/pull/7851 From stuefe at openjdk.java.net Wed Mar 23 06:10:35 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 23 Mar 2022 06:10:35 GMT Subject: RFR: JDK-8283497: [windows] print TMP and TEMP in hs_err and VM.info In-Reply-To: References: <3cxtPJEHOkU4HYu3BMDCEO3rlzgcDN5nmcaTLzJ-Sw8=.e5163e03-9634-4382-bc69-85e9a7a0d9c3@github.com> Message-ID: On Wed, 23 Mar 2022 00:58:09 GMT, Yasumasa Suenaga wrote: >> Trivial change to add TMP and TEMP - important e.g. to analyze problems with jdk.attach - to the list of environment variables we print into hs-err files and jcmd VM.info. > > Looks good. I think this change is trivial. Thanks @YaSuenag and @dholmes-ora ! 
------------- PR: https://git.openjdk.java.net/jdk/pull/7901 From stuefe at openjdk.java.net Wed Mar 23 06:10:36 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 23 Mar 2022 06:10:36 GMT Subject: Integrated: JDK-8283497: [windows] print TMP and TEMP in hs_err and VM.info In-Reply-To: <3cxtPJEHOkU4HYu3BMDCEO3rlzgcDN5nmcaTLzJ-Sw8=.e5163e03-9634-4382-bc69-85e9a7a0d9c3@github.com> References: <3cxtPJEHOkU4HYu3BMDCEO3rlzgcDN5nmcaTLzJ-Sw8=.e5163e03-9634-4382-bc69-85e9a7a0d9c3@github.com> Message-ID: On Tue, 22 Mar 2022 09:17:18 GMT, Thomas Stuefe wrote: > Trivial change to add TMP and TEMP - important e.g. to analyze problems with jdk.attach - to the list of environment variables we print into hs-err files and jcmd VM.info. This pull request has now been integrated. Changeset: b035fda4 Author: Thomas Stuefe URL: https://git.openjdk.java.net/jdk/commit/b035fda459284fa130bf936743a8579a6888160b Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8283497: [windows] print TMP and TEMP in hs_err and VM.info Reviewed-by: ysuenaga, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7901 From shade at openjdk.java.net Wed Mar 23 06:34:34 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 23 Mar 2022 06:34:34 GMT Subject: RFR: 8283257: x86: Clean up invocation/branch counter updates code In-Reply-To: References: Message-ID: On Wed, 16 Mar 2022 12:07:55 GMT, Aleksey Shipilev wrote: > I looked briefly at optimizing `InterpreterMacroAssembler::increment_mask_and_jump` a bit, but it looks that current code is the best we can do. This improvement does a few related cleanups without semantic changes. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` > - [x] Eyeballing interpreter generated code Thanks for reviews! 
------------- PR: https://git.openjdk.java.net/jdk/pull/7838 From shade at openjdk.java.net Wed Mar 23 06:34:35 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 23 Mar 2022 06:34:35 GMT Subject: Integrated: 8283257: x86: Clean up invocation/branch counter updates code In-Reply-To: References: Message-ID: On Wed, 16 Mar 2022 12:07:55 GMT, Aleksey Shipilev wrote: > I looked briefly at optimizing `InterpreterMacroAssembler::increment_mask_and_jump` a bit, but it looks that current code is the best we can do. This improvement does a few related cleanups without semantic changes. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` > - [x] Eyeballing interpreter generated code This pull request has now been integrated. Changeset: 82e1a1cf Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/82e1a1cf8bafddfa2ecf11c2ce88ed4eaa091757 Stats: 22 lines in 4 files changed: 0 ins; 6 del; 16 mod 8283257: x86: Clean up invocation/branch counter updates code Reviewed-by: redestad, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/7838 From thartmann at openjdk.java.net Wed Mar 23 06:59:36 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Wed, 23 Mar 2022 06:59:36 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v18] In-Reply-To: References: Message-ID: On Fri, 18 Mar 2022 20:19:08 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. >> - Test creation using new IR testing framework. 
>> >> Following are the performance number of a JMH micro included with the patch >> >> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) >> >> >> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio >> -- | -- | -- | -- | -- | -- | -- | -- >> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07 >> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93 >> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02 >> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43 >> >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - 8279508: Using an explicit scratch register since rscratch1 is bound to r10 and its usage is transparent to compiler. > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 > - 8279508: Windows build failure fix. > - 8279508: Styling comments resolved. > - 8279508: Creating separate test for round double under feature check. > - 8279508: Reducing the invocation count and compile thresholds for RoundTests.java. > - 8279508: Review comments resolution. > - 8279508: Preventing domain switch-over penalty for Math.round(float) and constraining unrolling to prevent code bloating. > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 > - 8279508: Removing +LogCompilation flag. > - ... and 12 more: https://git.openjdk.java.net/jdk/compare/ff0b0927...c17440cf All tests passed. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From shade at openjdk.java.net Wed Mar 23 07:27:33 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 23 Mar 2022 07:27:33 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v4] In-Reply-To: References: Message-ID: <15WaiDnekvPCVSDawjOcu92JSwMFrdpn-bOEycOwIYc=.a69bf14a-082a-46c5-8a72-56d15c2d4142@github.com> On Wed, 23 Mar 2022 02:03:26 GMT, Fei Yang wrote: >> This PR implements JEP 422: Linux/RISC-V Port [1]. >> The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. >> >> This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. >> >> [1] https://openjdk.java.net/jeps/422 > > Fei Yang has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright header Looks okay to me. ------------- Marked as reviewed by shade (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6294 From aturbanov at openjdk.java.net Wed Mar 23 07:31:23 2022 From: aturbanov at openjdk.java.net (Andrey Turbanov) Date: Wed, 23 Mar 2022 07:31:23 GMT Subject: RFR: 8283426: Fix 'exeption' typo [v3] In-Reply-To: References: Message-ID: > Fix repeated typo `exeption` Andrey Turbanov has updated the pull request incrementally with one additional commit since the last revision: 8283426: Fix 'exeption' typo fix more typos, found by Sean Coffey ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7879/files - new: https://git.openjdk.java.net/jdk/pull/7879/files/4c1e68ed..1baca5ea Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7879&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7879&range=01-02 Stats: 14 lines in 7 files changed: 0 ins; 0 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/7879.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7879/head:pull/7879 PR: https://git.openjdk.java.net/jdk/pull/7879 From jiefu at openjdk.java.net Wed Mar 23 08:51:36 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 23 Mar 2022 08:51:36 GMT Subject: Integrated: 8283298: Make CodeCacheSegmentSize a product flag In-Reply-To: References: Message-ID: On Thu, 17 Mar 2022 08:27:25 GMT, Jie Fu wrote: > Hi all, > > As discussed in https://github.com/openjdk/jdk/pull/7830, this patch makes `CodeCacheSegmentSize` a product flag. > It also fixes two bugs when testing the release VM with CodeEntryAlignment={512, 1024}. > Please review it. > > Thanks. > Best regards, > Jie This pull request has now been integrated. 
Changeset: 026b8530 Author: Jie Fu URL: https://git.openjdk.java.net/jdk/commit/026b85303c01326bc49a1105a89853d7641fcd50 Stats: 14 lines in 4 files changed: 8 ins; 1 del; 5 mod 8283298: Make CodeCacheSegmentSize a product flag Reviewed-by: dlong, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/7851 From tschatzl at openjdk.java.net Wed Mar 23 08:59:47 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 23 Mar 2022 08:59:47 GMT Subject: RFR: 8283494: Factor out calculation of actual number of XMM registers Message-ID: Hi all, can I have reviews for this change that factors out the calculation of the number of XMM registers actually available on a given processor with the given command line options into a method, and reuses that method as much as possible? This relates to this code snippet: int xmm_bypass_limit = FrameMap::nof_xmm_regs; #ifdef _LP64 if (UseAVX < 3) { xmm_bypass_limit = xmm_bypass_limit / 2; } #endif Also, there is already the method `FrameMap::get_num_caller_save_xmms()`, which has been updated to use the new method `XMMRegisterImpl::available_xmm_registers()`; further, I tried to use `FrameMap::get_num_caller_save_xmms` appropriately in the places where either would work. Please take a particular look at that. I did not change variable names that look strange to me, like `xmm_bypass_limit` above, as they probably make sense to somebody, given how often they are used. This also fixes a compilation error when building without C1, introduced with [JDK-8283327](https://bugs.openjdk.java.net/browse/JDK-8283327). Testing: tier1-5 (all but linux-aarch64 done), gha, local compilation with `configure --with-features=-compiler1`.
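The factoring described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the free function `available_xmm_registers` and its parameters `nof_xmm_regs`, `use_avx`, and `is_lp64` are hypothetical stand-ins for `FrameMap::nof_xmm_regs`, the `UseAVX` flag, and the `_LP64` build macro. Only the halving rule for `UseAVX < 3` on 64-bit builds is taken from the snippet in the message.

```cpp
#include <cassert>

// Hypothetical helper: one place that answers "how many XMM registers can
// actually be used?". All inputs are passed in so the sketch is self-contained;
// in HotSpot they would come from globals and build macros.
inline int available_xmm_registers(int nof_xmm_regs, int use_avx, bool is_lp64) {
  assert(nof_xmm_regs > 0);
  int limit = nof_xmm_regs;
  // Mirrors the quoted snippet: on 64-bit builds, only half of the register
  // file is used unless UseAVX is at least 3.
  if (is_lp64 && use_avx < 3) {
    limit = limit / 2;
  }
  return limit;
}
```

Call sites that previously repeated the `#ifdef _LP64` block would then ask this one function instead, which is presumably what keeps the copies from drifting apart.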
Thanks, Thomas ------------- Commit messages: - Further fixes - Initial version Changes: https://git.openjdk.java.net/jdk/pull/7917/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7917&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283494 Stats: 69 lines in 8 files changed: 18 ins; 37 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/7917.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7917/head:pull/7917 PR: https://git.openjdk.java.net/jdk/pull/7917 From aph at openjdk.java.net Wed Mar 23 09:49:37 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 23 Mar 2022 09:49:37 GMT Subject: RFR: 8283488: AArch64: Improve stack trace accuracy in hs log In-Reply-To: References: Message-ID: On Wed, 23 Mar 2022 02:49:18 GMT, Denghui Dong wrote: > > Please bring this to discussion early, maybe even before you have a complete working solution. > > Sure. I found that there was a previous discussion on a similar issue that also mentioned libunwind, but there seems to have been no further progress. > > https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-April/041315.html Thanks! I have never seen that before. I'd have a good look at GCC's unwinder library, because if it's suitable we won't need another library. I'd rather not add new dependencies. ------------- PR: https://git.openjdk.java.net/jdk/pull/7900 From aivanov at openjdk.java.net Wed Mar 23 10:15:36 2022 From: aivanov at openjdk.java.net (Alexey Ivanov) Date: Wed, 23 Mar 2022 10:15:36 GMT Subject: RFR: 8283426: Fix 'exeption' typo [v3] In-Reply-To: References: Message-ID: On Wed, 23 Mar 2022 07:31:23 GMT, Andrey Turbanov wrote: >> Fix repeated typo `exeption` > > Andrey Turbanov has updated the pull request incrementally with one additional commit since the last revision: > > 8283426: Fix 'exeption' typo > fix more typos, found by Sean Coffey Marked as reviewed by aivanov (Reviewer).
test/jdk/javax/sql/testng/test/rowset/serial/SerialJavaObjectTests.java line 46: > 44: > 45: /* > 46: * Validate that an SerialException is thrown when the object specified Suggestion: * Validate that a SerialException is thrown when the object specified ------------- PR: https://git.openjdk.java.net/jdk/pull/7879 From mcimadamore at openjdk.java.net Wed Mar 23 14:06:56 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Wed, 23 Mar 2022 14:06:56 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v5] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Drop redundant javadoc statements re. handling of nulls (handling of nulls is specified once and for all in the package javadoc) ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/7ec71f73..c9bc9a70 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=03-04 Stats: 12 lines in 2 files changed: 3 ins; 8 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From kvn at openjdk.java.net Wed Mar 23 15:41:28 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 23 Mar 2022 15:41:28 GMT Subject: RFR: 8282668: HotSpot Style Guide should permit unrestricted unions In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 18:39:33 GMT, Kim Barrett wrote: > Please review this change to permit the use of "unrestricted unions" > 
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf) in HotSpot > code. > > This permits any non-reference type to be used as a union data member, as well > as permitting static data members in named unions. There are various classes > in HotSpot that might be able to take advantage of this new feature. > > An example is the aarch64-specific Address class. It presently contains a > collection of data members. For any given instance, only some of these data > members are initialized and used. The `_mode` member indicates which. So it's > effectively a kind of discriminated union with the data unpacked and not > overlapping, with `_mode` being the discriminant. A consequence of the current > implementation is that some compilers may generate warnings under some > circumstances because of uninitialized data members. (I ran into this problem > with gcc when making an otherwise unrelated change to one of the member > types.) This Address class could be made smaller (so cheaper to copy, which > happens often as Address objects are frequently passed by value) and usage > made clearer, by making it an actual union. But that isn't possible with the > C++03 restrictions. > > Another example is the RelocationHolder class, which is effectively a union > over the various concrete Relocation types, but implemented in a way that > has some issues (JDK-8160404). > > Testing: > I've tried some examples without running into any problems. This included > some experiments with RelocationHolder for JDK-8160404. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 18-Mar-2022 at 12h00 UTC.
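The discriminated-union idea described in the proposal above can be sketched as follows. This is only an illustration under stated assumptions: `Reg`, `Addr`, and `BaseOffset` are hypothetical types invented for the sketch, not the real aarch64 `Address` class. The relevant point is that `Reg` and `BaseOffset` have user-provided constructors, which the C++03 rules would not allow in a union member but C++11 unrestricted unions do.

```cpp
#include <cassert>

// Hypothetical register wrapper. Its user-provided constructor makes it
// ineligible as a union member under the C++03 rules.
struct Reg {
  int encoding;
  explicit Reg(int e) : encoding(e) {}
};

// Hypothetical address type: _mode is the discriminant that says which
// union member is live.
class Addr {
 public:
  enum Mode { base_plus_offset, literal };

  Addr(Reg base, long offset)
      : _mode(base_plus_offset), _base_offset(base, offset) {}
  explicit Addr(const void* target) : _mode(literal), _target(target) {}

  Mode mode() const { return _mode; }

  // Accessors check the discriminant before touching a union member.
  int base_encoding() const {
    assert(_mode == base_plus_offset);
    return _base_offset.base.encoding;
  }
  long offset() const {
    assert(_mode == base_plus_offset);
    return _base_offset.offset;
  }
  const void* target() const {
    assert(_mode == literal);
    return _target;
  }

 private:
  struct BaseOffset {
    Reg base;
    long offset;
    BaseOffset(Reg b, long o) : base(b), offset(o) {}
  };

  Mode _mode;
  union {  // anonymous union: the members overlap, and only one is initialized
    BaseOffset _base_offset;
    const void* _target;
  };
};
```

Because the alternatives overlap, the object is no larger than its biggest alternative plus the discriminant, and each constructor initializes exactly the member it uses, so there are no uninitialized data members for a compiler to warn about.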
> > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7704 From stefank at openjdk.java.net Wed Mar 23 15:43:00 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 23 Mar 2022 15:43:00 GMT Subject: RFR: 8283574: Use Klass::_id for type checks in the C++ code Message-ID: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> We have at least three ways to check the type of a given Klass*: 1) Using the Klass::_id field 2) Using the layout helper 3) Using the InstanceKlass::_kind field The Klass::_id field was something that was added when we rewrote the oop_oop_iterate dispatch mechanism, but the other mechanisms were left in place. The current Loom code uses both (2) and (3) every time the code checks if an object is of type InstanceStackChunkKlass. In the Loom repository I intend to reduce that check to be a single test against the (1) field. To keep the code unified, and simpler, I changed all C++ Klass type checks to use (1). I propose that we upstream this change to the mainline, to slightly reduce the Loom diff.
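A single-field type check of the kind proposed above might look roughly like this. The sketch is hypothetical: `MiniKlass`, the `KlassID` enumerators, and their order are assumptions made for illustration and are not HotSpot's actual definitions. It only shows how one tag, compared directly or by range, can replace separate layout-helper and `_kind` tests.

```cpp
#include <cassert>

// Hypothetical, simplified tag; instance kinds are listed first so that a
// single range compare separates instances from arrays (an assumption of
// this sketch).
enum KlassID {
  InstanceKlassID,
  InstanceRefKlassID,
  InstanceStackChunkKlassID,
  TypeArrayKlassID,
  ObjArrayKlassID
};

// Miniature stand-in for Klass: every type check is one compare on _id.
class MiniKlass {
 public:
  explicit MiniKlass(KlassID id) : _id(id) {}

  bool is_instance_klass() const { return _id <= InstanceStackChunkKlassID; }
  bool is_stack_chunk_instance_klass() const { return _id == InstanceStackChunkKlassID; }
  bool is_array_klass() const { return _id >= TypeArrayKlassID; }
  bool is_objArray_klass() const { return _id == ObjArrayKlassID; }

 private:
  const KlassID _id;
};

// Checked "cast" in the usual debug-assert style: callers test the kind first.
inline const MiniKlass* as_array_klass(const MiniKlass* k) {
  assert(k->is_array_klass());
  return k;
}
```

The range compares depend on the enumerator order, so in a design like this the order becomes part of the contract and would need a comment or a static check next to the enum.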
------------- Commit messages: - 8283574: Use Klass::_id for type checks in the C++ code Changes: https://git.openjdk.java.net/jdk/pull/7922/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7922&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283574 Stats: 69 lines in 11 files changed: 4 ins; 29 del; 36 mod Patch: https://git.openjdk.java.net/jdk/pull/7922.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7922/head:pull/7922 PR: https://git.openjdk.java.net/jdk/pull/7922 From stefank at openjdk.java.net Wed Mar 23 15:43:00 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 23 Mar 2022 15:43:00 GMT Subject: RFR: 8283574: Use Klass::_id for type checks in the C++ code In-Reply-To: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> References: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> Message-ID: On Wed, 23 Mar 2022 14:42:06 GMT, Stefan Karlsson wrote: > We have at least three ways to check the type of a given Klass*: > 1) Using the Klass::_id field > 2) Using the layout helper > 3) Using the InstanceKlass::_kind field > > The Klass::_id field was something that was added when we rewrote the oop_oop_iterate dispatch mechanism, but the other mechanisms were left in place. > > The current Loom code uses both (2) and (3) every time the code checks if an object is of type InstanceStackChunkKlass. In the Loom repository I intend to reduce that check to be a single test against the (1) field. To keep the code unified, and simpler, I changed all C++ Klass type checks to use (1). > > I propose that we upstream this change to the mainline, to slightly reduce the Loom diff.
Currently running through tier1-3 ------------- PR: https://git.openjdk.java.net/jdk/pull/7922 From tschatzl at openjdk.java.net Wed Mar 23 16:03:28 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 23 Mar 2022 16:03:28 GMT Subject: RFR: 8283574: Use Klass::_id for type checks in the C++ code In-Reply-To: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> References: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> Message-ID: On Wed, 23 Mar 2022 14:42:06 GMT, Stefan Karlsson wrote: > We have at least three ways to check the type of a given Klass*: > 1) Using the Klass::_id field > 2) Using the layout helper > 3) Using the InstanceKlass::_kind field > > The Klass::_id field was something that was added when we rewrote the oop_oop_iterate dispatch mechanism, but the other mechanisms were left in place. > > The current Loom code uses both (2) and (3) every time the code checks if an object is of type InstanceStackChunkKlass. In the Loom repository I intend to reduce that check to be a single test against the (1) field. To keep the code unified, and simpler, I changed all C++ Klass type checks to use (1). > > I propose that we upstream this change to the mainline, to slightly reduce the Loom diff. Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7922 From kvn at openjdk.java.net Wed Mar 23 16:10:28 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 23 Mar 2022 16:10:28 GMT Subject: RFR: 8263134: HotSpot Style Guide should disallow inheriting constructors In-Reply-To: <0ydC7yufiVJTFlvJU6SXM5Gq5vTGdo2FPCJ4XOXpF5U=.2f611e00-b3d4-4ccb-9658-6eaa1d6cae5d@github.com> References: <0ydC7yufiVJTFlvJU6SXM5Gq5vTGdo2FPCJ4XOXpF5U=.2f611e00-b3d4-4ccb-9658-6eaa1d6cae5d@github.com> Message-ID: On Fri, 4 Mar 2022 15:04:47 GMT, Kim Barrett wrote: > Please review this change to explicitly disallow the use of inheriting > constructors: > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2540.htm). > > The C++11/14 specification has a lot of problems. These were addressed in > C++17 (and as a DR that affects C++11/14): > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0136r1.html). > > Use of inheriting constructors now runs the risk of encountering those bugs, > inconsistent behavior between different compilers or compiler versions, and > behavior changes for future support of C++17. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Approved. ------------- Marked as reviewed by kvn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7698 From jrose at openjdk.java.net Wed Mar 23 16:56:25 2022 From: jrose at openjdk.java.net (John R Rose) Date: Wed, 23 Mar 2022 16:56:25 GMT Subject: RFR: 8283574: Use Klass::_id for type checks in the C++ code In-Reply-To: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> References: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> Message-ID: On Wed, 23 Mar 2022 14:42:06 GMT, Stefan Karlsson wrote: > We have at least three ways to check the type of a given Klass*: > 1) Using the Klass::_id field > 2) Using the layout helper > 3) Using the InstanceKlass::_kind field > > The Klass::_id field was something that was added when we rewrote the oop_oop_iterate dispatch mechanism, but the other mechanisms were left in place. > > The current Loom code uses both (2) and (3) every time the code checks if an object is of type InstanceStackChunkKlass. In the Loom repository I intend to reduce that check to be a single test against the (1) field. To keep the code unified, and simpler, I changed all C++ Klass type checks to use (1). > > I propose that we upstream this change to the mainline, to slightly reduce the Loom diff. Consolidating two ad hoc tags, from kind+id to id, is good. Using the value of id to gate the klass-subtype checks is good. There's something bad here, which unfortunately came in before but is now getting greater prominence: The word itself, "id". That (a) conveys very little, leaving the user to guess a lot, and (b) the understandable guess will be wrong. The "ID" of a class, if it is anything one might predict, is not its "kind" but rather its identity, as in `System.identityHashCode`. This is the use of ID elsewhere in the source base, and raising the profile of Klass::id in this way creates conceptual conflicts. For more regular uses of the term "ID" please see `vmSymbols.hpp`, `vmIntrinsics.hpp`, `vmClasses.hpp`.
In all of these cases, the thing we call an ID is an enum member which exactly identifies the identity of a particular well-known name. I request that, as you remove `kind` (which was the more informative name, though it was less functional), you also rename `id`/`ID` to something along the lines of `kind_id`/`KindID` or even drop "id" and say just `klass_kind` or even `kind` . I realize this makes the change set larger, but this vague and misleading word "id" should not be getting more entrenched in this central API. ------------- PR: https://git.openjdk.java.net/jdk/pull/7922 From amenkov at openjdk.java.net Wed Mar 23 18:35:34 2022 From: amenkov at openjdk.java.net (Alex Menkov) Date: Wed, 23 Mar 2022 18:35:34 GMT Subject: Integrated: 8282241: Invalid generic signature for redefined classes In-Reply-To: References: Message-ID: On Thu, 3 Mar 2022 15:07:05 GMT, Alex Menkov wrote: > JDK-8238048 (fixed in jdk15) moved major_version, minor_version, generic_signature_index and source_file_name_index from InstanceKlass to ConstantPool. > We still have some incorrect code in CP merge during class redefinition. > > rewrite_cp_refs(scratch_class) updates generic_signature_index and source_file_name_index in the scratch_cp, so we need to copy the attributes (merge_cp->copy_fields(scratch_cp())) after rewrite_cp_refs. > > In redefine_single_class we don't need to copy source_file_name_index because it's a CP property and we swap CPs. So this copying actually sets the value from old class. > > tested: > - test/jdk/java/lang/instrument > - test/hotspot/jtreg/serviceability/jvmti/RedefineClasses > - test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses > - test/hotspot/jtreg/vmTestbase/nsk/jvmti/RetransformClasses This pull request has now been integrated. 
Changeset: f0177395 Author: Alex Menkov URL: https://git.openjdk.java.net/jdk/commit/f01773956fbc092b00c18392735a020ca05257ed Stats: 212 lines in 2 files changed: 202 ins; 7 del; 3 mod 8282241: Invalid generic signature for redefined classes Reviewed-by: coleenp, sspitsyn ------------- PR: https://git.openjdk.java.net/jdk/pull/7676 From stefank at openjdk.java.net Wed Mar 23 19:13:28 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 23 Mar 2022 19:13:28 GMT Subject: RFR: 8283574: Use Klass::_id for type checks in the C++ code In-Reply-To: References: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> Message-ID: On Wed, 23 Mar 2022 16:53:35 GMT, John R Rose wrote: >> We have at least three ways to check the type of a given Klass*: >> 1) Using the Klass::_id field >> 2) Using the layout helper >> 3) Using the InstanceKlass::_kind field >> >> The Klass::_id field was something that was added when we rewrote the oop_oop_iterate dispatch mechanism, but the other mechanisms were left in place. >> >> The current Loom code uses both (2) and (3) every time the code checks if an object is of type InstanceStackChunkKlass. In the Loom repository I intend to reduce that check to be a single test against the (1) field. To keep the code unified, and simpler, I changed all C++ Klass type checks to use (1). >> >> I propose that we upstream this change to the mainline, to slightly reduce the Loom diff. > > Consolidating two ad hoc tags, from kind+id to id, is good. Using the value of id to gate the klass-subtype checks is good. > > There's something bad here, which unfortunately came in before but is now getting greater prominence: The word itself, "id". That (a) conveys very little, leaving the user to guess a lot, and (b) the understandable guess will be wrong. The "ID" of a class, if it is anything one might predict, is not its "kind" but rather its identity, as in `System.identityHashCode`.
This is the use of ID elsewhere in the source base, and raising the profile of Klass::id in this way creates conceptual conflicts. For more regular uses of the term "ID" please see `vmSymbols.hpp`, `vmIntrinsics.hpp`, `vmClasses.hpp`. In all of these cases, the thing we call an ID is an enum member which exactly identifies the identity of a particular well-known name. > > I request that, as you remove `kind` (which was the more informative name, though it was less functional), you also rename `id`/`ID` to something along the lines of `kind_id`/`KindID` or even drop "id" and say just `klass_kind` or even `kind` . I realize this makes the change set larger, but this vague and misleading word "id" should not be getting more entrenched in this central API. @rose00 Sure. Let's rename the field and type to _kind/KlassKind instead. I'll do that as a separate patch. ------------- PR: https://git.openjdk.java.net/jdk/pull/7922 From aturbanov at openjdk.java.net Wed Mar 23 19:41:58 2022 From: aturbanov at openjdk.java.net (Andrey Turbanov) Date: Wed, 23 Mar 2022 19:41:58 GMT Subject: RFR: 8283426: Fix 'exeption' typo [v4] In-Reply-To: References: Message-ID: <48NxGvP43iL5AnO2fxDF1vmpTZ6RxcQp4x6oaHxkEOQ=.ccd35bda-80d8-40da-8a89-bb004adc2ed2@github.com> > Fix repeated typo `exeption` Andrey Turbanov has updated the pull request incrementally with one additional commit since the last revision: 8283426: Fix 'exeption' typo Co-authored-by: Alexey Ivanov <70774172+aivanov-jdk at users.noreply.github.com> ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7879/files - new: https://git.openjdk.java.net/jdk/pull/7879/files/1baca5ea..2ee3c98a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7879&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7879&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7879.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7879/head:pull/7879 
PR: https://git.openjdk.java.net/jdk/pull/7879 From kbarrett at openjdk.java.net Wed Mar 23 20:12:03 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 23 Mar 2022 20:12:03 GMT Subject: RFR: 8283574: Use Klass::_id for type checks in the C++ code In-Reply-To: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> References: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> Message-ID: <8AWvwH-rVyRFTrCFuVQcgciW9QvIsHl0UPTRnLMnKUY=.c65800c0-61d6-496c-a8b9-aa84eb0ff90a@github.com> On Wed, 23 Mar 2022 14:42:06 GMT, Stefan Karlsson wrote: > We have at least three ways to check the type of a given Klass*: > 1) Using the Klass::_id field > 2) Using the layout helper > 3) Using the InstanceKlass::_kind field > > The Klass::_id field was something that was added when we rewrote the oop_oop_iterate dispatch mechanism, but the other mechanisms were left in place. > > The current Loom code uses both (2) and (3) every time the code checks if an object is of type InstanceStackChunkKlass. In the Loom repository I intend to reduce that check to be a single test against the (1) field. To keep the code unified, and simpler, I changed all C++ Klass type checks to use (1). > > I propose that we upstream this change to the mainline, to slightly reduce the Loom diff. Mostly looks good, subject to later nomenclature change as suggested by @rose00 . A couple of comments that you can act on or not. src/hotspot/share/oops/instanceKlass.hpp line 139: > 137: > 138: protected: > 139: InstanceKlass(const ClassFileParser& parser, KlassID id = ID); I think I would prefer that KlassID was required. I don't know how much fanout that might have though. That preference is despite making construction of a concrete InstanceKlass different from the others. That's an artifact of InstanceKlass being overloaded as both a base class and a leaf class, a pattern that seems to often lead to trouble.
src/hotspot/share/oops/oop.cpp line 141: > 139: bool oopDesc::is_array_noinline() const { return is_array(); } > 140: bool oopDesc::is_objArray_noinline() const { return is_objArray(); } > 141: bool oopDesc::is_typeArray_noinline() const { return is_typeArray(); } Is there a reason for this change that I'm not spotting? Oh, yes, there is. Buried in the whitespace formatting changes is the addition of `is_instanceRef_noinline`. Fortunately, I found github's "turn off whitespace differences" button. But I really dislike this kind of formatting, and esp. reformatting when combined with other changes. I think the formatting should have been left alone here and in similar places elsewhere. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7922 From amenkov at openjdk.java.net Thu Mar 24 01:39:09 2022 From: amenkov at openjdk.java.net (Alex Menkov) Date: Thu, 24 Mar 2022 01:39:09 GMT Subject: RFR: 8283587: [BACKOUT] Invalid generic signature for redefined classes Message-ID: The change reverts the fix for JDK-8282241, which causes a regression ------------- Commit messages: - Revert "8282241: Invalid generic signature for redefined classes" Changes: https://git.openjdk.java.net/jdk/pull/7934/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7934&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283587 Stats: 212 lines in 2 files changed: 7 ins; 202 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/7934.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7934/head:pull/7934 PR: https://git.openjdk.java.net/jdk/pull/7934 From xgong at openjdk.java.net Thu Mar 24 01:53:49 2022 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Thu, 24 Mar 2022 01:53:49 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API [v2] In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 09:58:23 GMT, Xiaohong Gong wrote: >> The current vector `"NEG"` is implemented by subtracting the vector from zero in
case the architecture does not support the negation instruction. To fit the predicate feature on architectures that support it, the masked vector `"NEG"` is implemented with the pattern `"v.not(m).add(1, m)"`. Both can be optimized to a single negation instruction on ARM SVE. >> The same holds for the non-masked "NEG" on NEON. Besides, implementing the masked "NEG" with subtraction on architectures that support neither the negation instruction nor the predicate feature also saves several instructions compared with the current pattern. >> >> To optimize the VectorAPI negation, this patch moves the implementation from the Java side to HotSpot. The compiler will generate different nodes according to the architecture: >> - Generate the (predicated) negation node if the architecture supports it; otherwise, generate the "`zero.sub(v)`" pattern for the non-masked operation. >> - Generate `"zero.sub(v, m)"` for the masked operation if the architecture does not have the predicate feature; otherwise generate the original pattern `"v.xor(-1, m).add(1, m)"`.
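The two patterns named above compute the same value because of the two's-complement identity -v = (v ^ -1) + 1. A minimal scalar sketch of a single 32-bit lane follows; the helper names `neg_lane` and `neg_lane_masked` are hypothetical and belong to neither the Vector API nor HotSpot.

```cpp
#include <cassert>
#include <cstdint>

// Negate one 32-bit lane. Unsigned arithmetic keeps wraparound well defined.
// Hypothetical helper for illustration only.
inline std::uint32_t neg_lane(std::uint32_t v) {
  const std::uint32_t by_sub = 0u - v;                      // zero.sub(v)
  const std::uint32_t by_xor_add = (v ^ 0xFFFFFFFFu) + 1u;  // v.xor(-1).add(1)
  assert(by_sub == by_xor_add);  // both formulations agree for every input
  return by_sub;
}

// Masked form: lanes whose mask bit is clear keep their original value.
inline std::uint32_t neg_lane_masked(std::uint32_t v, bool mask) {
  return mask ? neg_lane(v) : v;
}
```

Since the results are identical lane by lane, the choice between the patterns is purely about instruction count on a given architecture, which is the trade-off the message describes.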
>>
>> So with this patch, the following transformations are applied:
>>
>> For non-masked negation with NEON:
>>
>>     movi v16.4s, #0x0
>>     sub v17.4s, v16.4s, v17.4s   ==>   neg v17.4s, v17.4s
>>
>> and with SVE:
>>
>>     mov z16.s, #0
>>     sub z18.s, z16.s, z17.s      ==>   neg z16.s, p7/m, z16.s
>>
>> For masked negation with NEON:
>>
>>     movi v17.4s, #0x1
>>     mvn v19.16b, v18.16b
>>     mov v20.16b, v16.16b            ==>   neg v18.4s, v17.4s
>>     bsl v20.16b, v19.16b, v18.16b         bsl v19.16b, v18.16b, v17.16b
>>     add v19.4s, v20.4s, v17.4s
>>     mov v18.16b, v16.16b
>>     bsl v18.16b, v19.16b, v20.16b
>>
>> and with SVE:
>>
>>     mov z16.s, #-1
>>     mov z17.s, #1                   ==>   neg z16.s, p0/m, z16.s
>>     eor z18.s, p0/m, z18.s, z16.s
>>     add z18.s, p0/m, z18.s, z17.s
>>
>> Here are the performance gains for benchmarks (see [1][2]) on ARM and x86 machines (note that the non-masked negation benchmarks do not have any improvement on X86 since no instructions are changed):
>>
>> NEON:
>> Benchmark                   Gain
>> Byte128Vector.NEG           1.029
>> Byte128Vector.NEGMasked     1.757
>> Short128Vector.NEG          1.041
>> Short128Vector.NEGMasked    1.659
>> Int128Vector.NEG            1.005
>> Int128Vector.NEGMasked      1.513
>> Long128Vector.NEG           1.003
>> Long128Vector.NEGMasked     1.878
>>
>> SVE with 512-bits:
>> Benchmark                   Gain
>> ByteMaxVector.NEG           1.10
>> ByteMaxVector.NEGMasked     1.165
>> ShortMaxVector.NEG          1.056
>> ShortMaxVector.NEGMasked    1.195
>> IntMaxVector.NEG            1.002
>> IntMaxVector.NEGMasked      1.239
>> LongMaxVector.NEG           1.031
>> LongMaxVector.NEGMasked     1.191
>>
>> X86 (non AVX-512):
>> Benchmark                   Gain
>> ByteMaxVector.NEGMasked     1.254
>> ShortMaxVector.NEGMasked    1.359
>> IntMaxVector.NEGMasked      1.431
>> LongMaxVector.NEGMasked     1.989
>>
>> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1881
>> [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1896
>
> Xiaohong Gong has updated the pull
request incrementally with one additional commit since the last revision: > > Add a superclass for vector negation Hi @PaulSandoz @jatin-bhateja, could you please help to take a look at this PR? Thanks so much! ------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From lmesnik at openjdk.java.net Thu Mar 24 02:06:47 2022 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Thu, 24 Mar 2022 02:06:47 GMT Subject: RFR: 8283587: [BACKOUT] Invalid generic signature for redefined classes In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 01:31:42 GMT, Alex Menkov wrote: > The change reverts the fix for JDK-8282241 which causes regression Although I think it is a test problem, backout is fine. ------------- Marked as reviewed by lmesnik (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7934 From dcubed at openjdk.java.net Thu Mar 24 02:41:53 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 24 Mar 2022 02:41:53 GMT Subject: RFR: 8283587: [BACKOUT] Invalid generic signature for redefined classes In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 01:31:42 GMT, Alex Menkov wrote: > The change reverts the fix for JDK-8282241 which causes regression Thumbs up. Looks like a clean [BACKOUT] so this is a trivial fix. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7934 From sspitsyn at openjdk.java.net Thu Mar 24 03:18:47 2022 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Thu, 24 Mar 2022 03:18:47 GMT Subject: RFR: 8283587: [BACKOUT] Invalid generic signature for redefined classes In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 01:31:42 GMT, Alex Menkov wrote: > The change reverts the fix for JDK-8282241 which causes regression Hi Alex, The back out is clean. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7934 From amenkov at openjdk.java.net Thu Mar 24 04:44:44 2022 From: amenkov at openjdk.java.net (Alex Menkov) Date: Thu, 24 Mar 2022 04:44:44 GMT Subject: Integrated: 8283587: [BACKOUT] Invalid generic signature for redefined classes In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 01:31:42 GMT, Alex Menkov wrote: > The change reverts the fix for JDK-8282241 which causes regression This pull request has now been integrated. Changeset: 5cf580e0 Author: Alex Menkov URL: https://git.openjdk.java.net/jdk/commit/5cf580e0fb57245c43c9c719b9b03baa323f2245 Stats: 212 lines in 2 files changed: 7 ins; 202 del; 3 mod 8283587: [BACKOUT] Invalid generic signature for redefined classes Reviewed-by: lmesnik, dcubed, sspitsyn ------------- PR: https://git.openjdk.java.net/jdk/pull/7934 From stefank at openjdk.java.net Thu Mar 24 06:00:45 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 24 Mar 2022 06:00:45 GMT Subject: RFR: 8283574: Use Klass::_id for type checks in the C++ code In-Reply-To: <8AWvwH-rVyRFTrCFuVQcgciW9QvIsHl0UPTRnLMnKUY=.c65800c0-61d6-496c-a8b9-aa84eb0ff90a@github.com> References: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> <8AWvwH-rVyRFTrCFuVQcgciW9QvIsHl0UPTRnLMnKUY=.c65800c0-61d6-496c-a8b9-aa84eb0ff90a@github.com> Message-ID: <-ncZVZBhYGm7XNB4N2GsMgYmv35zh8rt4QlUWgKku9Y=.5f847fd6-6e9f-4d74-becf-8b0d11801e8a@github.com> On Wed, 23 Mar 2022 19:30:35 GMT, Kim Barrett wrote: >> We have at least three ways to check the type of a given Klass*: >> 1) Using the Klass::_id field >> 2) Using the layout helper >> 3) Using the InstanceKlass::_kind field >> >> The Klass::_id field was something that was added when we rewrote the oop_oop_iterate dispatch mechanism, but the other mechanisms where left in place. >> >> The current Loom code uses both (2) and (3) every time a the code checks if an object is of type InstanceStackChunkKlass. 
In the Loom repository I intend to reduce that check to be a single test against the (1) field. To keep the code unified, and simpler, I changed all C++ Klass type checks to use (1). >> >> I propose that we upstream this change to the mainline, to slightly reduce the Loom diff. > > src/hotspot/share/oops/instanceKlass.hpp line 139: > >> 137: >> 138: protected: >> 139: InstanceKlass(const ClassFileParser& parser, KlassID id = ID); > > I think I would prefer that KlassID was required. I don't know how much fanout that might have though. That preference is despite making construction of a concrete InstanceKlass different from the others. That's an artifact of InstanceKlass being overloaded as both a base class and a leaf class, a pattern that seems to often lead to trouble. I'll take a look at this when converting KlassID to KlassKind. ------------- PR: https://git.openjdk.java.net/jdk/pull/7922 From stefank at openjdk.java.net Thu Mar 24 06:06:51 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 24 Mar 2022 06:06:51 GMT Subject: Integrated: 8283574: Use Klass::_id for type checks in the C++ code In-Reply-To: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> References: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> Message-ID: On Wed, 23 Mar 2022 14:42:06 GMT, Stefan Karlsson wrote: > We have at least three ways to check the type of a given Klass*: > 1) Using the Klass::_id field > 2) Using the layout helper > 3) Using the InstanceKlass::_kind field > > The Klass::_id field was something that was added when we rewrote the oop_oop_iterate dispatch mechanism, but the other mechanisms where left in place. > > The current Loom code uses both (2) and (3) every time a the code checks if an object is of type InstanceStackChunkKlass. In the Loom repository I intend to reduce that check to be a single test against the (1) field. 
To keep the code unified, and simpler, I changed all C++ Klass type checks to use (1). > > I propose that we upstream this change to the mainline, to slightly reduce the Loom diff. This pull request has now been integrated. Changeset: af18b111 Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/af18b1111a7382a366d26ea1646282bdfb4ac495 Stats: 69 lines in 11 files changed: 4 ins; 29 del; 36 mod 8283574: Use Klass::_id for type checks in the C++ code Reviewed-by: tschatzl, kbarrett ------------- PR: https://git.openjdk.java.net/jdk/pull/7922 From stefank at openjdk.java.net Thu Mar 24 06:22:45 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 24 Mar 2022 06:22:45 GMT Subject: RFR: 8283574: Use Klass::_id for type checks in the C++ code In-Reply-To: <-ncZVZBhYGm7XNB4N2GsMgYmv35zh8rt4QlUWgKku9Y=.5f847fd6-6e9f-4d74-becf-8b0d11801e8a@github.com> References: <-pGSHfjtuVGJfjQOz_3U2IjoEy9cXTkLaNEHJ22NTm0=.6463d674-9d51-4ac2-881a-05b120a994fb@github.com> <8AWvwH-rVyRFTrCFuVQcgciW9QvIsHl0UPTRnLMnKUY=.c65800c0-61d6-496c-a8b9-aa84eb0ff90a@github.com> <-ncZVZBhYGm7XNB4N2GsMgYmv35zh8rt4QlUWgKku9Y=.5f847fd6-6e9f-4d74-becf-8b0d11801e8a@github.com> Message-ID: <4-DgXOWj5NJnt0-SGzKefn0Ej-O-1FAwlfUarKL2Y0g=.78e7b955-d4a0-4fd2-8705-9a75efa6aedc@github.com> On Thu, 24 Mar 2022 05:56:18 GMT, Stefan Karlsson wrote: >> src/hotspot/share/oops/instanceKlass.hpp line 139: >> >>> 137: >>> 138: protected: >>> 139: InstanceKlass(const ClassFileParser& parser, KlassID id = ID); >> >> I think I would prefer that KlassID was required. I don't know how much fanout that might have though. That preference is despite making construction of a concrete InstanceKlass different from the others. That's an artifact of InstanceKlass being overloaded as both a base class and a leaf class, a pattern that seems to often lead to trouble. > > I'll take a look at this when converting KlassID to KlassKind. I took a look. Making that changes makes allocate_instance_klass look worse. 
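To make the trade-off in this exchange concrete, here is a minimal sketch of kind-based type checks with a defaulted constructor argument. It is a simplified, hypothetical model: the class names echo HotSpot's, but the enum values, their ordering, and the helper methods are assumptions made for illustration and are not the code under review.

```cpp
#include <cstdint>

// Hypothetical model of a per-class "kind" tag. The ordering (instance kinds
// first, array kinds last) is an assumption of this sketch.
enum class KlassKind : uint8_t {
  Instance,
  InstanceRef,
  InstanceMirror,
  InstanceClassLoader,
  TypeArray,
  ObjArray
};

class Klass {
  const KlassKind _kind;

 protected:
  explicit Klass(KlassKind kind) : _kind(kind) {}

 public:
  KlassKind kind() const { return _kind; }

  // Each type check is a single comparison against the stored kind.
  bool is_instance_klass() const { return _kind <= KlassKind::InstanceClassLoader; }
  bool is_reference_instance_klass() const { return _kind == KlassKind::InstanceRef; }
  bool is_array_klass() const { return _kind >= KlassKind::TypeArray; }
};

// InstanceKlass acts as both a leaf class and a base class, so its
// constructor takes the kind with a default for the leaf case.
class InstanceKlass : public Klass {
 public:
  static constexpr KlassKind Kind = KlassKind::Instance;
  explicit InstanceKlass(KlassKind kind = Kind) : Klass(kind) {}
};

// A subclass always passes its own kind explicitly.
class InstanceRefKlass : public InstanceKlass {
 public:
  static constexpr KlassKind Kind = KlassKind::InstanceRef;
  InstanceRefKlass() : InstanceKlass(Kind) {}
};
```

In this model, dropping the default argument would force every site that creates a plain `InstanceKlass` to spell out the kind, which is the fanout being weighed in the review comment.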
------------- PR: https://git.openjdk.java.net/jdk/pull/7922 From stefank at openjdk.java.net Thu Mar 24 06:43:09 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 24 Mar 2022 06:43:09 GMT Subject: RFR: 8283607: Rename KlassID to KlassKind Message-ID: During the review of JDK-8283574 / #7922 , it was brought up that the term ID to describe the "kind" of *Klass is misleading. Rename KlassID to KlassKind, and update names of variables of KlassID. ------------- Commit messages: - 8283607: Rename KlassID to KlassKind Changes: https://git.openjdk.java.net/jdk/pull/7936/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7936&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283607 Stats: 62 lines in 14 files changed: 2 ins; 0 del; 60 mod Patch: https://git.openjdk.java.net/jdk/pull/7936.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7936/head:pull/7936 PR: https://git.openjdk.java.net/jdk/pull/7936 From dholmes at openjdk.java.net Thu Mar 24 06:57:49 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 24 Mar 2022 06:57:49 GMT Subject: RFR: 8283607: Rename KlassID to KlassKind In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 06:37:21 GMT, Stefan Karlsson wrote: > During the review of JDK-8283574 / #7922 , it was brought up that the term ID to describe the "kind" of *Klass is misleading. Rename KlassID to KlassKind, and update names of variables of KlassID. Seems fine. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7936 From fyang at openjdk.java.net Thu Mar 24 07:01:43 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Thu, 24 Mar 2022 07:01:43 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v5] In-Reply-To: References: Message-ID: > This PR implements JEP 422: Linux/RISC-V Port [1]. > The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. 
> > This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. > > [1] https://openjdk.java.net/jeps/422 Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into JDK-8276799 - Fix copyright header - Address review comments - Merge remote-tracking branch 'upstream/master' into JDK-8276799 - 8276799: Implementation of JEP 422: Linux/RISC-V Port ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6294/files - new: https://git.openjdk.java.net/jdk/pull/6294/files/d8bef7fa..90db70eb Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6294&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6294&range=03-04 Stats: 3082 lines in 147 files changed: 1635 ins; 374 del; 1073 mod Patch: https://git.openjdk.java.net/jdk/pull/6294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6294/head:pull/6294 PR: https://git.openjdk.java.net/jdk/pull/6294 From fyang at openjdk.java.net Thu Mar 24 07:01:44 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Thu, 24 Mar 2022 07:01:44 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v4] In-Reply-To: References: Message-ID: On Wed, 23 Mar 2022 02:17:22 GMT, Vladimir Kozlov wrote: >> Fei Yang has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix copyright header > > Update looks good. > Testing results are also good. 
@vnkozlov @shipilev : Thanks for reviewing this :-) ------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From duke at openjdk.java.net Thu Mar 24 07:05:12 2022 From: duke at openjdk.java.net (KIRIYAMA Takuya) Date: Thu, 24 Mar 2022 07:05:12 GMT Subject: RFR: 8280761: UseCompressedOops should be set after limit_heap_by_allocatable_memory Message-ID: I fixed to set UseCompressedOops flag after limit_heap_by_allocatable_memory(). So when ulimit -v is called and -XX:MaxRAM is set, UseCompressedOops does not become false. And all hotspot tier1 test are passed. Would you please review this fix? ------------- Commit messages: - 8280761: UseCompressedOops should be set after limit_heap_by_allocatable_mem - 8280761: UseCompressedOops should be set after limit_heap_by_allocatable_memor Changes: https://git.openjdk.java.net/jdk/pull/7938/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7938&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280761 Stats: 127 lines in 2 files changed: 111 ins; 16 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7938.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7938/head:pull/7938 PR: https://git.openjdk.java.net/jdk/pull/7938 From tschatzl at openjdk.java.net Thu Mar 24 09:01:47 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 24 Mar 2022 09:01:47 GMT Subject: RFR: 8283607: Rename KlassID to KlassKind In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 06:37:21 GMT, Stefan Karlsson wrote: > During the review of JDK-8283574 / #7922 , it was brought up that the term ID to describe the "kind" of *Klass is misleading. Rename KlassID to KlassKind, and update names of variables of KlassID. Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7936 From eliu at openjdk.java.net Thu Mar 24 09:21:20 2022 From: eliu at openjdk.java.net (Eric Liu) Date: Thu, 24 Mar 2022 09:21:20 GMT Subject: RFR: 8282528: AArch64: Incorrect replicate2L_zero rule Message-ID: This patch fixes the wrong matching rule of replicate2L_zero. It was matched "ReplicateI" by mistake so that long immediates(not only zero) had to be moved to register first and matched to replicate2L finally. To fix this trivial bug, this patch fixes the typo and extends the rule of replicate2L_zero to replicate2L_imm, which now supports all possible long immediate values. The final code changes are shown as below: replicate2L_imm: mov x13, #0xff movk x13, #0xff, lsl #16 movk x13, #0xff, lsl #32 dup v16.2d, x13 => movi v16.2d, #0xff00ff00ff [Test] test/jdk/jdk/incubator/vector, test/hotspot/jtreg/compiler/vectorapi passed without failure. ------------- Commit messages: - 8282528: AArch64: Incorrect replicate2L_zero rule Changes: https://git.openjdk.java.net/jdk/pull/7939/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7939&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282528 Stats: 408 lines in 5 files changed: 350 ins; 7 del; 51 mod Patch: https://git.openjdk.java.net/jdk/pull/7939.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7939/head:pull/7939 PR: https://git.openjdk.java.net/jdk/pull/7939 From fyang at openjdk.java.net Thu Mar 24 09:26:53 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Thu, 24 Mar 2022 09:26:53 GMT Subject: Integrated: 8276799: Implementation of JEP 422: Linux/RISC-V Port In-Reply-To: References: Message-ID: <_xdzQDldAoyV0WBzaIMiT4tJWdNzOsP5JO-QuRhm2Z4=.da6e619a-2948-4886-878a-6feaa09e48c4@github.com> On Mon, 8 Nov 2021 11:17:47 GMT, Fei Yang wrote: > This PR implements JEP 422: Linux/RISC-V Port [1]. > The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. 
> > This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. > > [1] https://openjdk.java.net/jeps/422 This pull request has now been integrated. Changeset: 5905b02c Author: Fei Yang URL: https://git.openjdk.java.net/jdk/commit/5905b02c0e2643ae8d097562f181953f6c88fc89 Stats: 59141 lines in 188 files changed: 58964 ins; 54 del; 123 mod 8276799: Implementation of JEP 422: Linux/RISC-V Port Co-authored-by: Yadong Wang Co-authored-by: Yanhong Zhu Co-authored-by: Feilong Jiang Co-authored-by: Kun Wang Co-authored-by: Zhuxuan Ni Co-authored-by: Taiping Guo Co-authored-by: Kang He Co-authored-by: Aleksey Shipilev Co-authored-by: Xiaolin Zheng Co-authored-by: Kuai Wei Co-authored-by: Magnus Ihse Bursie Reviewed-by: ihse, dholmes, rriggs, kvn, shade ------------- PR: https://git.openjdk.java.net/jdk/pull/6294 From aph at openjdk.java.net Thu Mar 24 09:45:52 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 24 Mar 2022 09:45:52 GMT Subject: RFR: 8282528: AArch64: Incorrect replicate2L_zero rule In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 09:14:16 GMT, Eric Liu wrote: > This patch fixes the wrong matching rule of replicate2L_zero. It was > matched "ReplicateI" by mistake so that long immediates(not only zero) > had to be moved to register first and matched to replicate2L finally. To > fix this trivial bug, this patch fixes the typo and extends the rule of > replicate2L_zero to replicate2L_imm, which now supports all possible > long immediate values. > > The final code changes are shown as below: > > replicate2L_imm: > > mov x13, #0xff > movk x13, #0xff, lsl #16 > movk x13, #0xff, lsl #32 > dup v16.2d, x13 > > => > > movi v16.2d, #0xff00ff00ff > > [Test] > test/jdk/jdk/incubator/vector, test/hotspot/jtreg/compiler/vectorapi > passed without failure. 
src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1363: > 1361: tmp = tmp >> 8; > 1362: } > 1363: This logic should be in a separate function. ------------- PR: https://git.openjdk.java.net/jdk/pull/7939 From aph at openjdk.java.net Thu Mar 24 09:50:49 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 24 Mar 2022 09:50:49 GMT Subject: RFR: 8282528: AArch64: Incorrect replicate2L_zero rule In-Reply-To: References: Message-ID: <9oqsZeXhOEno92kEBr2zl_HvIV_JbEAynYx_dLVSP2g=.6d37ba20-d769-44b6-8a67-7e28c78afa8a@github.com> On Thu, 24 Mar 2022 09:42:32 GMT, Andrew Haley wrote: >> This patch fixes the wrong matching rule of replicate2L_zero. It was >> matched "ReplicateI" by mistake so that long immediates(not only zero) >> had to be moved to register first and matched to replicate2L finally. To >> fix this trivial bug, this patch fixes the typo and extends the rule of >> replicate2L_zero to replicate2L_imm, which now supports all possible >> long immediate values. >> >> The final code changes are shown as below: >> >> replicate2L_imm: >> >> mov x13, #0xff >> movk x13, #0xff, lsl #16 >> movk x13, #0xff, lsl #32 >> dup v16.2d, x13 >> >> => >> >> movi v16.2d, #0xff00ff00ff >> >> [Test] >> test/jdk/jdk/incubator/vector, test/hotspot/jtreg/compiler/vectorapi >> passed without failure. > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1363: > >> 1361: tmp = tmp >> 8; >> 1362: } >> 1363: > > This logic should be in a separate function. I think we need a `can_encode(imm, arrangement)` function. 
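As a rough illustration of the kind of check being asked for here, the sketch below tests whether a 64-bit immediate fits the 64-bit `MOVI` form, where every byte of the value must be either 0x00 or 0xff (as in the `#0xff00ff00ff` example from the PR description). The function name `can_encode_movi_64` and its `imm8` output parameter are assumptions made for this sketch; it is not the code from the patch, and it covers only the 2D case rather than a general `can_encode(imm, arrangement)`.

```cpp
#include <cstdint>

// Hypothetical helper, not HotSpot code: returns true when every byte of
// `imm` is 0x00 or 0xff, which is what the 64-bit MOVI immediate can express.
// On success, *imm8 holds one bit per byte: bit i is set when byte i is 0xff.
static bool can_encode_movi_64(uint64_t imm, uint8_t* imm8) {
  uint8_t bits = 0;
  for (int i = 0; i < 8; i++) {
    uint8_t byte = static_cast<uint8_t>(imm >> (8 * i));
    if (byte == 0xff) {
      bits |= static_cast<uint8_t>(1u << i);
    } else if (byte != 0x00) {
      return false;  // a mixed byte cannot be expressed by this form
    }
  }
  *imm8 = bits;
  return true;
}
```

With this check, `0xff00ff00ff` is accepted (bytes 0, 2 and 4 are 0xff, so `imm8` is 0x15), while a value such as `0x1234` is rejected and would still need to go through a general-purpose register and `dup`.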
-------------

PR: https://git.openjdk.java.net/jdk/pull/7939

From aph at openjdk.java.net Thu Mar 24 09:50:49 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Thu, 24 Mar 2022 09:50:49 GMT
Subject: RFR: 8282528: AArch64: Incorrect replicate2L_zero rule
In-Reply-To: <9oqsZeXhOEno92kEBr2zl_HvIV_JbEAynYx_dLVSP2g=.6d37ba20-d769-44b6-8a67-7e28c78afa8a@github.com>
References: <9oqsZeXhOEno92kEBr2zl_HvIV_JbEAynYx_dLVSP2g=.6d37ba20-d769-44b6-8a67-7e28c78afa8a@github.com>
Message-ID:

On Thu, 24 Mar 2022 09:46:12 GMT, Andrew Haley wrote:

>> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1363:
>>
>>> 1361: tmp = tmp >> 8;
>>> 1362: }
>>> 1363:
>>
>> This logic should be in a separate function.
>
> I think we need a `can_encode(imm, arrangement)` function.

And then another function that actually does the arranging, and the generation of instructions calls those functions.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7939

From eliu at openjdk.java.net Thu Mar 24 12:16:47 2022
From: eliu at openjdk.java.net (Eric Liu)
Date: Thu, 24 Mar 2022 12:16:47 GMT
Subject: RFR: 8282528: AArch64: Incorrect replicate2L_zero rule
In-Reply-To:
References: <9oqsZeXhOEno92kEBr2zl_HvIV_JbEAynYx_dLVSP2g=.6d37ba20-d769-44b6-8a67-7e28c78afa8a@github.com>
Message-ID:

On Thu, 24 Mar 2022 09:47:06 GMT, Andrew Haley wrote:

>> I think we need a `can_encode(imm, arrangement)` function.
>
> And then another function that actually does the arranging, and the generation of instructions calls those functions.

Thanks for your review. I agree that a `can_encode(imm, arrangement)` function is better. My concern is that this JBS issue is just a bug fix for the replicate2L_imm backend. For the other SIMD_Arrangement values, I found that there are other choices for the code generation, but I didn't touch them in this patch to keep it clear and small. I list two examples below.

Example1:

    movi v16.4s, #0x34
    orr v16.4s, #0x12, lsl #8

vs

    mov w8, #0x1234
    dup v16.4s, w8

Example2:

    movi v16.4s, #0x78
    orr v16.4s, #0x56, lsl #8
    orr v16.4s, #0x34, lsl #16
    orr v16.4s, #0x12, lsl #24

vs

    mov w14, #0x5678
    movk w14, #0x1234, lsl #16
    dup v16.4s, w14

I'm considering measuring the performance and refining the mov macro assembler if necessary. `can_encode` can also be done as part of that refinement. What do you think?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7939

From stefank at openjdk.java.net Thu Mar 24 13:12:28 2022
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Thu, 24 Mar 2022 13:12:28 GMT
Subject: RFR: 8283607: Rename KlassID to KlassKind [v2]
In-Reply-To:
References:
Message-ID:

> During the review of JDK-8283574 / #7922 , it was brought up that the term ID to describe the "kind" of *Klass is misleading. Rename KlassID to KlassKind, and update names of variables of KlassID.

Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:

Fix missing rename bug

-------------

Changes:
- all: https://git.openjdk.java.net/jdk/pull/7936/files
- new: https://git.openjdk.java.net/jdk/pull/7936/files/31718879..5a3ab9ba

Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7936&range=01
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7936&range=00-01

Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.java.net/jdk/pull/7936.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/7936/head:pull/7936

PR: https://git.openjdk.java.net/jdk/pull/7936

From jvernee at openjdk.java.net Thu Mar 24 13:16:53 2022
From: jvernee at openjdk.java.net (Jorn Vernee)
Date: Thu, 24 Mar 2022 13:16:53 GMT
Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v5]
In-Reply-To:
References:
Message-ID:

On Wed, 23 Mar 2022 14:06:56 GMT, Maurizio Cimadamore wrote:

>> This PR contains the API and implementation changes for JEP-424 [1].
A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. >> >> [1] - https://openjdk.java.net/jeps/424 > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Drop redundant javadoc statements re. handling of nulls > (handling of nulls is specified once and for all in the package javadoc) Some more nits. One potential issue with adding --enable-preview when building benchmarks (last comment of the bunch). Other than that, I think this looks good. make/test/BuildMicrobenchmark.gmk line 97: > 95: SRC := $(MICROBENCHMARK_SRC), \ > 96: BIN := $(MICROBENCHMARK_CLASSES), \ > 97: JAVAC_FLAGS := --add-exports java.base/sun.security.util=ALL-UNNAMED --enable-preview, \ It still seems like this would lead to potential issues. i.e. requiring all benchmarks to be run with `--enable-preview`? We ended up adding `--enable-preview` to our benchmarks, but do other benchmarks still work without it? AFAIK the entire benchmarks.jar will have the altered class file version. src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 61: > 59: *

  • {@linkplain MemorySegment#allocateNative(long, long, MemorySession) native memory segments}, backed by off-heap memory;
  • > 60: *
  • {@linkplain FileChannel#map(FileChannel.MapMode, long, long, MemorySession) mapped memory segments}, obtained by mapping > 61: * a file into main memory ({@code mmap}); tha contents of a mapped memory segments can be {@linkplain #force() persisted} and Suggestion: * a file into main memory ({@code mmap}); the contents of a mapped memory segments can be {@linkplain #force() persisted} and src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 298: > 296: > 297: /** > 298: * Returns a slice of this memory segment, at given offset. The returned segment's base address is the base address Saw a similar change in other places, so I'll suggest this here as well. Suggestion: * Returns a slice of this memory segment, at the given offset. The returned segment's base address is the base address src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 311: > 309: > 310: /** > 311: * Returns a slice of this memory segment, at given offset. The returned segment's base address is the base address Suggestion: * Returns a slice of this memory segment, at the given offset. The returned segment's base address is the base address src/java.base/share/classes/java/lang/foreign/MemorySession.java line 143: > 141: > 142: /** > 143: * {@return the owner thread associated with this memory session (if any)} Maybe the "if any" here could be more specific. e.g. saying that `null` is returned if the session doesn't have an owner thread. src/java.base/share/classes/java/lang/foreign/MemorySession.java line 165: > 163: > 164: /** > 165: * Closes this memory session. As a side effect, if this operation completes without exceptions, this session I'd suggest change this to "As a result of this", since the effects listed are the main reason for closing a session. (it strikes me as strange. If the things listed are side-effects, then what is the main effect of closing a segment?) Suggestion: * Closes this memory session. 
As a result of this, if this operation completes without exceptions, this session src/java.base/share/classes/java/lang/foreign/SymbolLookup.java line 51: > 49: *

    > 50: * Clients can obtain a {@linkplain #loaderLookup() loader lookup}, > 51: * which can be used to search symbols in libraries loaded by the current classloader (e.g. using {@link System#load(String)}, "search symbols" sounds a bit unnatural to me... I like the wording in the libraryLookup doc more Suggestion: * which can be used to find symbols in libraries loaded by the current classloader (e.g. using {@link System#load(String)}, src/java.base/share/classes/java/lang/foreign/SymbolLookup.java line 59: > 57: *

    > 58: * Finally, clients can load a library and obtain a {@linkplain #libraryLookup(Path, MemorySession) library lookup} which can be used > 59: * to search symbols in that library. A library lookup is associated with a {@linkplain MemorySession memory session}, Suggestion: * to find symbols in that library. A library lookup is associated with a {@linkplain MemorySession memory session}, src/java.base/share/classes/java/lang/invoke/MethodHandles.java line 7895: > 7893: * VarHandle handle = MethodHandles.memorySegmentViewVarHandle(ValueLayout.JAVA_INT.withOrder(ByteOrder.BIG_ENDIAN)); //(MemorySegment, long) -> int > 7894: * handle = MethodHandles.insertCoordinates(handle, 1, 4); //(MemorySegment) -> int > 7895: * } These could be snippets. Also, I think it would be nice to add a link to MemoryLayout.varHandle here. src/java.base/share/classes/java/nio/channels/FileChannel.java line 975: > 973: /** > 974: * Maps a region of this channel's file into a new mapped memory segment, > 975: * with a given offset, size and memory session. Suggestion: * with the given offset, size and memory session. src/java.base/share/classes/jdk/internal/foreign/SystemLookup.java line 51: > 49: > 50: /* A fallback lookup, used when creation of system lookup fails. */ > 51: private static final Function> fallbackLookup = name -> Optional.empty(); Now that we have SymbolLookup again, these Function types could potentially be changed to SymbolLookup again. (and also avoid some churn here) src/java.base/share/classes/jdk/internal/foreign/SystemLookup.java line 135: > 133: } > 134: > 135: public Optional lookup(String name) { `@Override` here? 
src/java.base/share/classes/sun/nio/ch/FileChannelImpl.java line 1071: > 1069: sessionImpl.checkValidStateSlow(); > 1070: if (offset < 0) throw new IllegalArgumentException("Requested bytes offset must be >= 0."); > 1071: if (size < 0) throw new IllegalArgumentException("Requested bytes size must be >= 0."); The javadoc also says that IAE will be thrown if `offset + size < 0` I think to guard against overflow, but I don't see that checked here. Is it missing? ------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From bulasevich at openjdk.java.net Thu Mar 24 13:26:43 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Thu, 24 Mar 2022 13:26:43 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v7] In-Reply-To: References: Message-ID: > Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH. > > In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. 
> > As a side effect, the performance of some tests is slightly improved: > ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` > > Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: review findings: use instruction_size istead of raw constant, strengthen the assert, check alignment, move comments, segments order: profiled - non_method - non_profiled ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7517/files - new: https://git.openjdk.java.net/jdk/pull/7517/files/9650abc9..33e85d2c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=05-06 Stats: 19 lines in 6 files changed: 6 ins; 4 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/7517.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7517/head:pull/7517 PR: https://git.openjdk.java.net/jdk/pull/7517 From bulasevich at openjdk.java.net Thu Mar 24 13:26:45 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Thu, 24 Mar 2022 13:26:45 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v6] In-Reply-To: <5swa6Sh7ZklCS1YoTwK3ElGLzEN8-pYW-kTX2bfGDLc=.ddc846c4-34dc-4e81-bd57-6d388ef6c6d6@github.com> References: <5swa6Sh7ZklCS1YoTwK3ElGLzEN8-pYW-kTX2bfGDLc=.ddc846c4-34dc-4e81-bd57-6d388ef6c6d6@github.com> Message-ID: <-NVzOOAEOmj4RtMmewtS9QcI5UNgvn3skfFZ5PLszSQ=.e7c26731-58d9-47f5-a81a-d80d76623184@github.com> On Thu, 17 Mar 2022 17:36:16 GMT, Volker Simonis wrote: >> Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: >> >> rename, adding test > > src/hotspot/cpu/aarch64/icBuffer_aarch64.cpp line 56: > >> 54: __ ldr(rscratch2, l); >> 55: int jump_code_size = __ far_jump(ExternalAddress(entry_point)); >> 56: // IC stub code size is not expected to vary depending on target address. > > Does the new code still align `cached_value` on a `wordSize` boundary as this was ensured before by `align(wordsize)`? I think that's only true if `code_begin` is guaranteed to start at a `wordSize` boundary because `far_jump` is either one or three instructions (plus one `ldr` instruction). If yes, please add a comment explaining that. Otherwise explain why the alignment isn't necessary anymore. Thanks for pointing this out. I think alignment is important because of data load penalty issues and MT issues. Actually the stub is aligned on the CodeEntryAlignment and stub_size is aligned, so this address is also aligned. I removed align(wordSize) because the stub_size is constant and there is no room for additional alignments within a stub. Let me add an assert here to check the alignment. > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 428: > >> 426: uint64_t offset; >> 427: // We can use ADRP here because we know that the total size of >> 428: // the code cache cannot exceed 2Gb. > > Not directly related to your change, but what's correct here: > - the comment which says "code cache can't exceed 2gb" > - the assertion above which asserts `ReservedCodeCacheSize < 4*G` > > Maybe you can fix this while you're on it? ADRP limit is 4GB - it is checked by assert. The comment reminds us that CODE_CACHE_SIZE_LIMIT (defined in globalDefinitions.hpp) is 2G which is Ok for us. 
Let me update the comment a little: + // the code cache cannot exceed 2Gb (ADRP limit is 4GB) ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From bulasevich at openjdk.java.net Thu Mar 24 13:30:51 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Thu, 24 Mar 2022 13:30:51 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v6] In-Reply-To: References: <5swa6Sh7ZklCS1YoTwK3ElGLzEN8-pYW-kTX2bfGDLc=.ddc846c4-34dc-4e81-bd57-6d388ef6c6d6@github.com> Message-ID: On Fri, 18 Mar 2022 12:58:01 GMT, Evgeny Astigeevich wrote: >> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 399: >> >>> 397: } >>> 398: // codecache size: 128M..240M >>> 399: return !CodeCache::is_non_nmethod(addr); >> >> Is it possible to further refine this to also catch calls from C1 to C1 and C2 to C2 which obviously wouldn't need a far call as well? > > I believe they should be our next steps to guarantee we don't generate redundant code for such cases. Good catch. But I see a small problem here. CodeCache is asked for room in the profiled segment for C1 methods, but it places the method in the non_profiled segment when the profiled segment is full (and in the same way a C2 method can be placed in the profiled segment). So we cannot know in advance where the generated method will finally be placed. ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From bulasevich at openjdk.java.net Thu Mar 24 13:36:27 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Thu, 24 Mar 2022 13:36:27 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v8] In-Reply-To: References: Message-ID: > Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH. > > In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions.
Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. > > As a side effect, the performance of some tests is slightly improved: > ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` > > Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: - Merge branch 'openjdk:master' into codecache_segments_order - review findings: use instruction_size istead of raw constant, strengthen the assert, check alignment, move comments, segments order: profiled - non_method - non_profiled - rename, adding test - moving nops out of far_jump - minor renaming - review comments. remove far_call limit. undo trampoline-to-farcall. add trampoline_needs_far_jump func - fix name: is_non_nmethod, adding target_needs_far_branch func - change codecache segments order: nonprofiled-nonmethod-profiled increase far jump threshold: sideof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M ------------- Changes: https://git.openjdk.java.net/jdk/pull/7517/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=07 Stats: 203 lines in 7 files changed: 185 ins; 1 del; 17 mod Patch: https://git.openjdk.java.net/jdk/pull/7517.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7517/head:pull/7517 PR: https://git.openjdk.java.net/jdk/pull/7517 From bulasevich at openjdk.java.net Thu Mar 24 14:16:50 2022 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Thu, 24 Mar 2022 14:16:50 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v6] In-Reply-To: References: Message-ID: <9S_V0PqGiqRBDPj4JzjCQmU7gVk_heoYXUV-WRTObs4=.d80a386f-cf88-4d89-8322-87bfddf0cc7f@github.com> On 
Wed, 16 Mar 2022 14:36:07 GMT, Evgeny Astigeevich wrote: >> Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> rename, adding test > > src/hotspot/cpu/aarch64/icBuffer_aarch64.cpp line 58: > >> 56: // IC stub code size is not expected to vary depending on target address. >> 57: // We use NOPs to make the ldr+far_jump+int64 size equal to ic_stub_code_size. >> 58: for (int i = jump_code_size; i < ic_stub_code_size() - 12; i += 4) { > > 12 == 3 * NativeInstruction::instruction_size > 4 == NativeInstruction::instruction_size OK, thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From aph at openjdk.java.net Thu Mar 24 15:11:47 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 24 Mar 2022 15:11:47 GMT Subject: RFR: 8282528: AArch64: Incorrect replicate2L_zero rule In-Reply-To: References: <9oqsZeXhOEno92kEBr2zl_HvIV_JbEAynYx_dLVSP2g=.6d37ba20-d769-44b6-8a67-7e28c78afa8a@github.com> Message-ID: <5QRJrwf_oAuDSwJL4dzUy42VhsLRbEj6eQ4SOv0ha_8=.204f8e99-b536-4c15-a10b-a7d8cf52cec0@github.com> On Thu, 24 Mar 2022 12:13:51 GMT, Eric Liu wrote: >> And then another function that actually does the arranging, and the generation of instructions calls those functions. > > Thanks for your review. I agree that a `can_encode(imm, arrangement)` function is better. My concern is that this JBS is just a bug fix for the replicate2L_imm backend, and for the other SIMD_Arrangement values, I found that there are other choices for code generation, but I didn't touch them in this patch to keep it clear and small. I show two examples below.
> > Example1: > > movi v16.4s, #0x34 > orr v16.4s, #0x12, lsl #8 > > vs > > mov w8, #0x1234 > dup v16.4s, w8 > > > Example2: > > movi v16.4s, #0x78 > orr v16.4s, #0x56, lsl #8 > orr v16.4s, #0x34, lsl #16 > orr v16.4s, #0x12, lsl #24 > > vs > > mov w14, #0x5678 > movk w14, #0x1234, lsl #16 > dup v16.4s, w14 > > > I'm considering measuring the performance and refining the mov macro assembler if necessary. `can_encode` can also be done as part of that refinement work. What do you think? Sure. I'm looking at the Neoverse V1 Optimization Guide, which suggests a fairly high cost for core-to-SIMD moves, and also that only 2 (of 4) of the SIMD pipelines can communicate with the integer registers. So I've got an idea. Please feel free to do any reorganization later, if you like. It's just that the current organization makes it hard to follow, and thus hard to review. ------------- PR: https://git.openjdk.java.net/jdk/pull/7939 From jbhateja at openjdk.java.net Thu Mar 24 15:47:45 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Thu, 24 Mar 2022 15:47:45 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v18] In-Reply-To: References: Message-ID: <6Zb5Gbl88cEL8Ev2KqnwTpIwiUHTyijr1lOcc1sHrso=.ba487d91-0064-452a-b140-9c6b3022a221@github.com> On Wed, 23 Mar 2022 06:55:50 GMT, Tobias Hartmann wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: >> >> - 8279508: Using an explicit scratch register since rscratch1 is bound to r10 and its usage is transparent to compiler. >> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 >> - 8279508: Windows build failure fix. >> - 8279508: Styling comments resolved. >> - 8279508: Creating separate test for round double under feature check. >> - 8279508: Reducing the invocation count and compile thresholds for RoundTests.java. >> - 8279508: Review comments resolution.
>> - 8279508: Preventing domain switch-over penalty for Math.round(float) and constraining unrolling to prevent code bloating. >> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508 >> - 8279508: Removing +LogCompilation flag. >> - ... and 12 more: https://git.openjdk.java.net/jdk/compare/ff0b0927...c17440cf > > All tests passed. Hi @TobiHartmann , thanks for confirming. Hi @jddarcy , @theRealAph , kindly let me know if it's good to integrate this. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From tschatzl at openjdk.java.net Thu Mar 24 16:55:55 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 24 Mar 2022 16:55:55 GMT Subject: RFR: 8283607: Rename KlassID to KlassKind [v2] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 13:12:28 GMT, Stefan Karlsson wrote: >> During the review of JDK-8283574 / #7922 , it was brought up that the term ID to describe the "kind" of *Klass is misleading. Rename KlassID to KlassKind, and update names of variables of KlassID. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Fix missing rename bug Still good. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7936 From mcimadamore at openjdk.java.net Thu Mar 24 17:57:16 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 24 Mar 2022 17:57:16 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v6] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment.
> > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with three additional commits since the last revision: - Update src/java.base/share/classes/java/lang/foreign/MemorySegment.java Co-authored-by: Jorn Vernee - Update src/java.base/share/classes/java/lang/foreign/MemorySegment.java Co-authored-by: Jorn Vernee - Update src/java.base/share/classes/java/lang/foreign/MemorySegment.java Co-authored-by: Jorn Vernee ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/c9bc9a70..95f65eea Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=04-05 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Thu Mar 24 17:57:19 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 24 Mar 2022 17:57:19 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v5] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 13:10:20 GMT, Jorn Vernee wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Drop redundant javadoc statements re. handling of nulls >> (handling of nulls is specified once and for all in the package javadoc) > > make/test/BuildMicrobenchmark.gmk line 97: > >> 95: SRC := $(MICROBENCHMARK_SRC), \ >> 96: BIN := $(MICROBENCHMARK_CLASSES), \ >> 97: JAVAC_FLAGS := --add-exports java.base/sun.security.util=ALL-UNNAMED --enable-preview, \ > > It still seems like this would lead to potential issues. i.e. requiring all benchmarks to be run with `--enable-preview`? 
We ended up adding `--enable-preview` to our benchmarks, but do other benchmarks still work without it? AFAIK the entire benchmarks.jar will have the altered class file version. Sure, this is problematic - but at the same time I don't think there's a better way to deal with this? I'd prefer to defer this to a separate issue (and I think the build team is in a much better position to suggest a better fix). IIRC we had this problem in the past as well. ------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Thu Mar 24 18:19:09 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 24 Mar 2022 18:19:09 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v7] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/95f65eea..3e8cfd74 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=05-06 Stats: 16 lines in 3 files changed: 2 ins; 0 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Thu Mar 24 18:19:12 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 24 Mar 2022 18:19:12 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v5] In-Reply-To: References: Message-ID: On 
Thu, 24 Mar 2022 13:00:12 GMT, Jorn Vernee wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Drop redundant javadoc statements re. handling of nulls >> (handling of nulls is specified once and for all in the package javadoc) > > src/java.base/share/classes/sun/nio/ch/FileChannelImpl.java line 1071: > >> 1069: sessionImpl.checkValidStateSlow(); >> 1070: if (offset < 0) throw new IllegalArgumentException("Requested bytes offset must be >= 0."); >> 1071: if (size < 0) throw new IllegalArgumentException("Requested bytes size must be >= 0."); > > The javadoc also says that IAE will be thrown if `offset + size < 0` I think to guard against overflow, but I don't see that checked here. Is it missing? `mapInternal` in FileChannelImpl takes care of that for both flavors of `map` ------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From jvernee at openjdk.java.net Thu Mar 24 18:38:44 2022 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Thu, 24 Mar 2022 18:38:44 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v5] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 17:48:23 GMT, Maurizio Cimadamore wrote: >> make/test/BuildMicrobenchmark.gmk line 97: >> >>> 95: SRC := $(MICROBENCHMARK_SRC), \ >>> 96: BIN := $(MICROBENCHMARK_CLASSES), \ >>> 97: JAVAC_FLAGS := --add-exports java.base/sun.security.util=ALL-UNNAMED --enable-preview, \ >> >> It still seems like this would lead to potential issues. i.e. requiring all benchmarks to be run with `--enable-preview`? We ended up adding `--enable-preview` to our benchmarks, but do other benchmarks still work without it? AFAIK the entire benchmarks.jar will have the altered class file version. > > Sure, this is problematic - but at the same time I don't think there's a better way to deal with this? 
I'd prefer to defer this to a separate issue (and I think the build team is in a much better position to suggest a better fix). IIRC we had this problem in the past as well. I'd suggest at least adding `--enable-preview` as an argument when running benchmarks through the build system in that case. I think this should do the trick: diff --git a/make/RunTests.gmk b/make/RunTests.gmk index 81540266ec0..9ed45fb02a8 100644 --- a/make/RunTests.gmk +++ b/make/RunTests.gmk @@ -583,7 +583,7 @@ define SetupRunMicroTestBody $$(eval $$(call SetMicroValue,$1,MICRO_JAVA_OPTIONS)) # Current tests needs to open java.io - $1_MICRO_JAVA_OPTIONS += --add-opens=java.base/java.io=ALL-UNNAMED + $1_MICRO_JAVA_OPTIONS += --add-opens=java.base/java.io=ALL-UNNAMED --enable-preview # Save output as JSON or CSV file ifneq ($$(MICRO_RESULTS_FORMAT), ) People manually running the benchmarks.jar will have to pass `--enable-preview` still though. ------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Thu Mar 24 18:59:11 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 24 Mar 2022 18:59:11 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v8] In-Reply-To: References: Message-ID: <6yP7vgmmPmX36zl3Lp0Dxw48sxlimGcfvOI0YrM4Bt0=.86078fb2-182d-44a1-b861-a7a234073f40@github.com> > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. 
> > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with two additional commits since the last revision: - Update src/java.base/share/classes/java/lang/foreign/SymbolLookup.java Co-authored-by: Jorn Vernee - Update src/java.base/share/classes/java/lang/foreign/SymbolLookup.java Co-authored-by: Jorn Vernee ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/3e8cfd74..6881b6dc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=06-07 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Thu Mar 24 19:05:35 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 24 Mar 2022 19:05:35 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v9] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. 
> > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Address more review comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/6881b6dc..d95c6d0f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=07-08 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Thu Mar 24 19:12:01 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 24 Mar 2022 19:12:01 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v10] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. 
> > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Add --enable-preview to micro benchmark java options ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/d95c6d0f..6e7189b4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=08-09 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From aivanov at openjdk.java.net Thu Mar 24 19:25:53 2022 From: aivanov at openjdk.java.net (Alexey Ivanov) Date: Thu, 24 Mar 2022 19:25:53 GMT Subject: RFR: 8283426: Fix 'exeption' typo [v4] In-Reply-To: <48NxGvP43iL5AnO2fxDF1vmpTZ6RxcQp4x6oaHxkEOQ=.ccd35bda-80d8-40da-8a89-bb004adc2ed2@github.com> References: <48NxGvP43iL5AnO2fxDF1vmpTZ6RxcQp4x6oaHxkEOQ=.ccd35bda-80d8-40da-8a89-bb004adc2ed2@github.com> Message-ID: On Wed, 23 Mar 2022 19:41:58 GMT, Andrey Turbanov wrote: >> Fix repeated typo `exeption` > > Andrey Turbanov has updated the pull request incrementally with one additional commit since the last revision: > > 8283426: Fix 'exeption' typo > > Co-authored-by: Alexey Ivanov <70774172+aivanov-jdk at users.noreply.github.com> Marked as reviewed by aivanov (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7879 From mcimadamore at openjdk.java.net Thu Mar 24 19:27:21 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 24 Mar 2022 19:27:21 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v11] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. 
A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Revert changes to RunTests.gmk ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/6e7189b4..504b564a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=09-10 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From jvernee at openjdk.java.net Thu Mar 24 19:27:22 2022 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Thu, 24 Mar 2022 19:27:22 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v11] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 19:19:34 GMT, Maurizio Cimadamore wrote: >> This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. >> >> [1] - https://openjdk.java.net/jeps/424 > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Revert changes to RunTests.gmk Looks Good! ------------- Marked as reviewed by jvernee (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7888 From jvernee at openjdk.java.net Thu Mar 24 19:27:23 2022 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Thu, 24 Mar 2022 19:27:23 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v5] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 18:35:12 GMT, Jorn Vernee wrote: >> Sure, this is problematic - but at the same time I don't think there's a better way to deal with this? I'd prefer to defer this to a separate issue (and I think the build team is in a much better position to suggest a better fix). IIRC we had this problem in the past as well. > > I'd suggest at least adding `--enable-preview` as an argument when running benchmarks through the build system in that case. I think this should do the trick: > > > diff --git a/make/RunTests.gmk b/make/RunTests.gmk > index 81540266ec0..9ed45fb02a8 100644 > --- a/make/RunTests.gmk > +++ b/make/RunTests.gmk > @@ -583,7 +583,7 @@ define SetupRunMicroTestBody > $$(eval $$(call SetMicroValue,$1,MICRO_JAVA_OPTIONS)) > > # Current tests needs to open java.io > - $1_MICRO_JAVA_OPTIONS += --add-opens=java.base/java.io=ALL-UNNAMED > + $1_MICRO_JAVA_OPTIONS += --add-opens=java.base/java.io=ALL-UNNAMED --enable-preview > > # Save output as JSON or CSV file > ifneq ($$(MICRO_RESULTS_FORMAT), ) > > > People manually running the benchmarks.jar will have to pass `--enable-preview` still though. After discussing this offline, it seems that javac no longer poisons the minor class file version of every class file, but only of those that use preview features. So, my concern is not warranted. 
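For readers who want to check the point about the minor class file version themselves, here is a minimal sketch. It assumes only the layout in the JVM specification: a class file starts with the 4-byte magic 0xCAFEBABE, followed by big-endian minor_version and major_version, and a class file that depends on preview features carries minor_version 65535. The names `ClassFileVersion`, `read_class_file_version` and `depends_on_preview_features` are hypothetical and are not part of javac or HotSpot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: the version fields from a class file header.
struct ClassFileVersion {
  bool valid;       // false if the buffer is too short or the magic is wrong
  uint16_t minor;   // minor_version (bytes 4-5, big-endian)
  uint16_t major;   // major_version (bytes 6-7, big-endian)
};

// Reads magic, minor_version and major_version from the first 8 bytes.
ClassFileVersion read_class_file_version(const uint8_t* bytes, size_t len) {
  ClassFileVersion v = {false, 0, 0};
  if (bytes == nullptr || len < 8) {
    return v;
  }
  uint32_t magic = (uint32_t(bytes[0]) << 24) | (uint32_t(bytes[1]) << 16) |
                   (uint32_t(bytes[2]) << 8) | uint32_t(bytes[3]);
  if (magic != 0xCAFEBABEu) {
    return v;
  }
  v.minor = uint16_t((bytes[4] << 8) | bytes[5]);
  v.major = uint16_t((bytes[6] << 8) | bytes[7]);
  v.valid = true;
  return v;
}

// A class file that depends on preview features has minor_version 0xFFFF.
bool depends_on_preview_features(const ClassFileVersion& v) {
  return v.valid && v.minor == 0xFFFF;
}
```

Running such a check over the classes in benchmarks.jar would show which of them actually require `--enable-preview` at run time.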
------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Thu Mar 24 19:27:24 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 24 Mar 2022 19:27:24 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v5] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 19:17:40 GMT, Jorn Vernee wrote: >> I'd suggest at least adding `--enable-preview` as an argument when running benchmarks through the build system in that case. I think this should do the trick: >> >> >> diff --git a/make/RunTests.gmk b/make/RunTests.gmk >> index 81540266ec0..9ed45fb02a8 100644 >> --- a/make/RunTests.gmk >> +++ b/make/RunTests.gmk >> @@ -583,7 +583,7 @@ define SetupRunMicroTestBody >> $$(eval $$(call SetMicroValue,$1,MICRO_JAVA_OPTIONS)) >> >> # Current tests needs to open java.io >> - $1_MICRO_JAVA_OPTIONS += --add-opens=java.base/java.io=ALL-UNNAMED >> + $1_MICRO_JAVA_OPTIONS += --add-opens=java.base/java.io=ALL-UNNAMED --enable-preview >> >> # Save output as JSON or CSV file >> ifneq ($$(MICRO_RESULTS_FORMAT), ) >> >> >> People manually running the benchmarks.jar will have to pass `--enable-preview` still though. > > After discussing this offline, it seems that javac no longer poisons the minor class file version of every class file, but only of those that use preview features. So, my concern is not warranted. Turns out this is no longer necessary. 
As part of the support for preview API, javac now only pollutes the classfile if a source file is using preview features, as described in this PR: https://github.com/openjdk/jdk/pull/703 ------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From aturbanov at openjdk.java.net Thu Mar 24 19:54:48 2022 From: aturbanov at openjdk.java.net (Andrey Turbanov) Date: Thu, 24 Mar 2022 19:54:48 GMT Subject: Integrated: 8283426: Fix 'exeption' typo In-Reply-To: References: Message-ID: On Sun, 20 Mar 2022 13:30:01 GMT, Andrey Turbanov wrote: > Fix repeated typo `exeption` This pull request has now been integrated. Changeset: dc5a65ab Author: Andrey Turbanov URL: https://git.openjdk.java.net/jdk/commit/dc5a65ab378f0780f7760965f2b52cbbd7c62aad Stats: 38 lines in 17 files changed: 0 ins; 2 del; 36 mod 8283426: Fix 'exeption' typo Reviewed-by: xuelei, iris, dholmes, wetmore, aivanov ------------- PR: https://git.openjdk.java.net/jdk/pull/7879 From duke at openjdk.java.net Thu Mar 24 21:42:51 2022 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Thu, 24 Mar 2022 21:42:51 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v8] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 13:36:27 GMT, Boris Ulasevich wrote: >> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. Currently only the aarch64 backend is adapted to make use of these changes. >> >> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. 
>> >> As a side effect, the performance of some tests is slightly improved: >> ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` >> >> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: > > - Merge branch 'openjdk:master' into codecache_segments_order > - review findings: use instruction_size istead of raw constant, strengthen the assert, check alignment, move comments, segments order: profiled - non_method - non_profiled > - rename, adding test > - moving nops out of far_jump > - minor renaming > - review comments. remove far_call limit. undo trampoline-to-farcall. add trampoline_needs_far_jump func > - fix name: is_non_nmethod, adding target_needs_far_branch func > - change codecache segments order: nonprofiled-nonmethod-profiled > increase far jump threshold: sideof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M lgtm ------------- Marked as reviewed by eastig at github.com (no known OpenJDK username). PR: https://git.openjdk.java.net/jdk/pull/7517 From iklam at openjdk.java.net Fri Mar 25 00:17:16 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 25 Mar 2022 00:17:16 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() Message-ID: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> - Remove all the complex `sscanf()` calls in `Arguments::parse_argument()` - Call the appropriate parsing function according to the type of the flag - Added more test cases for flags of the `double` type. As a result of this change, `double` flags can now be specified in more ways, as long as the input is accepted by `strtod()`. However, `NaN` and `INFINITY` values are not allowed because the VM probably cannot handle them. Please see the test case for details. Tested with tiers 1-5. 
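The rule described above for `double` flags (accept whatever `strtod()` accepts, but reject NaN and infinity) can be sketched roughly as follows. This is not the code from the patch: `parse_double_flag` is a hypothetical name, and rejecting trailing characters is an assumption made here for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Hypothetical helper, not the HotSpot implementation: parse a flag value
// as a double using strtod() and reject non-finite results.
bool parse_double_flag(const char* value, double* result) {
  if (value == nullptr || *value == '\0') {
    return false;  // nothing to parse
  }
  char* end = nullptr;
  double v = std::strtod(value, &end);
  if (end == value || *end != '\0') {
    return false;  // nothing consumed, or trailing characters left over
  }
  if (!std::isfinite(v)) {
    // strtod() accepts "nan", "inf" and "infinity", and it returns
    // HUGE_VAL on overflow; all of these are rejected here.
    return false;
  }
  *result = v;
  return true;
}
```

With this shape, inputs such as `1.5` and `1e3` are accepted, while `nan`, `inf`, `1.5x` and an out-of-range value like `1e999` are refused.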
------------- Commit messages: - Merge branch 'master' of https://github.com/openjdk/jdk into 8283013-simplify-parse-argument - disallow NAN and INF - 8283013: Simplify Arguments::parse_argument() Changes: https://git.openjdk.java.net/jdk/pull/7916/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7916&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283013 Stats: 210 lines in 3 files changed: 96 ins; 73 del; 41 mod Patch: https://git.openjdk.java.net/jdk/pull/7916.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7916/head:pull/7916 PR: https://git.openjdk.java.net/jdk/pull/7916 From dholmes at openjdk.java.net Fri Mar 25 07:15:47 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 25 Mar 2022 07:15:47 GMT Subject: RFR: 8283607: Rename KlassID to KlassKind [v2] In-Reply-To: References: Message-ID: <2y-ojwlTUEL-JVwFpjYipNAX4rJWp4ivKv0bNIdR6Tw=.fc5e634b-551a-40ea-a1c5-ad316ef0c654@github.com> On Thu, 24 Mar 2022 13:12:28 GMT, Stefan Karlsson wrote: >> During the review of JDK-8283574 / #7922 , it was brought up that the term ID to describe the "kind" of *Klass is misleading. Rename KlassID to KlassKind, and update names of variables of KlassID. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Fix missing rename bug Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7936 From stefank at openjdk.java.net Fri Mar 25 08:21:43 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 25 Mar 2022 08:21:43 GMT Subject: RFR: 8283607: Rename KlassID to KlassKind [v2] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 13:12:28 GMT, Stefan Karlsson wrote: >> During the review of JDK-8283574 / #7922 , it was brought up that the term ID to describe the "kind" of *Klass is misleading. Rename KlassID to KlassKind, and update names of variables of KlassID. 
> > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Fix missing rename bug Thanks for reviewing. ------------- PR: https://git.openjdk.java.net/jdk/pull/7936 From stefank at openjdk.java.net Fri Mar 25 08:21:44 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 25 Mar 2022 08:21:44 GMT Subject: Integrated: 8283607: Rename KlassID to KlassKind In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 06:37:21 GMT, Stefan Karlsson wrote: > During the review of JDK-8283574 / #7922 , it was brought up that the term ID to describe the "kind" of *Klass is misleading. Rename KlassID to KlassKind, and update names of variables of KlassID. This pull request has now been integrated. Changeset: 636225b8 Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/636225b8151d1bd53349a314fb50b682d6effcd2 Stats: 62 lines in 14 files changed: 2 ins; 0 del; 60 mod 8283607: Rename KlassID to KlassKind Reviewed-by: dholmes, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/7936 From jiefu at openjdk.java.net Fri Mar 25 09:07:45 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 25 Mar 2022 09:07:45 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API In-Reply-To: References: <-E5E_NBci6gsGyOV5nWuTUNKLVnjiw2IiWjjgv2vFz0=.ebe7c447-ede9-4437-815c-a2004f9d6ce1@github.com> Message-ID: On Sat, 19 Mar 2022 03:11:12 GMT, Jie Fu wrote: >>> Note that in terms of Java semantics, negation of floating point values needs to be implemented as subtraction from negative zero rather than positive zero: >>> >>> double negate(double arg) {return -0.0 - arg; } >>> >>> This is to handle signed zeros correctly. >> >> Hi @jddarcy ,thanks for looking at this PR and thanks for the notes on the floating point negation! Yeah, this really makes sense to me. Kindly note that this patch didn't touch the negation of the floating point values. 
For Vector API, the vector floating point negation has been intrinsified to `NegVF/D` node by compiler that we directly generate the negation instructions for them. Thanks! > >> Note that in terms of Java semantics, negation of floating point values needs to be implemented as subtraction from negative zero rather than positive zero: >> >> double negate(double arg) {return -0.0 - arg; } >> >> This is to handle signed zeros correctly. > > This seems easy to be broken by an opt enhancement. > Just wondering do we have a jtreg test for this point? @jddarcy > Thanks. > Hi @DamonFool , thanks for your review! All the comments have been addressed. Thanks! Thanks @XiaohongGong for the update. And sorry for the late (just a little busy this week). I'll do some testing and feedback here. ------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From kbarrett at openjdk.java.net Sat Mar 26 21:50:37 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sat, 26 Mar 2022 21:50:37 GMT Subject: Integrated: 8263134: HotSpot Style Guide should disallow inheriting constructors In-Reply-To: <0ydC7yufiVJTFlvJU6SXM5Gq5vTGdo2FPCJ4XOXpF5U=.2f611e00-b3d4-4ccb-9658-6eaa1d6cae5d@github.com> References: <0ydC7yufiVJTFlvJU6SXM5Gq5vTGdo2FPCJ4XOXpF5U=.2f611e00-b3d4-4ccb-9658-6eaa1d6cae5d@github.com> Message-ID: On Fri, 4 Mar 2022 15:04:47 GMT, Kim Barrett wrote: > Please review this change to explicitly disallow the use of inheriting > constructors: > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2540.htm). > > The C++11/14 specification has a lot of problems. These were addressed in > C++17 (and as a DR that affects C++11/14): > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0136r1.html). > > Use of inheriting constructors now runs the risk of encountering those bugs, > inconsistent behavior between different compilers or compiler versions, and > behavior changes for future support of C++17. 
> > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. This pull request has now been integrated. Changeset: b0daf70a Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/b0daf70a251ba0ca04ca757b98cffd5607a154d4 Stats: 29 lines in 2 files changed: 29 ins; 0 del; 0 mod 8263134: HotSpot Style Guide should disallow inheriting constructors Reviewed-by: dholmes, dcubed, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/7698 From kbarrett at openjdk.java.net Sat Mar 26 21:50:37 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sat, 26 Mar 2022 21:50:37 GMT Subject: RFR: 8263134: HotSpot Style Guide should disallow inheriting constructors [v2] In-Reply-To: References: <0ydC7yufiVJTFlvJU6SXM5Gq5vTGdo2FPCJ4XOXpF5U=.2f611e00-b3d4-4ccb-9658-6eaa1d6cae5d@github.com> Message-ID: On Wed, 23 Mar 2022 16:07:11 GMT, Vladimir Kozlov wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into no_inherited_ctors >> - update html >> - forbid inherited ctors > > Approved. Thanks @vnkozlov , @dholmes-ora , and @dcubed-ojdk for reviews. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7698 From kbarrett at openjdk.java.net Sat Mar 26 21:50:37 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sat, 26 Mar 2022 21:50:37 GMT Subject: RFR: 8263134: HotSpot Style Guide should disallow inheriting constructors [v2] In-Reply-To: <0ydC7yufiVJTFlvJU6SXM5Gq5vTGdo2FPCJ4XOXpF5U=.2f611e00-b3d4-4ccb-9658-6eaa1d6cae5d@github.com> References: <0ydC7yufiVJTFlvJU6SXM5Gq5vTGdo2FPCJ4XOXpF5U=.2f611e00-b3d4-4ccb-9658-6eaa1d6cae5d@github.com> Message-ID: > Please review this change to explicitly disallow the use of inheriting > constructors: > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2540.htm). > > The C++11/14 specification has a lot of problems. These were addressed in > C++17 (and as a DR that affects C++11/14): > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0136r1.html). > > Use of inheriting constructors now runs the risk of encountering those bugs, > inconsistent behavior between different compilers or compiler versions, and > behavior changes for future support of C++17. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge branch 'master' into no_inherited_ctors - update html - forbid inherited ctors ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7698/files - new: https://git.openjdk.java.net/jdk/pull/7698/files/3e2be799..eddebb35 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7698&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7698&range=00-01 Stats: 130513 lines in 1875 files changed: 96087 ins; 28343 del; 6083 mod Patch: https://git.openjdk.java.net/jdk/pull/7698.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7698/head:pull/7698 PR: https://git.openjdk.java.net/jdk/pull/7698 From kbarrett at openjdk.java.net Sat Mar 26 21:59:20 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sat, 26 Mar 2022 21:59:20 GMT Subject: RFR: 8282668: HotSpot Style Guide should permit unrestricted unions [v2] In-Reply-To: References: Message-ID: > Please review this change to permit the use of "unrestricted unions" > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf) in HotSpot > code. > > This permits any non-reference type to be used as a union data member, as well > as permitting static data members in named unions. There are various classes > in HotSpot that might be able to take advantage of this new feature. > > An example is the aarch64-specific Address class. It presently contains a > collection of data members. For any given instance, only some of these data > members are initialized and used. The `_mode` member indicates which. So it's > effectively a kind of discriminated union with the data unpacked and not > overlapping, with `_mode` being the discrimenant. A consequence of the current > implementation is that some compilers may generate warnings under some > circumstances because of uninitialized data members. 
(I ran into this problem > with gcc when making an otherwise unrelated change to one of the member > types.) This Address class could be made smaller (so cheaper to copy, which > happens often as Address objects are frequently passed by value) and usage > made clearer, by making it an actual union. But that isn't possible with the > C++03 restrictions. > > Another example is the RelocationHolder class, which is effectively a union > over the various concrete Relocation types, but implemented in a way that > has some issues (JDK-8160404). > > Testing: > I've tried some examples without running into any problems. This included > some experiments with RelocationHolder for JDK-8160404. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 18-Mar-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge branch 'master' into unrestricted-union - update html - unrestricted unions ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7704/files - new: https://git.openjdk.java.net/jdk/pull/7704/files/72ea8fc4..9f49b104 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7704&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7704&range=00-01 Stats: 130542 lines in 1875 files changed: 96116 ins; 28343 del; 6083 mod Patch: https://git.openjdk.java.net/jdk/pull/7704.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7704/head:pull/7704 PR: https://git.openjdk.java.net/jdk/pull/7704 From kbarrett at openjdk.java.net Sat Mar 26 21:59:21 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sat, 26 Mar 2022 21:59:21 GMT Subject: RFR: 8282668: HotSpot Style Guide should permit unrestricted unions [v2] In-Reply-To: References: Message-ID: On Wed, 23 Mar 2022 15:38:31 GMT, Vladimir Kozlov wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into unrestricted-union >> - update html >> - unrestricted unions > > Good. Thanks @vnkozlov , @tschatzl , @dholmes-ora , @dcubed-ojdk for reviews. 
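As a rough illustration of what the unrestricted-union feature discussed in this thread allows, here is a minimal sketch of a discriminated union with a non-trivial member. The class `Operand` and its members are hypothetical and are not the aarch64 `Address` class or `RelocationHolder`. `std::string` is used only to have a member with a non-trivial constructor and destructor; HotSpot code itself generally avoids the standard library.

```cpp
#include <new>
#include <string>
#include <utility>

// Hypothetical example class: a tagged union where _mode says which
// member of the anonymous union is currently alive.
class Operand {
 public:
  enum class Mode { Immediate, Symbol };

  explicit Operand(long imm) : _mode(Mode::Immediate), _imm(imm) {}

  explicit Operand(std::string name) : _mode(Mode::Symbol) {
    // The non-trivial member must be constructed explicitly.
    new (&_symbol) std::string(std::move(name));
  }

  Operand(const Operand& other) : _mode(other._mode) {
    if (_mode == Mode::Symbol) {
      new (&_symbol) std::string(other._symbol);
    } else {
      _imm = other._imm;
    }
  }

  Operand& operator=(const Operand&) = delete;  // kept out of the sketch

  ~Operand() {
    // The non-trivial member must also be destroyed explicitly.
    if (_mode == Mode::Symbol) {
      _symbol.~String();
    }
  }

  Mode mode() const { return _mode; }
  // Callers are expected to check mode() before using these accessors.
  long immediate() const { return _imm; }
  const std::string& symbol() const { return _symbol; }

 private:
  using String = std::string;
  Mode _mode;
  union {          // only one of these is initialized at any time
    long _imm;
    String _symbol;
  };
};
```

The members overlap, so the object is no larger than its biggest alternative plus the discriminant, and no member is left sitting uninitialized next to the active one. Under the C++03 rules a member like `_symbol` would not be permitted in the union at all.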
------------- PR: https://git.openjdk.java.net/jdk/pull/7704 From kbarrett at openjdk.java.net Sat Mar 26 21:59:22 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sat, 26 Mar 2022 21:59:22 GMT Subject: Integrated: 8282668: HotSpot Style Guide should permit unrestricted unions In-Reply-To: References: Message-ID: On Fri, 4 Mar 2022 18:39:33 GMT, Kim Barrett wrote: > Please review this change to permit the use of "unrestricted unions" > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf) in HotSpot > code. > > This permits any non-reference type to be used as a union data member, as well > as permitting static data members in named unions. There are various classes > in HotSpot that might be able to take advantage of this new feature. > > An example is the aarch64-specific Address class. It presently contains a > collection of data members. For any given instance, only some of these data > members are initialized and used. The `_mode` member indicates which. So it's > effectively a kind of discriminated union with the data unpacked and not > overlapping, with `_mode` being the discrimenant. A consequence of the current > implementation is that some compilers may generate warnings under some > circumstances because of uninitialized data members. (I ran into this problem > with gcc when making an otherwise unrelated change to one of the member > types.) This Address class could be made smaller (so cheaper to copy, which > happens often as Address objects are frequently passed by value) and usage > made clearer, by making it an actual union. But that isn't possible with the > C++03 restrictions. > > Another example is the RelocationHolder class, which is effectively a union > over the various concrete Relocation types, but implemented in a way that > has some issues (JDK-8160404). > > Testing: > I've tried some examples without running into any problems. This included > some experiments with RelocationHolder for JDK-8160404. 
>
> This is a modification of the Style Guide, so rough consensus among the
> HotSpot Group members is required to make this change. Only Group members
> should vote for approval (via the github PR), though reasoned objections or
> comments from anyone will be considered. A decision on this proposal will not
> be made before Friday 18-Mar-2022 at 12h00 UTC.
>
> Since we're piggybacking on github PRs here, please use the PR review process
> to approve (click on Review Changes > Approve), rather than sending a "vote:
> yes" email reply that would be normal for a CFV.

This pull request has now been integrated.

Changeset: c2c0cb2a
Author: Kim Barrett
URL: https://git.openjdk.java.net/jdk/commit/c2c0cb2a4372d78658326461562363de9a1a194f
Stats: 4 lines in 2 files changed: 4 ins; 0 del; 0 mod

8282668: HotSpot Style Guide should permit unrestricted unions

Reviewed-by: dholmes, dcubed, tschatzl, kvn

-------------

PR: https://git.openjdk.java.net/jdk/pull/7704

From xlinzheng at openjdk.java.net Mon Mar 28 04:48:09 2022
From: xlinzheng at openjdk.java.net (Xiaolin Zheng)
Date: Mon, 28 Mar 2022 04:48:09 GMT
Subject: RFR: 8283737: riscv: MacroAssembler::stop() is used in AD file and it should generate a fixed-length size
Message-ID:

Hi team,

Could I have a review of this simple patch?

- `PhaseOutput::fill_buffer` detects if the real size of a node matches (<=) the size of it in scratch_emit(). The call chain for MacroAssembler::stop() is: MachEpilogNode::emit -> reserved_stack_check() -> should_not_reach_here() -> stop(const char *msg)

`li()` on RISCV could generate 1~6 instructions, and the msg argument could be an on-stack buffer; `stop()` also uses `__ pc()` that could also be different in `scratch_emit()` and `emit()`. They both have the potential issue here so the size generated in `MacroAssembler::stop()` needs to be a fixed value.

Could be reproduced in the fastdebug build by adding one line:

    // Die now.
    instruct ShouldNotReachHere() %{
      match(Halt);

      ins_cost(BRANCH_COST);

      format %{ "#@ShouldNotReachHere" %}

      ins_encode %{
        Assembler::CompressibleRegion cr(&_masm);
        if (is_reachable()) {
          __ halt();
    +     __ unimplemented("this is an on-stack char literal");  // assertion fail at 'assert(false, "wrong size of mach node");'
        }
      %}

      ins_pipe(pipe_class_default);
    %}

This patch also fixes a typo introduced in JDK-8278994: `c_bnez` is mistakenly written to `c_beqz`, though not used until now, needing a fix for future usage.

Tests passed in hotspot tier1 & jdk tier1 without new errors found.

Thanks,
Xiaolin

-------------

Commit messages:
 - Fix a typo instroduced when refactoring: c.beqz
 - MacroAssembler::stop() is used in adfile and it should generate a fixed-length size

Changes: https://git.openjdk.java.net/jdk/pull/7982/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7982&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8283737
Stats: 5 lines in 2 files changed: 2 ins; 0 del; 3 mod
Patch: https://git.openjdk.java.net/jdk/pull/7982.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/7982/head:pull/7982

PR: https://git.openjdk.java.net/jdk/pull/7982

From duke at openjdk.java.net Mon Mar 28 05:32:20 2022
From: duke at openjdk.java.net (Vamsi Parasa)
Date: Mon, 28 Mar 2022 05:32:20 GMT
Subject: RFR: 8283726: x86 intrinsics for compare method in Integer and Long
Message-ID:

Implements x86 intrinsics for compare() method in java.lang.Integer and java.lang.Long.
------------- Commit messages: - add JMH benchmarks - 8283726: x86 intrinsics for compare method in Integer and Long Changes: https://git.openjdk.java.net/jdk/pull/7975/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7975&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283726 Stats: 430 lines in 13 files changed: 428 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7975.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7975/head:pull/7975 PR: https://git.openjdk.java.net/jdk/pull/7975 From duke at openjdk.java.net Mon Mar 28 05:32:20 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Mon, 28 Mar 2022 05:32:20 GMT Subject: RFR: 8283726: x86 intrinsics for compare method in Integer and Long In-Reply-To: References: Message-ID: On Sun, 27 Mar 2022 06:15:34 GMT, Vamsi Parasa wrote: > Implements x86 intrinsics for compare() method in java.lang.Integer and java.lang.Long. This is both complicated and inefficient, I would suggest building the intrinsic in the IR graph so that the compiler can simplify `Integer.compareUnsigned(x, y) < 0` into `x u< y`. Thanks. 
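To make the suggested simplification concrete, here is a small sketch of the equivalence it relies on, written in C++ rather than Java or compiler IR. The function names are hypothetical, the three-way functions only model the documented contracts of `Integer.compare` and `Integer.compareUnsigned`, and nothing here is taken from the PR itself.

```cpp
#include <cstdint>

// Models Integer.compare(x, y): negative, zero or positive.
int compare_signed(int32_t x, int32_t y) {
  return (x < y) ? -1 : ((x == y) ? 0 : 1);
}

// Models Integer.compareUnsigned(x, y): same contract, but the operands
// are reinterpreted as unsigned 32-bit values first.
int compare_unsigned(int32_t x, int32_t y) {
  uint32_t ux = static_cast<uint32_t>(x);
  uint32_t uy = static_cast<uint32_t>(y);
  return (ux < uy) ? -1 : ((ux == uy) ? 0 : 1);
}

// What `compare_unsigned(x, y) < 0` can be reduced to: a single unsigned
// less-than, with no three-way result materialized at all.
bool unsigned_less(int32_t x, int32_t y) {
  return static_cast<uint32_t>(x) < static_cast<uint32_t>(y);
}
```

For every pair of inputs, `compare_unsigned(x, y) < 0` and `unsigned_less(x, y)` give the same answer, which is why a compiler that sees the comparison in its IR could fold the three-way result away instead of emitting a fixed instruction sequence for it.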
-------------

PR: https://git.openjdk.java.net/jdk/pull/7975

From chagedorn at openjdk.java.net Mon Mar 28 06:49:56 2022
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Mon, 28 Mar 2022 06:49:56 GMT
Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5]
In-Reply-To:
References:
Message-ID:

On Mon, 28 Feb 2022 16:22:25 GMT, Christian Hagedorn wrote:

>> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method:
>>
>>     Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k
>>     Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>     V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64
>>     V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec
>>     V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899
>>     V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df
>>     V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69
>>     V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d
>>     V [libjvm.so+0x12091c9] JavaThread::run()+0x167
>>     V [libjvm.so+0x1206ada] Thread::call_run()+0x180
>>     V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f
>>
>> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method.
>>
>> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)):
>>
>>     Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k
>>     Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>     V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607)
>>     V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250)
>>     V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291)
>>     V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966)
>>     V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59)
>>     V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297)
>>     V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280)
>>     V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358)
>>     V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705)
>>
>> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4.
>>
>> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf
>> I added references to the corresponding sections throughout the code.
However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. >> >> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. >> >> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. >> >> **Testing:** >> Apart from manual testing, I've added two kinds of tests: >> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. 
>> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. >> >> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. >> >> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. >> >> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). >> >> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 54 commits: > > - Updating some comments > - Cleanup loading dwarf file and add summary > - Review comments of first pass by Thomas except dwarf file loading > - Merge branch 'master' into JDK-8242181 > - Make dwarf tag NOT_PRODUCT > - Change log_* to log_develop_* and log_warning to log_develop_info > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Better formatting of trace output > - some code move and more cleanups > - ... and 44 more: https://git.openjdk.java.net/jdk/compare/efd3967b...5bea4841 Ping - may I get another review for this change? ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From stefan.karlsson at oracle.com Mon Mar 28 07:39:16 2022 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Mar 2022 09:39:16 +0200 Subject: CFV: New HotSpot Group Member: Albert Mingkun Yang In-Reply-To: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> References: <7B34E834-BCF0-4D4D-A78F-519711CA7120@oracle.com> Message-ID: Vote: yes StefanK On 2022-03-16 13:49, Kim Barrett wrote: > I hereby nominate Albert Mingkun Yang to Membership in the HotSpot Group. > > Albert is a JDK Reviewer and a member of the Oracle GC team. He has made many > substantial contributions [1] including co-authoring an improved GC thread > controller for ZGC. He is a frequent and thorough reviewer, as well as being a > dedicated code deletion engineer, finding many places to reduce complexity or > remove dead code. > > Votes are due by Thursday, 31-March-2022 at 12h00 UTC. > > Only current Members of the HotSpot Group [2] are eligible > to vote on this nomination. Votes must be cast in the open by > replying to this mailing list > > For Lazy Consensus voting instructions, see [3]. 
> > Kim Barrett > > [1] https://github.com/search?q=author-name%3A%22Albert+Mingkun+Yang%22+repo%3Aopenjdk%2Fjdk+merge%3Afalse&type=Commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From jiefu at openjdk.java.net Mon Mar 28 07:49:49 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 28 Mar 2022 07:49:49 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API [v2] In-Reply-To: References: Message-ID: On Mon, 21 Mar 2022 01:19:57 GMT, Xiaohong Gong wrote: > The compiler can get the real type info from `Op_NegVI` that can also handle the `BYTE ` and `SHORT ` basic type. I just don't want to add more new IRs which also need more match rules in the ad files. > > > Is there any performance drop for byte/short negation operation if both of them are handled as a NegVI vector? > > From the benchmark results I showed in the commit message, I didn't see not any performance drop for byte/short. Thanks! There seems no vectorized negation instructions for {byte, short, int, long} on x86, so this should be fine on x86. I tested the patch on x86 and the performance number looks good. ------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From xgong at openjdk.java.net Mon Mar 28 07:49:49 2022 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Mon, 28 Mar 2022 07:49:49 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API [v2] In-Reply-To: References: Message-ID: On Mon, 28 Mar 2022 07:40:48 GMT, Jie Fu wrote: >> The compiler can get the real type info from `Op_NegVI` that can also handle the `BYTE ` and `SHORT ` basic type. I just don't want to add more new IRs which also need more match rules in the ad files. >> >>> Is there any performance drop for byte/short negation operation if both of them are handled as a NegVI vector? >> >> From the benchmark results I showed in the commit message, I didn't see not any performance drop for byte/short. Thanks! 
> >> The compiler can get the real type info from `Op_NegVI` that can also handle the `BYTE ` and `SHORT ` basic type. I just don't want to add more new IRs which also need more match rules in the ad files. >> >> > Is there any performance drop for byte/short negation operation if both of them are handled as a NegVI vector? >> >> From the benchmark results I showed in the commit message, I didn't see not any performance drop for byte/short. Thanks! > > There seems no vectorized negation instructions for {byte, short, int, long} on x86, so this should be fine on x86. > I tested the patch on x86 and the performance number looks good. Thanks for doing this! Yeah, I think the performance for masked negation operations might improve on non avx-512 systems. ------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From jiefu at openjdk.java.net Mon Mar 28 07:49:50 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 28 Mar 2022 07:49:50 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API [v2] In-Reply-To: References: Message-ID: On Tue, 22 Mar 2022 09:58:23 GMT, Xiaohong Gong wrote: >> The current vector `"NEG"` is implemented with substraction a vector by zero in case the architecture does not support the negation instruction. And to fit the predicate feature for architectures that support it, the masked vector `"NEG" ` is implemented with pattern `"v.not(m).add(1, m)"`. They both can be optimized to a single negation instruction for ARM SVE. >> And so does the non-masked "NEG" for NEON. Besides, implementing the masked "NEG" with substraction for architectures that support neither negation instruction nor predicate feature can also save several instructions than the current pattern. >> >> To optimize the VectorAPI negation, this patch moves the implementation from Java side to hotspot. 
The compiler will generate different nodes according to the architecture: >> - Generate the (predicated) negation node if architecture supports it, otherwise, generate "`zero.sub(v)`" pattern for non-masked operation. >> - Generate `"zero.sub(v, m)"` for masked operation if the architecture does not have predicate feature, otherwise generate the original pattern `"v.xor(-1, m).add(1, m)"`. >> >> So with this patch, the following transformations are applied: >> >> For non-masked negation with NEON: >> >> movi v16.4s, #0x0 >> sub v17.4s, v16.4s, v17.4s ==> neg v17.4s, v17.4s >> >> and with SVE: >> >> mov z16.s, #0 >> sub z18.s, z16.s, z17.s ==> neg z16.s, p7/m, z16.s >> >> For masked negation with NEON: >> >> movi v17.4s, #0x1 >> mvn v19.16b, v18.16b >> mov v20.16b, v16.16b ==> neg v18.4s, v17.4s >> bsl v20.16b, v19.16b, v18.16b bsl v19.16b, v18.16b, v17.16b >> add v19.4s, v20.4s, v17.4s >> mov v18.16b, v16.16b >> bsl v18.16b, v19.16b, v20.16b >> >> and with SVE: >> >> mov z16.s, #-1 >> mov z17.s, #1 ==> neg z16.s, p0/m, z16.s >> eor z18.s, p0/m, z18.s, z16.s >> add z18.s, p0/m, z18.s, z17.s >> >> Here are the performance gains for benchmarks (see [1][2]) on ARM and x86 machines(note that the non-masked negation benchmarks do not have any improvement on X86 since no instructions are changed): >> >> NEON: >> Benchmark Gain >> Byte128Vector.NEG 1.029 >> Byte128Vector.NEGMasked 1.757 >> Short128Vector.NEG 1.041 >> Short128Vector.NEGMasked 1.659 >> Int128Vector.NEG 1.005 >> Int128Vector.NEGMasked 1.513 >> Long128Vector.NEG 1.003 >> Long128Vector.NEGMasked 1.878 >> >> SVE with 512-bits: >> Benchmark Gain >> ByteMaxVector.NEG 1.10 >> ByteMaxVector.NEGMasked 1.165 >> ShortMaxVector.NEG 1.056 >> ShortMaxVector.NEGMasked 1.195 >> IntMaxVector.NEG 1.002 >> IntMaxVector.NEGMasked 1.239 >> LongMaxVector.NEG 1.031 >> LongMaxVector.NEGMasked 1.191 >> >> X86 (non AVX-512): >> Benchmark Gain >> ByteMaxVector.NEGMasked 1.254 >> ShortMaxVector.NEGMasked 1.359 >> IntMaxVector.NEGMasked 
1.431 >> LongMaxVector.NEGMasked 1.989 >> >> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1881 >> [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1896 > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Add a superclass for vector negation src/hotspot/share/opto/vectornode.cpp line 1592: > 1590: > 1591: // Generate other vector nodes to implement the masked/non-masked vector negation. > 1592: Node* VectorNode::degenerate_vector_integral_negate(Node* n, int vlen, BasicType bt, PhaseGVN* phase, bool is_predicated) { Shall we move this declaration in `class NegVNode` since it is only used by NegVNode::Ideal ? ------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From xgong at openjdk.java.net Mon Mar 28 07:49:50 2022 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Mon, 28 Mar 2022 07:49:50 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API [v2] In-Reply-To: References: Message-ID: On Mon, 28 Mar 2022 07:43:29 GMT, Jie Fu wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: >> >> Add a superclass for vector negation > > src/hotspot/share/opto/vectornode.cpp line 1592: > >> 1590: >> 1591: // Generate other vector nodes to implement the masked/non-masked vector negation. >> 1592: Node* VectorNode::degenerate_vector_integral_negate(Node* n, int vlen, BasicType bt, PhaseGVN* phase, bool is_predicated) { > > Shall we move this declaration in `class NegVNode` since it is only used by NegVNode::Ideal ? I think it can be. Thanks for pointing out this. I will change this later. 
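For reference, the patterns named in the PR description (`zero.sub(v, m)` and `v.xor(-1, m).add(1, m)`) compute the same result as a direct negation in two's complement arithmetic. Below is a minimal scalar sketch in C++, not HotSpot code: the helper names `wrap_neg`, `neg_by_sub` and `neg_by_xor_add` are made up for illustration, and the mask handling only mimics per-lane predication (inactive lanes pass through unchanged).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration only; not HotSpot or Vector API functions.
using Lanes = std::array<int32_t, 4>;
using Mask  = std::array<bool, 4>;

// Negate through unsigned arithmetic so INT32_MIN wraps instead of overflowing.
// (Converting back to int32_t is modular in C++20 and on common compilers before that.)
static int32_t wrap_neg(int32_t x) {
  return static_cast<int32_t>(0u - static_cast<uint32_t>(x));
}

// Pattern "zero.sub(v, m)": subtract from zero in active lanes only.
Lanes neg_by_sub(const Lanes& v, const Mask& m) {
  Lanes r{};
  for (std::size_t i = 0; i < v.size(); i++) {
    r[i] = m[i] ? wrap_neg(v[i]) : v[i];
  }
  return r;
}

// Pattern "v.xor(-1, m).add(1, m)": bitwise NOT plus one in active lanes only.
Lanes neg_by_xor_add(const Lanes& v, const Mask& m) {
  Lanes r{};
  for (std::size_t i = 0; i < v.size(); i++) {
    uint32_t u = static_cast<uint32_t>(v[i]);
    r[i] = m[i] ? static_cast<int32_t>((u ^ 0xFFFFFFFFu) + 1u) : v[i];
  }
  return r;
}
```

Both helpers agree lane by lane, including for `INT32_MIN`, which is why the compiler is free to pick whichever form maps to the fewest instructions on a given architecture.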
------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From jiefu at openjdk.java.net Mon Mar 28 08:03:50 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 28 Mar 2022 08:03:50 GMT Subject: RFR: 8282162: [vector] Optimize vector negation API In-Reply-To: References: <-E5E_NBci6gsGyOV5nWuTUNKLVnjiw2IiWjjgv2vFz0=.ebe7c447-ede9-4437-815c-a2004f9d6ce1@github.com> Message-ID: <4Ecgum5Cb8oDUN02LIkOsU_3tYOLXASEq1znc3JhT28=.d69bc461-d84c-405a-9aad-945b04ca1a1e@github.com> On Tue, 15 Mar 2022 02:47:20 GMT, Xiaohong Gong wrote: > > Note that in terms of Java semantics, negation of floating point values needs to be implemented as subtraction from negative zero rather than positive zero: > > double negate(double arg) {return -0.0 - arg; } > > This is to handle signed zeros correctly. > > Hi @jddarcy ,thanks for looking at this PR and thanks for the notes on the floating point negation! Yeah, this really makes sense to me. Kindly note that this patch didn't touch the negation of the floating point values. For Vector API, the vector floating point negation has been intrinsified to `NegVF/D` node by compiler that we directly generate the negation instructions for them. Thanks! I would suggest changing the JBS title like `[vector] Optimize non-floating vector negation API` . ------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From stuefe at openjdk.java.net Mon Mar 28 08:47:52 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 28 Mar 2022 08:47:52 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: On Mon, 28 Mar 2022 06:46:30 GMT, Christian Hagedorn wrote: > Ping - may I get another review for this change? I'll take a look later today or tomorrow. This is nice stuff, I'm looking forward to having it upstream. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From chagedorn at openjdk.java.net Mon Mar 28 09:13:44 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Mon, 28 Mar 2022 09:13:44 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: On Mon, 28 Mar 2022 08:44:10 GMT, Thomas Stuefe wrote: > > Ping - may I get another review for this change? > > I'll take a look later today or tomorrow. This is nice stuff, I'm looking forward to having it upstream. That's great, thanks Thomas! ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From ysuenaga at openjdk.java.net Mon Mar 28 09:16:56 2022 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Mon, 28 Mar 2022 09:16:56 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: On Mon, 28 Feb 2022 16:22:25 GMT, Christian Hagedorn wrote: >> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: >> >> Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f 
>> >> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. >> >> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): >> >> Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) >> >> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. 
>> >> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf >> I added references to the corresponding sections throughout the code. However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. >> >> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. >> >> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. >> >> **Testing:** >> Apart from manual testing, I've added two kinds of tests: >> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". 
On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. >> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. >> >> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. >> >> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. >> >> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). >> >> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 54 commits: > > - Updating some comments > - Cleanup loading dwarf file and add summary > - Review comments of first pass by Thomas except dwarf file loading > - Merge branch 'master' into JDK-8242181 > - Make dwarf tag NOT_PRODUCT > - Change log_* to log_develop_* and log_warning to log_develop_info > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Better formatting of trace output > - some code move and more cleanups > - ... and 44 more: https://git.openjdk.java.net/jdk/compare/efd3967b...5bea4841 As I said before, I think it would be nice to share the DWARF parser between SA and HotSpot. Can you expose these mechanisms? It may be another RFE, and we may need to consider other platforms. ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From xgong at openjdk.java.net Mon Mar 28 09:56:22 2022 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Mon, 28 Mar 2022 09:56:22 GMT Subject: RFR: 8282162: [vector] Optimize integral vector negation API [v3] In-Reply-To: References: Message-ID: <-sL2chUr7o8qwgvCm4qS_teHYtUu9rVGUa2GFWb9yC8=.14b0f006-56dc-4245-9746-ff248478e9ed@github.com> > The current vector `"NEG"` is implemented by subtracting the vector from zero in case the architecture does not support the negation instruction. And to fit the predicate feature for architectures that support it, the masked vector `"NEG"` is implemented with the pattern `"v.not(m).add(1, m)"`. Both can be optimized to a single negation instruction for ARM SVE. > So can the non-masked "NEG" for NEON. Besides, implementing the masked "NEG" with subtraction for architectures that support neither the negation instruction nor the predicate feature can also save several instructions compared to the current pattern.
> > To optimize the VectorAPI negation, this patch moves the implementation from Java side to hotspot. The compiler will generate different nodes according to the architecture: > - Generate the (predicated) negation node if architecture supports it, otherwise, generate "`zero.sub(v)`" pattern for non-masked operation. > - Generate `"zero.sub(v, m)"` for masked operation if the architecture does not have predicate feature, otherwise generate the original pattern `"v.xor(-1, m).add(1, m)"`. > > So with this patch, the following transformations are applied: > > For non-masked negation with NEON: > > movi v16.4s, #0x0 > sub v17.4s, v16.4s, v17.4s ==> neg v17.4s, v17.4s > > and with SVE: > > mov z16.s, #0 > sub z18.s, z16.s, z17.s ==> neg z16.s, p7/m, z16.s > > For masked negation with NEON: > > movi v17.4s, #0x1 > mvn v19.16b, v18.16b > mov v20.16b, v16.16b ==> neg v18.4s, v17.4s > bsl v20.16b, v19.16b, v18.16b bsl v19.16b, v18.16b, v17.16b > add v19.4s, v20.4s, v17.4s > mov v18.16b, v16.16b > bsl v18.16b, v19.16b, v20.16b > > and with SVE: > > mov z16.s, #-1 > mov z17.s, #1 ==> neg z16.s, p0/m, z16.s > eor z18.s, p0/m, z18.s, z16.s > add z18.s, p0/m, z18.s, z17.s > > Here are the performance gains for benchmarks (see [1][2]) on ARM and x86 machines(note that the non-masked negation benchmarks do not have any improvement on X86 since no instructions are changed): > > NEON: > Benchmark Gain > Byte128Vector.NEG 1.029 > Byte128Vector.NEGMasked 1.757 > Short128Vector.NEG 1.041 > Short128Vector.NEGMasked 1.659 > Int128Vector.NEG 1.005 > Int128Vector.NEGMasked 1.513 > Long128Vector.NEG 1.003 > Long128Vector.NEGMasked 1.878 > > SVE with 512-bits: > Benchmark Gain > ByteMaxVector.NEG 1.10 > ByteMaxVector.NEGMasked 1.165 > ShortMaxVector.NEG 1.056 > ShortMaxVector.NEGMasked 1.195 > IntMaxVector.NEG 1.002 > IntMaxVector.NEGMasked 1.239 > LongMaxVector.NEG 1.031 > LongMaxVector.NEGMasked 1.191 > > X86 (non AVX-512): > Benchmark Gain > ByteMaxVector.NEGMasked 1.254 > 
ShortMaxVector.NEGMasked 1.359 > IntMaxVector.NEGMasked 1.431 > LongMaxVector.NEGMasked 1.989 > > [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1881 > [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1896 Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Make "degenerate_vector_integral_negate" to be "NegVI" private ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7782/files - new: https://git.openjdk.java.net/jdk/pull/7782/files/97c8119a..48f4d6be Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7782&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7782&range=01-02 Stats: 15 lines in 2 files changed: 6 ins; 1 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/7782.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7782/head:pull/7782 PR: https://git.openjdk.java.net/jdk/pull/7782 From chagedorn at openjdk.java.net Mon Mar 28 11:21:41 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Mon, 28 Mar 2022 11:21:41 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: On Mon, 28 Mar 2022 09:13:39 GMT, Yasumasa Suenaga wrote: > As I said before, I think it would be nice to share DWARF parser between SA and HotSpot. Can you expose these mechanisms? It may be another RFE, and may need to think other platforms. That would be good to have. However, I'm not familiar with the SA code and how it works to share code with HotSpot. And I'm also not sure how much overlap the two parsers actually have. 
I quickly skimmed through the DWARF parsing code of the SA and it seems that its main use is parsing call frame information (as described in section 6.4 of the DWARF 4 spec), which is not supported/needed in this patch. There is still some code that could be shared though, like opening a DWARF file with its checks or reading a LEB128 value etc. It might be worth investigating further whether the two implementations can be merged/reused to some extent. But I propose to file a separate RFE for that. What do you think? ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From simonis at openjdk.java.net Mon Mar 28 11:24:42 2022 From: simonis at openjdk.java.net (Volker Simonis) Date: Mon, 28 Mar 2022 11:24:42 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v8] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 13:36:27 GMT, Boris Ulasevich wrote: >> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. Currently only the aarch64 backend is adapted to make use of these changes. >> >> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. >> >> As a side effect, the performance of some tests is slightly improved: >> ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` >> >> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains eight commits: > > - Merge branch 'openjdk:master' into codecache_segments_order > - review findings: use instruction_size istead of raw constant, strengthen the assert, check alignment, move comments, segments order: profiled - non_method - non_profiled > - rename, adding test > - moving nops out of far_jump > - minor renaming > - review comments. remove far_call limit. undo trampoline-to-farcall. add trampoline_needs_far_jump func > - fix name: is_non_nmethod, adding target_needs_far_branch func > - change codecache segments order: nonprofiled-nonmethod-profiled > increase far jump threshold: sideof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M Thanks, looks good now. ------------- Marked as reviewed by simonis (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7517 From dholmes at openjdk.java.net Mon Mar 28 11:42:41 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 28 Mar 2022 11:42:41 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() In-Reply-To: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: On Wed, 23 Mar 2022 06:19:38 GMT, Ioi Lam wrote: > - Remove all the complex `sscanf()` calls in `Arguments::parse_argument()` > - Call the appropriate parsing function according to the type of the flag > - Added more test cases for flags of the `double` type. > > As a result of this change, `double` flags can now be specified in more ways, as long as the input is accepted by `strtod()`. However, `NaN` and `INFINITY` values are not allowed because the VM probably cannot handle them. Please see the test case for details. > > Tested with tiers 1-5. src/hotspot/share/runtime/arguments.cpp line 905: > 903: return false; > 904: } > 905: if (g_isnan(v) || !g_isfinite(v)) { Surely the not-sign should not be there ??? 
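A note on the check quoted above: with standard C/C++ semantics, `isfinite()` returns false for both infinities and for NaN, so a test of the form `isnan(v) || !isfinite(v)` rejects exactly the NaN and infinite inputs (the `isnan` part is redundant but harmless). The sketch below illustrates this with `std::isnan`/`std::isfinite` as stand-ins for HotSpot's `g_isnan`/`g_isfinite`, which are assumed here to behave the same way; `parse_double_flag` is a hypothetical helper, not the function from the patch.

```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <cmath>
#include <cstdlib>

// Hypothetical helper: parse a double-valued flag, rejecting NaN and infinities.
bool parse_double_flag(const char* value, double* out) {
  // strtod() skips leading whitespace, but a flag value should not start with it.
  if (*value == '\0' || std::isspace(static_cast<unsigned char>(*value))) {
    return false;
  }
  errno = 0;
  char* end = nullptr;
  double v = std::strtod(value, &end);
  if (errno != 0 || *end != '\0') {
    return false;  // out of range, or trailing characters after the number
  }
  // isfinite() is false for +/-infinity and for NaN, so "!isfinite" is the
  // condition that rejects bad values; the isnan() test adds nothing new.
  if (std::isnan(v) || !std::isfinite(v)) {
    return false;
  }
  *out = v;
  return true;
}
```

Dropping the negation would invert the check and reject every ordinary finite value instead.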
------------- PR: https://git.openjdk.java.net/jdk/pull/7916 From dholmes at openjdk.java.net Mon Mar 28 12:02:48 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 28 Mar 2022 12:02:48 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() In-Reply-To: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: On Wed, 23 Mar 2022 06:19:38 GMT, Ioi Lam wrote: > - Remove all the complex `sscanf()` calls in `Arguments::parse_argument()` > - Call the appropriate parsing function according to the type of the flag > - Added more test cases for flags of the `double` type. > > As a result of this change, `double` flags can now be specified in more ways, as long as the input is accepted by `strtod()`. However, `NaN` and `INFINITY` values are not allowed because the VM probably cannot handle them. Please see the test case for details. > > Tested with tiers 1-5. Hi Ioi, This certainly seems a lot clearer and cleaner! I have a few initial comments below. Thanks, David src/hotspot/share/runtime/arguments.cpp line 896: > 894: > 895: static bool set_fp_numeric_flag(JVMFlag* flag, const char* value, JVMFlagOrigin origin) { > 896: if (*value == '\0' || isspace(*value)) { Please precede this with a comment: `// strtod allows leading whitespace, but our flag format does not.` src/hotspot/share/runtime/arguments.cpp line 1062: > 1060: if (('a' <= c && c <= 'z' ) || > 1061: ('A' <= c && c <= 'Z' ) || > 1062: ('0' <= c && c <= '9' ) || Can't you use `isalnum(c)`? src/hotspot/share/runtime/arguments.cpp line 1101: > 1099: if (arg[0] == ':' && arg[1] == '=') { > 1100: // -XX:Foo:=xxx will reset the string flag to the given value.
> 1101: const char* value = arg + 2; I wasn't aware of this syntax and I can't see what the difference is between = and := in our code (this seems to be something from makefile variable setting logic!). ?? ------------- PR: https://git.openjdk.java.net/jdk/pull/7916 From stuefe at openjdk.java.net Mon Mar 28 13:19:19 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 28 Mar 2022 13:19:19 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: On Mon, 28 Feb 2022 16:22:25 GMT, Christian Hagedorn wrote: >> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: >> >> Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f >> >> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. 
>> >> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): >> >> Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) >> >> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. >> >> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf >> I added references to the corresponding sections throughout the code. 
However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. >> >> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. >> >> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. >> >> **Testing:** >> Apart from manual testing, I've added two kinds of tests: >> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. 
>> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. >> >> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. >> >> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. >> >> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). >> >> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 54 commits: > > - Updating some comments > - Cleanup loading dwarf file and add summary > - Review comments of first pass by Thomas except dwarf file loading > - Merge branch 'master' into JDK-8242181 > - Make dwarf tag NOT_PRODUCT > - Change log_* to log_develop_* and log_warning to log_develop_info > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Better formatting of trace output > - some code move and more cleanups > - ... and 44 more: https://git.openjdk.java.net/jdk/compare/efd3967b...5bea4841 Hi Christian, this is impressive work. It's a big change, and I had a look at part of it. I'll continue tomorrow. In general, I'm concerned with the use of both UL and ResourceArea in this code. I know the use of UL has been discussed, but still. Use of RA will prevent us from getting useful callstacks if we crash and Thread::current is NULL or invalid. I'd feel better if we were to consistently rely on an outside scratch buffer (like we usually do in error reporting). Even raw ::malloc would be better IMHO. Another concern is safety, since this is a potential attack vector with manipulated Dwarf files, if someone manages to provoke a crash. Maybe far-fetched, but still. It would be good to get, e.g., SonarCloud readings for this code. More remarks inline. Cheers, Thomas src/hotspot/os/windows/decoder_windows.cpp line 39: > 37: > 38: bool Decoder::get_source_info(address pc, char* filename, size_t filename_len, int* line, bool is_pc_after_call) { > 39: return SymbolEngine::get_source_info(pc, filename, filename_len, line); Are all these renaming changes on Windows necessary? Would be easier to review without, and easier to backport. That could be done in a separate RFE if needed.

src/hotspot/share/utilities/decoder.hpp line 120:

> 118: // file name. File name will be silently truncated if output buffer is too small.
> 119: // If is_pc_after_call is true, then pc is treated as pointing to the next instruction
> 120: // after a call. The source information for the call instruction is fetched in that case.

I tried to understand how this is used. In NMT, you set it depending on whether the PC belongs to the lowest frame. But that frame is not really the lowest frame - `NativeCallStack()` -> `os::get_native_stack` skips the first n frames. I don't understand what makes the first non-skipped frame different from the others.

src/hotspot/share/utilities/decoder_elf.cpp line 59:

> 57: if (filename == nullptr || filename_len <= 0 || line == nullptr) {
> 58: return false;
> 59: }

I'd just assert here.

src/hotspot/share/utilities/elfFile.cpp line 119:

> 117: return;
> 118: }
> 119: strcpy(_filepath, filepath);

This whole section could be shortened by using os::strdup().

src/hotspot/share/utilities/elfFile.cpp line 132:

> 130: if (_filepath != NULL) {
> 131: os::free((void*)_filepath);
> 132: }

Since you removed the NULL checks in similar places, you could remove it here too.

src/hotspot/share/utilities/elfFile.cpp line 136:

> 134: delete _shdr_string_table;
> 135: _shdr_string_table = nullptr;
> 136: delete _next;

Recursive delete; may run out of stack, especially if this runs in error reporting. Who knows how much stack we have. Before your patch this may have been optimized away with TCO, but not anymore.

src/hotspot/share/utilities/elfFile.cpp line 289:

> 287: return sect_index;
> 288: }
> 289:

This was introduced by Volker as part of JDK-8019929. It is still used in PPC coding (see above). Any reason you removed this? You may get failing builds on PPC.
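
To illustrate the recursive-delete remark on elfFile.cpp line 136 above: a minimal sketch of tearing down a singly linked chain iteratively, so that stack usage stays constant however long the chain is. `Node`, `make_chain` and `delete_chain` are hypothetical names made up for this illustration; this is not the HotSpot `ElfFile` code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a chained object such as a list of ELF files.
struct Node {
  Node* next;
  static int live;                       // number of live nodes, illustration only
  explicit Node(Node* n) : next(n) { ++live; }
  ~Node() { --live; }                    // deliberately does NOT delete next
};
int Node::live = 0;

// Builds a chain of n nodes and returns its head.
Node* make_chain(size_t n) {
  Node* head = nullptr;
  for (size_t i = 0; i < n; ++i) {
    head = new Node(head);
  }
  return head;
}

// Deletes the whole chain in a loop; no destructor recursion, so the
// stack depth does not grow with the length of the chain.
void delete_chain(Node* head) {
  while (head != nullptr) {
    Node* next = head->next;
    head->next = nullptr;
    delete head;
    head = next;
  }
}
```

The trade-off is that the owner of the chain has to call the loop explicitly instead of relying on the destructor.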

src/hotspot/share/utilities/elfFile.cpp line 309:

> 307: bool ElfFile::get_source_info(const uint32_t offset_in_library, char* filename, const size_t filename_len, int* line, bool is_pc_after_call) {
> 308: ResourceMark rm;
> 309: // (1)

Where's (2)? :)

src/hotspot/share/utilities/elfFile.cpp line 324:

> 322:
> 323: // Store result in filename and line pointer.
> 324: if (!_dwarf_file->get_filename_and_line_number(offset_in_library, *filename, filename_len, *line, is_pc_after_call)) {

Could we stick with pointer syntax here instead of switching to refs, since this is what we do in the other places?

src/hotspot/share/utilities/elfFile.cpp line 350:

> 348:
> 349: const size_t dwarf_filepath_len = strlen(dwarf_filename) + strlen(_filepath) + strlen(".debug/")
> 350: + strlen(usr_lib_debug_directory()) + 2;

Unrelated to your patch, but I'd just wire up usr_lib_debug_directory as a constant.

src/hotspot/share/utilities/elfFile.cpp line 357:

> 355: }
> 356:
> 357: DwarfFilePath dwarf_file_path(dwarf_filename, dwarf_filepath_buf, dwarf_filepath_len);

I needed a while to comprehend this.

So, `dwarf_filename` is actually not just the name but the whole .debuginfo section. Our intent is to read the whole section - name + padding + CRC - and pass it into DwarfFilePath, which extracts the file name and the CRC checksum, right?

What makes me itchy is that we have no guarantee and no checks that we actually read the whole section, that the file name is actually zero-terminated, as it is supposed to be, and that, when accessing the supposed crc, we are still within whatever we allocated.

If I understood correctly: I would prefer a better naming for dwarf_filename (proposal: "debuginfo_section"). Maybe a different type, since "const char*" is misleading. And I would rename `get_dwarf_filename()` similarly, and let it return the section size too, and verify that size later when extracting the CRC, and also verify that the name is zero terminated. E.g. to defend against malicious attacks with manipulated debug infos.

src/hotspot/share/utilities/elfFile.cpp line 378:

> 376: }
> 377:
> 378: char* debug_filename = NEW_RESOURCE_ARRAY(char, shdr.sh_size);

With things read from outside, I would sanity check the size. E.g. give it a reasonable max like either file size or, Idk, 512M or so.

src/hotspot/share/utilities/elfFile.cpp line 450:

> 448: if (buf == nullptr) {
> 449: return false;
> 450: }

I'd move this close to and local to where it is used. Also, you seem to repeat the same pattern a lot: "NEW_RESOURCE_ARRAY(n), if error return something". I'd factor this out to a utility function or utility macro, maybe one where you pass the error return value as macro parameter.

src/hotspot/share/utilities/elfFile.cpp line 452:

> 450: }
> 451:
> 452: ElfStringTable* const table = _shdr_string_table;

You only use this in one place, I'd probably just use _shdr_string_table below.

src/hotspot/share/utilities/elfFile.cpp line 464:

> 462: if (table->string_at(hdr.sh_name, buf, len)) {
> 463: if (strncmp(buf, name, len) == 0) {
> 464: return true;

Would things like this be worth a debug assert? Since this seems to indicate either a corrupted dwarf file or something is wrong with our understanding of how dwarf works.

src/hotspot/share/utilities/elfFile.cpp line 546:

> 544: // Must be equal, otherwise the file is corrupted.
> 545: return create_new_dwarf_file(filepath);
> 546: }

Since you bail out usually on conditions, I'd reverse the logic here too, "if CRC invalid then log warning and return false".

src/hotspot/share/utilities/elfFile.hpp line 212:

> 210: const char* _filename;
> 211: char* _path;
> 212: const size_t _path_len;

I'd make an explicit comment that _path_len is supposed to include the terminating zero. I'd also rename it to something like _out and _out_len, would IMHO be clearer to the casual reader.
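
The checks asked for in the elfFile.cpp line 357 comment above (section fully read, name zero-terminated, CRC still inside the buffer) could look roughly like the following. This is only a sketch under assumptions: it assumes the section has the layout documented for `.gnu_debuglink` (zero-terminated name, padding to a 4-byte boundary, 4-byte CRC in the file's byte order), and `DebugLink` and `parse_debuglink` are hypothetical names, not functions from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Parsed view of the section; name points into the caller's buffer.
struct DebugLink {
  const char* name;
  uint32_t crc;
};

// Hypothetical helper: validates a debug link section of `size` bytes
// before anything is read from it. Returns false on any inconsistency.
bool parse_debuglink(const unsigned char* section, size_t size, DebugLink* out) {
  // Smallest valid section: 1 name char + NUL, padded to 4, plus 4 CRC bytes.
  if (section == nullptr || out == nullptr || size < 8) {
    return false;
  }
  // The name must be zero-terminated inside the section.
  const void* nul = std::memchr(section, '\0', size);
  if (nul == nullptr) {
    return false;
  }
  const size_t name_len =
      static_cast<size_t>(static_cast<const unsigned char*>(nul) - section);
  if (name_len == 0) {
    return false;
  }
  // The CRC starts at the next 4-byte boundary after name + NUL.
  const size_t crc_offset = (name_len + 1 + 3) & ~static_cast<size_t>(3);
  if (crc_offset > size || size - crc_offset < sizeof(uint32_t)) {
    return false;
  }
  uint32_t crc;
  std::memcpy(&crc, section + crc_offset, sizeof(crc));  // host byte order assumed
  out->name = reinterpret_cast<const char*>(section);
  out->crc = crc;
  return true;
}
```

Passing the section size alongside the pointer is what makes every later access checkable; a bare `const char*` cannot carry that information.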

src/hotspot/share/utilities/elfFile.hpp line 217:

> 215: public:
> 216: DwarfFilePath(const char* filename, char* buf, size_t buf_len) : _filename(filename), _path(buf), _path_len(buf_len) {
> 217: size_t offset = (strlen(filename) + 4) >> 2u;

Why not use align()? Would be more readable.

src/hotspot/share/utilities/elfFile.hpp line 234:

> 232:
> 233: void set(const char* src) {
> 234: strncpy(_path, src, _path_len);

Won't zero-terminate if srclen >= _path_len. To be sure, I'd use `jio_snprintf(_path, _path_len, "%s", src);`. Or somesuch.

src/hotspot/share/utilities/elfFile.hpp line 235:

> 233: void set(const char* src) {
> 234: strncpy(_path, src, _path_len);
> 235: }

I may be paranoid, but I would add a canary at the end of the output buffer, debug only, and check that in ~DwarfFilePath. Either that or use os::malloc. With os::malloc(), you get to use NMT and get overwrite recognition for free (we recently reworked that and it works more reliably now, and in release too).

src/hotspot/share/utilities/elfFile.hpp line 243:

> 241: void set_after_last_slash(const char* src) {
> 242: char* last_slash = strrchr(_path, '/');
> 243: strncpy(last_slash + 1, src, _path_len);

This does not guard against overwrites. _path_len is the wrong length to use here since it's the total length, not the last part. Also strncpy does not truncate. Also maybe verify that there is an actual slash.

Proposal:

    char* const last_slash = strrchr(_path, '/');
    if (last_slash) {
      size_t remaining = _path_len - (last_slash - _path) - 1;
      if (remaining > 0) {
        jio_snprintf(last_slash + 1, remaining, "%s", src);
      }
    }

src/hotspot/share/utilities/elfFile.hpp line 247:

> 245:
> 246: void append(const char* src) {
> 247: strncat(_path, src, _path_len);

strncat "Appends the first num characters of source to destination, plus a terminating null-character." So, it should be _path_len - 1, otherwise the terminating zero may overwrite the buffer end.

-------------

Changes requested by stuefe (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7126 From stuefe at openjdk.java.net Mon Mar 28 13:19:22 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 28 Mar 2022 13:19:22 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v4] In-Reply-To: References: Message-ID: On Tue, 8 Feb 2022 08:17:17 GMT, Christian Hagedorn wrote: >> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: >> >> Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f >> >> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. 
>> >> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): >> >> Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) >> >> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. >> >> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf >> I added references to the corresponding sections throughout the code. 
However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. >> >> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. >> >> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. >> >> **Testing:** >> Apart from manual testing, I've added two kinds of tests: >> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. 
>> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. >> >> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. >> >> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. >> >> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). >> >> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: > > Make dwarf tag NOT_PRODUCT src/hotspot/share/utilities/decoder_elf.cpp line 67: > 65: if (!os::dll_address_to_library_name(pc, filepath, sizeof(filepath), &offset_in_library) || offset_in_library < 0) { > 66: // Method not found. 
offset_in_library should not overflow. > 67: log_develop_info(dwarf)("Did not find library for address " INTPTR_FORMAT, p2i(pc)); I know this has been discussed and decided, but I feel uncomfortable about this logging here. Also because it sets a precedent for using UL inside signal handling. ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From stuefe at openjdk.java.net Mon Mar 28 13:19:23 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 28 Mar 2022 13:19:23 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v4] In-Reply-To: References: Message-ID: On Thu, 24 Feb 2022 08:14:34 GMT, Thomas Stuefe wrote: >> Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: >> >> Make dwarf tag NOT_PRODUCT > > src/hotspot/share/utilities/decoder_elf.cpp line 67: > >> 65: if (!os::dll_address_to_library_name(pc, filepath, sizeof(filepath), &offset_in_library) || offset_in_library < 0) { >> 66: // Method not found. offset_in_library should not overflow. >> 67: log_develop_info(dwarf)("Did not find library for address " INTPTR_FORMAT, p2i(pc)); > > I know this has been discussed and decided, but I feel uncomfortable about this logging here. Also because it sets a precedent for using UL inside signal handling. Note, if you do log, it would be nice to be precise and distinguish between dll_address_to_library_name returning false and returning an offset outside the library bounds. 
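
A rough sketch of what distinguishing the two failure cases could look like, assuming the lookup reports a success flag plus a signed offset as in the quoted condition. `LookupStatus`, `classify_lookup` and `describe` are hypothetical names for illustration and do not exist in HotSpot.

```cpp
#include <cassert>

// Hypothetical result type: one value per failure mode, so that each
// case can get its own message instead of a shared one.
enum class LookupStatus {
  ok,
  library_not_found,
  offset_out_of_range
};

// Classifies the outcome of a library lookup that yields a "found" flag
// and a signed offset into the library.
LookupStatus classify_lookup(bool library_found, int offset_in_library) {
  if (!library_found) {
    return LookupStatus::library_not_found;
  }
  if (offset_in_library < 0) {
    return LookupStatus::offset_out_of_range;
  }
  return LookupStatus::ok;
}

// Static message per status; no allocation, so it is usable from
// error-reporting paths.
const char* describe(LookupStatus status) {
  switch (status) {
    case LookupStatus::ok:                  return "library found";
    case LookupStatus::library_not_found:   return "did not find library for address";
    case LookupStatus::offset_out_of_range: return "offset in library is out of range";
  }
  return "unknown";
}
```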
------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From stuefe at openjdk.java.net Mon Mar 28 13:19:24 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 28 Mar 2022 13:19:24 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: On Mon, 28 Mar 2022 12:20:44 GMT, Thomas Stuefe wrote: >> Christian Hagedorn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 54 commits: >> >> - Updating some comments >> - Cleanup loading dwarf file and add summary >> - Review comments of first pass by Thomas except dwarf file loading >> - Merge branch 'master' into JDK-8242181 >> - Make dwarf tag NOT_PRODUCT >> - Change log_* to log_develop_* and log_warning to log_develop_info >> - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java >> >> Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> >> - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java >> >> Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> >> - Better formatting of trace output >> - some code move and more cleanups >> - ... and 44 more: https://git.openjdk.java.net/jdk/compare/efd3967b...5bea4841 > > src/hotspot/share/utilities/elfFile.hpp line 217: > >> 215: public: >> 216: DwarfFilePath(const char* filename, char* buf, size_t buf_len) : _filename(filename), _path(buf), _path_len(buf_len) { >> 217: size_t offset = (strlen(filename) + 4) >> 2u; > > Why not use align()? Would be more readable. See also comments in load_dwarf_file(). I'd really be more happy with more verifications. 
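
For readers wondering what the shift in the quoted line 217 computes: `(strlen(filename) + 4) >> 2u` is the index, counted in 4-byte words, of the first word after the name and its terminating zero. A small sketch showing that an explicit align-up expresses the same thing; `align_up_pow2` is a stand-in written for this example, not HotSpot's own alignment helper.

```cpp
#include <cassert>
#include <cstddef>

// Rounds value up to the next multiple of alignment.
// alignment must be a power of two.
constexpr size_t align_up_pow2(size_t value, size_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Word index as written in the quoted code: shift-based.
constexpr size_t crc_word_index_shift(size_t name_len) {
  return (name_len + 4) >> 2u;
}

// Same index, spelled out: name plus terminating zero, aligned up to
// 4 bytes, then converted from bytes to 4-byte words.
constexpr size_t crc_word_index_aligned(size_t name_len) {
  return align_up_pow2(name_len + 1, 4) / 4;
}
```

Both forms agree for every name length; the second one states the intent (name, terminator, padding) directly.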

-------------

PR: https://git.openjdk.java.net/jdk/pull/7126

From rkennke at openjdk.java.net Mon Mar 28 13:29:12 2022
From: rkennke at openjdk.java.net (Roman Kennke)
Date: Mon, 28 Mar 2022 13:29:12 GMT
Subject: RFR: 8283710: JVMTI: GC abstraction for ObjectMarker
Message-ID:

JVMTI heap walking marks objects in order to track which have been visited already. In order to do that, it uses bits in the object header. Those are the same bits that are also used by some GCs to mark objects (the lowest two bits, also used by locking code). Some GCs also use the bits in order to indicate 'forwarded' objects, where the upper bits of the header represent the forward-pointer. In the case of Shenandoah, it's even more problematic because this happens concurrently, even while JVMTI heap walks can intercept. So far we carefully worked around that problem, but it becomes very problematic in Lilliput, where access to the Klass* also requires decoding the header and figuring out which bits mean what.

Taking a step back, it should not be JVMTI's business to mess with GC marking bits. Instead, the GC should provide this functionality to JVMTI, and implement it in a way that is suitable for the active GC. For example, in Shenandoah GC we would probably rather use a marking bitmap instead of letting JVMTI mark in the object header.

I would like to propose a GC abstraction to enable this. The proposed change provides an abstract class ObjectMarker, a single implementation HeaderObjectMarker (https://github.com/openjdk/lilliput/pull/45 proposes another implementation that uses bitmaps) and an ObjectMarkerController which manages the lifecycle. IMO, this is cleaner than the current impl in jvmtiTagMap: it keeps all state in the ObjectMarker implementation rather than using global state, and it separates concerns better.
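
To make the shape of such an abstraction concrete, here is a minimal sketch. It is not the interface from the patch: the class and method names below (`ObjectMarkerSketch`, `SideTableMarker`, `visit_once`) are assumptions chosen for illustration, and a hash set stands in for a real marking bitmap.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical marking interface: the heap walker only asks "mark" and
// "is marked", and never touches object headers itself.
class ObjectMarkerSketch {
public:
  virtual ~ObjectMarkerSketch() = default;
  virtual void mark(const void* obj) = 0;
  virtual bool is_marked(const void* obj) const = 0;
};

// One possible implementation that keeps marks in a side table, so the
// object header stays untouched. A real GC would more likely use a bitmap.
class SideTableMarker : public ObjectMarkerSketch {
  std::unordered_set<const void*> _marked;
public:
  void mark(const void* obj) override { _marked.insert(obj); }
  bool is_marked(const void* obj) const override {
    return _marked.count(obj) != 0;
  }
};

// Heap-walk style helper: returns true only the first time obj is seen.
bool visit_once(ObjectMarkerSketch& marker, const void* obj) {
  if (marker.is_marked(obj)) {
    return false;
  }
  marker.mark(obj);
  return true;
}
```

All marking state lives in the marker object, so dropping the marker at the end of a walk also drops the marks, with no header bits to restore.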
Testing: - [x] tier1 - [x] tier2 - [ ] tier3 ------------- Commit messages: - Restore missing include - Simpler needs_reset handling - 8283710: JVMTI: GC abstraction for ObjectMarker Changes: https://git.openjdk.java.net/jdk/pull/7964/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7964&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283710 Stats: 361 lines in 5 files changed: 222 ins; 133 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/7964.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7964/head:pull/7964 PR: https://git.openjdk.java.net/jdk/pull/7964 From shade at openjdk.java.net Mon Mar 28 17:34:07 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 28 Mar 2022 17:34:07 GMT Subject: RFR: 8281469: aarch64: Improve interpreter stack banging Message-ID: <7dK0Nt08VRU-Hygxo6QR6URyW9grnMxoW2SJ-jSh3Cc=.c639777e-10b4-495f-9188-30377b6bf9cd@github.com> This is the AArch64 counterpart of X86 change: https://github.com/openjdk/jdk/commit/3a13425bc9088cbb6d95e1a46248d7eba27fb1a6. 

Motivational performance improvements on Raspberry Pi 3:

 Performance counter stats for 'baseline/bin/java -version' (10 runs):

            476.96 msec task-clock         #    1.288 CPUs utilized     ( +-  0.11% )
               166      context-switches   #    0.348 K/sec             ( +-  0.93% )
                 8      cpu-migrations     #    0.017 K/sec             ( +-  9.33% )
             2,954      page-faults        #    0.006 M/sec             ( +-  0.04% )
       560,690,251      cycles             #    1.176 GHz               ( +-  0.07% )
       239,068,958      instructions       #    0.43  insn per cycle    ( +-  0.04% )
        30,236,426      branches           #   63.394 M/sec             ( +-  0.05% )
         4,145,994      branch-misses      #   13.71% of all branches   ( +-  0.09% )

          0.370225 +- 0.000285 seconds time elapsed  ( +-  0.08% )

 Performance counter stats for 'patched/bin/java -version' (10 runs):

            456.01 msec task-clock         #    1.283 CPUs utilized     ( +-  0.12% )
               156      context-switches   #    0.341 K/sec             ( +-  0.99% )
                 8      cpu-migrations     #    0.018 K/sec             ( +-  4.30% )
             2,957      page-faults        #    0.006 M/sec             ( +-  0.07% )
       536,970,476      cycles             #    1.178 GHz               ( +-  0.12% )
       236,527,954      instructions       #    0.44  insn per cycle    ( +-  0.04% )
        30,195,820      branches           #   66.218 M/sec             ( +-  0.04% )
         4,128,388      branch-misses      #   13.67% of all branches   ( +-  0.13% )

          0.355460 +- 0.000741 seconds time elapsed  ( +-  0.21% )

SPECjvm2008 with `-Xint`:

    Compress: +54%
    Serial: +56%

Additional testing:
 - [x] Linux aarch64 fastdebug, `tier1`
 - [x] Linux aarch64 fastdebug, `tier2`
 - [x] Ad-hoc benchmarks

-------------

Commit messages:
 - Unsigned condition codes
 - Do not do str(sp)
 - Fix

Changes: https://git.openjdk.java.net/jdk/pull/8001/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8001&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281469
  Stats: 40 lines in 1 file changed: 32 ins; 3 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8001.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8001/head:pull/8001

PR: https://git.openjdk.java.net/jdk/pull/8001

From simonis at openjdk.java.net Mon Mar 28 18:12:53 2022
From: simonis at openjdk.java.net (Volker Simonis)
Date: Mon, 28 Mar 2022 18:12:53 GMT
Subject: RFR: 8280872: Reorder code cache segments to
improve code density [v8] In-Reply-To: References: Message-ID: <_7Yx2jG7Oj2pPZbnXP0t-rdLbIXCCNSj8_u05BkQKRo=.0c532ba7-2acf-4a4c-ab3c-5406da696448@github.com> On Thu, 24 Mar 2022 13:36:27 GMT, Boris Ulasevich wrote: >> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. Currently only the aarch64 backend is adapted to make use of these changes. >> >> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. >> >> As a side effect, the performance of some tests is slightly improved: >> ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` >> >> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: > > - Merge branch 'openjdk:master' into codecache_segments_order > - review findings: use instruction_size istead of raw constant, strengthen the assert, check alignment, move comments, segments order: profiled - non_method - non_profiled > - rename, adding test > - moving nops out of far_jump > - minor renaming > - review comments. remove far_call limit. undo trampoline-to-farcall. 
add trampoline_needs_far_jump func > - fix name: is_non_nmethod, adding target_needs_far_branch func > - change codecache segments order: nonprofiled-nonmethod-profiled > increase far jump threshold: sideof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M src/hotspot/cpu/aarch64/icBuffer_aarch64.cpp line 64: > 62: } > 63: __ bind(l); > 64: assert((uintptr_t)__ pc() % wordSize == 0); Looks like this leads to a compilation failure in the debug build. Can you please check before submitting? ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From kvn at openjdk.java.net Mon Mar 28 18:37:55 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 28 Mar 2022 18:37:55 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v8] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 13:36:27 GMT, Boris Ulasevich wrote: >> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. Currently only the aarch64 backend is adapted to make use of these changes. >> >> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. >> >> As a side effect, the performance of some tests is slightly improved: >> ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` >> >> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains eight commits: > > - Merge branch 'openjdk:master' into codecache_segments_order > - review findings: use instruction_size istead of raw constant, strengthen the assert, check alignment, move comments, segments order: profiled - non_method - non_profiled > - rename, adding test > - moving nops out of far_jump > - minor renaming > - review comments. remove far_call limit. undo trampoline-to-farcall. add trampoline_needs_far_jump func > - fix name: is_non_nmethod, adding target_needs_far_branch func > - change codecache segments order: nonprofiled-nonmethod-profiled > increase far jump threshold: sideof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M Good suggestion. Let me test it before approval. ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From aph at openjdk.java.net Mon Mar 28 19:58:47 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Mon, 28 Mar 2022 19:58:47 GMT Subject: RFR: 8281469: aarch64: Improve interpreter stack banging In-Reply-To: <7dK0Nt08VRU-Hygxo6QR6URyW9grnMxoW2SJ-jSh3Cc=.c639777e-10b4-495f-9188-30377b6bf9cd@github.com> References: <7dK0Nt08VRU-Hygxo6QR6URyW9grnMxoW2SJ-jSh3Cc=.c639777e-10b4-495f-9188-30377b6bf9cd@github.com> Message-ID: On Mon, 28 Mar 2022 17:26:56 GMT, Aleksey Shipilev wrote: > This is the AArch64 counterpart of X86 change: https://github.com/openjdk/jdk/commit/3a13425bc9088cbb6d95e1a46248d7eba27fb1a6. 
> > Motivational performance improvements on Raspberry Pi 3: > > > Performance counter stats for 'baseline/bin/java -version' (10 runs): > > 476.96 msec task-clock # 1.288 CPUs utilized ( +- 0.11% ) > 166 context-switches # 0.348 K/sec ( +- 0.93% ) > 8 cpu-migrations # 0.017 K/sec ( +- 9.33% ) > 2,954 page-faults # 0.006 M/sec ( +- 0.04% ) > 560,690,251 cycles # 1.176 GHz ( +- 0.07% ) > 239,068,958 instructions # 0.43 insn per cycle ( +- 0.04% ) > 30,236,426 branches # 63.394 M/sec ( +- 0.05% ) > 4,145,994 branch-misses # 13.71% of all branches ( +- 0.09% ) > > 0.370225 +- 0.000285 seconds time elapsed ( +- 0.08% ) > > Performance counter stats for 'patched/bin/java -version' (10 runs): > > 456.01 msec task-clock # 1.283 CPUs utilized ( +- 0.12% ) > 156 context-switches # 0.341 K/sec ( +- 0.99% ) > 8 cpu-migrations # 0.018 K/sec ( +- 4.30% ) > 2,957 page-faults # 0.006 M/sec ( +- 0.07% ) > 536,970,476 cycles # 1.178 GHz ( +- 0.12% ) > 236,527,954 instructions # 0.44 insn per cycle ( +- 0.04% ) > 30,195,820 branches # 66.218 M/sec ( +- 0.04% ) > 4,128,388 branch-misses # 13.67% of all branches ( +- 0.13% ) > > 0.355460 +- 0.000741 seconds time elapsed ( +- 0.21% ) > > > SPECjvm2008 with `-Xint`: > > > Compress: +54% > Serial: +56% > > > Additional testing: > - [x] Linux aarch64 fastdebug, `tier1` > - [x] Linux aarch64 fastdebug, `tier2` > - [x] Ad-hoc benchmarks Hard to argue with that. ------------- Marked as reviewed by aph (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/8001 From iklam at openjdk.java.net Mon Mar 28 20:56:08 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 28 Mar 2022 20:56:08 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() [v2] In-Reply-To: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: > - Remove all the complex `sscanf()` calls in `Arguments::parse_argument()` > - Call the appropriate parsing function according to the type of the flag > - Added more test cases for flags of the `double` type. > > As a result of this change, `double` flags can now be specified in more ways, as long as the input is accepted by `strtod()`. However, `NaN` and `INFINITY` values are not allowed because the VM probably cannot handle them. Please see the test case for details. > > Tested with tiers 1-5. Ioi Lam has updated the pull request incrementally with two additional commits since the last revision: - Disabled test for CompileThresholdScaling due to JDK-8283807 - @dholmes-ora comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7916/files - new: https://git.openjdk.java.net/jdk/pull/7916/files/8413d876..0faa4cca Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7916&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7916&range=00-01 Stats: 10 lines in 2 files changed: 4 ins; 3 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/7916.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7916/head:pull/7916 PR: https://git.openjdk.java.net/jdk/pull/7916 From iklam at openjdk.java.net Mon Mar 28 21:00:57 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 28 Mar 2022 21:00:57 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() [v2] In-Reply-To: References: 
<1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: On Mon, 28 Mar 2022 11:51:52 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with two additional commits since the last revision: >> >> - Disabled test for CompileThresholdScaling due to JDK-8283807 >> - @dholmes-ora comments > > src/hotspot/share/runtime/arguments.cpp line 896: > >> 894: >> 895: static bool set_fp_numeric_flag(JVMFlag* flag, const char* value, JVMFlagOrigin origin) { >> 896: if (*value == '\0' || isspace(*value)) { > > Please preceded with comment > > // strtod allows leading whitespace, but our flag format does not. Fixed. > src/hotspot/share/runtime/arguments.cpp line 905: > >> 903: return false; >> 904: } >> 905: if (g_isnan(v) || !g_isfinite(v)) { > > Surely the not-sign should not be there ??? This is actually correct. If the number is NOT FINITE, we don't accept it. There's a test case for values like `"Infinity"`: https://github.com/openjdk/jdk/blob/8413d87666e58c18c914b7df043ba3dbc6fa9022/test/hotspot/gtest/runtime/test_arguments.cpp#L280-L287 > src/hotspot/share/runtime/arguments.cpp line 1062: > >> 1060: if (('a' <= c && c <= 'z' ) || >> 1061: ('A' <= c && c <= 'Z' ) || >> 1062: ('0' <= c && c <= '9' ) || > > can't you use `isalnum(c)` ? Fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/7916 From kvn at openjdk.java.net Mon Mar 28 21:01:02 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 28 Mar 2022 21:01:02 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v8] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 13:36:27 GMT, Boris Ulasevich wrote: >> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. Currently only the aarch64 backend is adapted to make use of these changes. 
>> >> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. >> >> As a side effect, the performance of some tests is slightly improved: >> ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` >> >> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: > > - Merge branch 'openjdk:master' into codecache_segments_order > - review findings: use instruction_size istead of raw constant, strengthen the assert, check alignment, move comments, segments order: profiled - non_method - non_profiled > - rename, adding test > - moving nops out of far_jump > - minor renaming > - review comments. remove far_call limit. undo trampoline-to-farcall. add trampoline_needs_far_jump func > - fix name: is_non_nmethod, adding target_needs_far_branch func > - change codecache segments order: nonprofiled-nonmethod-profiled > increase far jump threshold: sideof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M test/hotspot/jtreg/compiler/c2/aarch64/TestFarJump.java line 41: > 39: * @requires vm.compiler2.enabled > 40: * > 41: * @run driver compiler.c2.TestFarJump Package name `compiler.c2.aarch64` is different from `compiler.c2.TestFarJump`. 
Got testing failure: java.lang.ClassNotFoundException: compiler.c2.TestFarJump at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:445) ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From kvn at openjdk.java.net Mon Mar 28 21:37:45 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 28 Mar 2022 21:37:45 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v8] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 13:36:27 GMT, Boris Ulasevich wrote: >> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. Currently only the aarch64 backend is adapted to make use of these changes. >> >> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. >> >> As a side effect, the performance of some tests is slightly improved: >> ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` >> >> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: > > - Merge branch 'openjdk:master' into codecache_segments_order > - review findings: use instruction_size istead of raw constant, strengthen the assert, check alignment, move comments, segments order: profiled - non_method - non_profiled > - rename, adding test > - moving nops out of far_jump > - minor renaming > - review comments. remove far_call limit. undo trampoline-to-farcall. 
add trampoline_needs_far_jump func > - fix name: is_non_nmethod, adding target_needs_far_branch func > - change codecache segments order: nonprofiled-nonmethod-profiled > increase far jump threshold: sideof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M src/hotspot/cpu/aarch64/icBuffer_aarch64.cpp line 58: > 56: // IC stub code size is not expected to vary depending on target address. > 57: // We use NOPs to make the ldr+far_jump+int64 size equal to ic_stub_code_size. > 58: for (int i = jump_code_size; This needs more explanation of the loop's boundary arithmetic. The comment includes `ldr` but you start from the jump instruction size. Is it part of `3*instruction_size`? What are the `3` instructions? ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From amenkov at openjdk.java.net Mon Mar 28 22:27:00 2022 From: amenkov at openjdk.java.net (Alex Menkov) Date: Mon, 28 Mar 2022 22:27:00 GMT Subject: RFR: 8283597: [REDO] Invalid generic signature for redefined classes Message-ID: <36OKI2Q3esz7ktGKR04B_uHpJgrZ1XqP_09L1FMLmbc=.7c60a665-398e-42ed-b8e3-29a568304037@github.com> After pushing the fix for JDK-8282241 (https://github.com/openjdk/jdk/pull/7676), random tests from serviceability/jvmti/RedefineClasses started to fail with java.lang.NoClassDefFoundError: jdk/test/lib/helpers/ClassFileInstaller$Manifest This is caused by JTReg sharing classpath directories between tests. Investigation showed that the issue was caused by using run compile -g RedefineGenericSignatureTest.java in the test to include additional debug info. Actually, "-g" is not needed, as the test only needs source file data, which is included by default. The fix is the same as the previous one; the only difference is in the test: - removed "run compile -g RedefineGenericSignatureTest.java" action; - removed "-g" option from InMemoryJavaCompiler.compile() call.
@coleenp , @sspitsyn : could you please re-review the fix ------------- Commit messages: - redo gen sig Changes: https://git.openjdk.java.net/jdk/pull/8007/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8007&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283597 Stats: 211 lines in 2 files changed: 201 ins; 7 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/8007.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8007/head:pull/8007 PR: https://git.openjdk.java.net/jdk/pull/8007 From dholmes at openjdk.java.net Mon Mar 28 22:28:52 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 28 Mar 2022 22:28:52 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: On Mon, 28 Mar 2022 12:58:20 GMT, Thomas Stuefe wrote: >> Christian Hagedorn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 54 commits: >> >> - Updating some comments >> - Cleanup loading dwarf file and add summary >> - Review comments of first pass by Thomas except dwarf file loading >> - Merge branch 'master' into JDK-8242181 >> - Make dwarf tag NOT_PRODUCT >> - Change log_* to log_develop_* and log_warning to log_develop_info >> - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java >> >> Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> >> - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java >> >> Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> >> - Better formatting of trace output >> - some code move and more cleanups >> - ... and 44 more: https://git.openjdk.java.net/jdk/compare/efd3967b...5bea4841 > > src/hotspot/share/utilities/elfFile.cpp line 450: > >> 448: if (buf == nullptr) { >> 449: return false; >> 450: } > > I'd move this close to and local to where it is used. 
> > Also, you seem to repeat the same pattern a lot "NEW_RESOURCE_ARRAY(n), if error return something". I'd factor this out to an utility function or utility macro, maybe one where you pass the error return value as macro parameter. Thomas's comment caught my attention in the email. NEW_RESOURCE_ARRAY aborts the VM on OOM. Use NEW_RESOURCE_ARRAY_RETURN_NULL if you want to continue. ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From dholmes at openjdk.java.net Mon Mar 28 22:33:43 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 28 Mar 2022 22:33:43 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() [v2] In-Reply-To: References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: On Mon, 28 Mar 2022 20:57:50 GMT, Ioi Lam wrote: >> src/hotspot/share/runtime/arguments.cpp line 905: >> >>> 903: return false; >>> 904: } >>> 905: if (g_isnan(v) || !g_isfinite(v)) { >> >> Surely the not-sign should not be there ??? > > This is actually correct. If the number is NOT FINITE, we don't accept it. There's a test case for values like `"Infinity"`: > > https://github.com/openjdk/jdk/blob/8413d87666e58c18c914b7df043ba3dbc6fa9022/test/hotspot/gtest/runtime/test_arguments.cpp#L280-L287 Sorry my brain read that as `g_isinfinite()` - doh! 
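For readers following the `g_isnan(v) || !g_isfinite(v)` exchange above, here is a minimal standalone sketch of the validation being discussed, written against the C++ standard library rather than HotSpot. The helper name `parse_fp_flag_value` and its signature are assumptions made for illustration only; this is not the actual `set_fp_numeric_flag()` from `arguments.cpp`, and `std::isnan`/`std::isfinite` merely stand in for the `g_isnan`/`g_isfinite` wrappers.

```cpp
#include <cctype>
#include <cmath>
#include <cstdlib>

// Hypothetical helper for illustration; not the HotSpot implementation.
// Returns true and stores the parsed value only for a finite number that
// spans the whole input string.
static bool parse_fp_flag_value(const char* value, double* result) {
  // strtod allows leading whitespace, but the flag format does not.
  if (*value == '\0' || std::isspace(static_cast<unsigned char>(*value))) {
    return false;
  }
  char* end = nullptr;
  double v = std::strtod(value, &end);
  // Reject input where nothing was parsed or where characters are left over.
  if (end == value || *end != '\0') {
    return false;
  }
  // Reject NaN and anything that is NOT finite (+/-Infinity, overflow).
  // isfinite() is already false for NaN, so the isnan() test is redundant
  // but keeps the intent explicit, mirroring the line under review.
  if (std::isnan(v) || !std::isfinite(v)) {
    return false;
  }
  *result = v;
  return true;
}
```

Under these assumptions, inputs such as `"1.5"` or `"-2e3"` are accepted, while `""`, `" 1.0"`, `"1.0x"`, `"NaN"` and `"Infinity"` are rejected, which is the behaviour the negated `isfinite` check is meant to give.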
------------- PR: https://git.openjdk.java.net/jdk/pull/7916 From xliu at openjdk.java.net Mon Mar 28 22:36:40 2022 From: xliu at openjdk.java.net (Xin Liu) Date: Mon, 28 Mar 2022 22:36:40 GMT Subject: RFR: 8281469: aarch64: Improve interpreter stack banging In-Reply-To: <7dK0Nt08VRU-Hygxo6QR6URyW9grnMxoW2SJ-jSh3Cc=.c639777e-10b4-495f-9188-30377b6bf9cd@github.com> References: <7dK0Nt08VRU-Hygxo6QR6URyW9grnMxoW2SJ-jSh3Cc=.c639777e-10b4-495f-9188-30377b6bf9cd@github.com> Message-ID: On Mon, 28 Mar 2022 17:26:56 GMT, Aleksey Shipilev wrote: > This is the AArch64 counterpart of X86 change: https://github.com/openjdk/jdk/commit/3a13425bc9088cbb6d95e1a46248d7eba27fb1a6. > > Motivational performance improvements on Raspberry Pi 3: > > > Performance counter stats for 'baseline/bin/java -version' (10 runs): > > 476.96 msec task-clock # 1.288 CPUs utilized ( +- 0.11% ) > 166 context-switches # 0.348 K/sec ( +- 0.93% ) > 8 cpu-migrations # 0.017 K/sec ( +- 9.33% ) > 2,954 page-faults # 0.006 M/sec ( +- 0.04% ) > 560,690,251 cycles # 1.176 GHz ( +- 0.07% ) > 239,068,958 instructions # 0.43 insn per cycle ( +- 0.04% ) > 30,236,426 branches # 63.394 M/sec ( +- 0.05% ) > 4,145,994 branch-misses # 13.71% of all branches ( +- 0.09% ) > > 0.370225 +- 0.000285 seconds time elapsed ( +- 0.08% ) > > Performance counter stats for 'patched/bin/java -version' (10 runs): > > 456.01 msec task-clock # 1.283 CPUs utilized ( +- 0.12% ) > 156 context-switches # 0.341 K/sec ( +- 0.99% ) > 8 cpu-migrations # 0.018 K/sec ( +- 4.30% ) > 2,957 page-faults # 0.006 M/sec ( +- 0.07% ) > 536,970,476 cycles # 1.178 GHz ( +- 0.12% ) > 236,527,954 instructions # 0.44 insn per cycle ( +- 0.04% ) > 30,195,820 branches # 66.218 M/sec ( +- 0.04% ) > 4,128,388 branch-misses # 13.67% of all branches ( +- 0.13% ) > > 0.355460 +- 0.000741 seconds time elapsed ( +- 0.21% ) > > > SPECjvm2008 with `-Xint`: > > > Compress: +54% > Serial: +56% > > > Additional testing: > - [x] Linux aarch64 fastdebug, `tier1` > - [x] 
Linux aarch64 fastdebug, `tier2` > - [x] Ad-hoc benchmarks LGTM. I am not a reviewer; another reviewer needs to approve it. One subtlety is that `sub` can only encode a uimm24 immediate. I think it's safe for page size = 4k because it can support up to 2^12 p. ------------- Marked as reviewed by xliu (Committer). PR: https://git.openjdk.java.net/jdk/pull/8001 From dholmes at openjdk.java.net Mon Mar 28 22:37:45 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 28 Mar 2022 22:37:45 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() [v2] In-Reply-To: References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: On Mon, 28 Mar 2022 20:56:08 GMT, Ioi Lam wrote: >> - Remove all the complex `sscanf()` calls in `Arguments::parse_argument()` >> - Call the appropriate parsing function according to the type of the flag >> - Added more test cases for flags of the `double` type. >> >> As a result of this change, `double` flags can now be specified in more ways, as long as the input is accepted by `strtod()`. However, `NaN` and `INFINITY` values are not allowed because the VM probably cannot handle them. Please see the test case for details. >> >> Tested with tiers 1-5. > > Ioi Lam has updated the pull request incrementally with two additional commits since the last revision: > > - Disabled test for CompileThresholdScaling due to JDK-8283807 > - @dholmes-ora comments Thanks for the update. One nit. David ------------- Marked as reviewed by dholmes (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7916 From dholmes at openjdk.java.net Mon Mar 28 22:37:46 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 28 Mar 2022 22:37:46 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() [v2] In-Reply-To: References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: On Mon, 28 Mar 2022 20:57:46 GMT, Ioi Lam wrote: >> src/hotspot/share/runtime/arguments.cpp line 896: >> >>> 894: >>> 895: static bool set_fp_numeric_flag(JVMFlag* flag, const char* value, JVMFlagOrigin origin) { >>> 896: if (*value == '\0' || isspace(*value)) { >> >> Please preceded with comment >> >> // strtod allows leading whitespace, but our flag format does not. > > Fixed. Sorry I meant for the comment to precede line 896. ------------- PR: https://git.openjdk.java.net/jdk/pull/7916 From kvn at openjdk.java.net Tue Mar 29 00:10:54 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 29 Mar 2022 00:10:54 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v8] In-Reply-To: References: Message-ID: On Mon, 28 Mar 2022 20:57:17 GMT, Vladimir Kozlov wrote: >> Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: >> >> - Merge branch 'openjdk:master' into codecache_segments_order >> - review findings: use instruction_size istead of raw constant, strengthen the assert, check alignment, move comments, segments order: profiled - non_method - non_profiled >> - rename, adding test >> - moving nops out of far_jump >> - minor renaming >> - review comments. remove far_call limit. undo trampoline-to-farcall. 
add trampoline_needs_far_jump func >> - fix name: is_non_nmethod, adding target_needs_far_branch func >> - change codecache segments order: nonprofiled-nonmethod-profiled >> increase far jump threshold: sideof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M > > test/hotspot/jtreg/compiler/c2/aarch64/TestFarJump.java line 41: > >> 39: * @requires vm.compiler2.enabled >> 40: * >> 41: * @run driver compiler.c2.TestFarJump > > Package name `compiler.c2.aarch64` is different from `compiler.c2.TestFarJump`. > Got testing failure: > > java.lang.ClassNotFoundException: compiler.c2.TestFarJump > at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:445) After I fixed test name I got next failure: [Exception Handler] 0x0000fffe40140490: b3d5 9bd2 | 4201 80d2 | a4d5 9bd2 | a5d5 9bd2 ----------System.err:(11/622)---------- java.lang.RuntimeException: ADRP instruction is expected on far jump at compiler.c2.aarch64.TestFarJump.runVM(TestFarJump.java:112) at compiler.c2.aarch64.TestFarJump.main(TestFarJump.java:126) ------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From ysuenaga at openjdk.java.net Tue Mar 29 00:52:41 2022 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 29 Mar 2022 00:52:41 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: On Mon, 28 Mar 2022 11:18:36 GMT, Christian Hagedorn wrote: >> As I said before, I think it would be nice to share DWARF parser between SA and HotSpot. Can you expose these mechanisms? It may be another RFE, and may need to think other platforms. > >> As I said before, I think it would be nice to share DWARF parser between SA and HotSpot. Can you expose these mechanisms? It may be another RFE, and may need to think other platforms. > > That would be good to have. However, I'm not familiar with the SA code and how it works to share code with HotSpot. 
And I'm also not sure how much overlap the two parsers actually have. I quickly skimmed through the DWARF parsing code of the SA and it seems that its main usage is for parsing call frame information (as described in section 6.4 of the DWARF 4 spec) which is not supported/needed in this patch. There is still some code that could be shared though like opening a DWARF file with its checks or reading an LEB 128 etc. Might be worth to investigate further if the two implementations can be merged/reused to some extent. But I propose to file a separate RFE for that. What do you think? @chhagedorn > There is still some code that could be shared though like opening a DWARF file with its checks or reading an LEB 128 etc. Might be worth to investigate further if the two implementations can be merged/reused to some extent. But I propose to file a separate RFE for that. What do you think? Yeah, let's investigate about it in another RFE. IMHO we can share some codes about DWARF between HotSpot and SA, and also we might need DWARF-based call frame parser in HotSpot because some 3rd-party native libraries don't use base pointer (RBP) to store SP due to optimization. In SA side, it would be useful if we can check native source file and line number in mixed jstack with your change. So I want to unify DWARF parser (processor) between HotSpot and SA, but it might be long journey... thus I agree with you to file it as another RFE. ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From jbhateja at openjdk.java.net Tue Mar 29 02:44:43 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Tue, 29 Mar 2022 02:44:43 GMT Subject: RFR: 8283726: x86 intrinsics for compare method in Integer and Long In-Reply-To: References: Message-ID: On Sun, 27 Mar 2022 06:15:34 GMT, Vamsi Parasa wrote: > Implements x86 intrinsics for compare() method in java.lang.Integer and java.lang.Long. 
src/hotspot/cpu/x86/x86_64.ad line 12107: > 12105: instruct compareSignedI_rReg(rRegI dst, rRegI op1, rRegI op2, rRegI tmp, rFlagsReg cr) > 12106: %{ > 12107: match(Set dst (CompareSignedI op1 op2)); Please also include these patterns in x86_32.ad. src/hotspot/cpu/x86/x86_64.ad line 12125: > 12123: __ movl(tmp, 0); > 12124: __ bind(done); > 12125: __ movl(dest, tmp); Please move this into a macro-assembly routine. src/hotspot/cpu/x86/x86_64.ad line 12178: > 12176: __ movl(tmp, 0); > 12177: __ bind(done); > 12178: __ movl(dest, tmp); Please move this into a macro-assembly routine. src/hotspot/cpu/x86/x86_64.ad line 12204: > 12202: __ movl(tmp, 0); > 12203: __ bind(done); > 12204: __ movl(dest, tmp); Please move this into a macro-assembly routine. src/hotspot/share/classfile/vmIntrinsics.hpp line 239: > 237: do_intrinsic(_compareUnsigned_i, java_lang_Integer, compare_unsigned_name, int2_int_signature, F_S) \ > 238: do_name( compare_unsigned_name, "compareUnsigned") \ > 239: do_intrinsic(_compareUnsigned_l, java_lang_Long, compare_unsigned_name, long2_int_signature, F_S) \ Making these methods intrinsics will put a box around the underlying comparison logic. This will prevent the regular constant folding that could otherwise have optimized out certain control paths. I would suggest handling constant folding for the newly added nodes in the associated Value routines. src/hotspot/share/opto/comparenode.hpp line 67: > 65: CompareUnsignedLNode(Node* in1, Node* in2) : CompareNode(in1, in2) {} > 66: virtual int Opcode() const; > 67: }; The intent here seems to be to enable further auto-vectorization of the newly created IR nodes.
test/micro/org/openjdk/bench/java/lang/CompareInteger.java line 78: > 76: input2[i] = tmp; > 77: } > 78: } Suggested reorganization of the logic: for (int i = 0; i < BUFFER_SIZE; i++) { input1[i] = rng.nextLong(); } if (mode.equals("equals")) { GRADIANT = 0; } else if (mode.equals("greaterThanEquals")) { GRADIANT = 1; } else { assert mode.equals("lessThanEqual"); GRADIANT = -1; } for (int i = 0; i < BUFFER_SIZE; i++) { input2[i] = input1[i] + i*GRADIANT; } test/micro/org/openjdk/bench/java/lang/CompareLong.java line 5: > 3: * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. > 4: * > 5: * This code is free software; you can redistribute it and/or modify it We can unify this benchmark with the integer compare micro. ------------- PR: https://git.openjdk.java.net/jdk/pull/7975 From jiefu at openjdk.java.net Tue Mar 29 05:08:42 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Tue, 29 Mar 2022 05:08:42 GMT Subject: RFR: 8282162: [vector] Optimize integral vector negation API [v3] In-Reply-To: <-sL2chUr7o8qwgvCm4qS_teHYtUu9rVGUa2GFWb9yC8=.14b0f006-56dc-4245-9746-ff248478e9ed@github.com> References: <-sL2chUr7o8qwgvCm4qS_teHYtUu9rVGUa2GFWb9yC8=.14b0f006-56dc-4245-9746-ff248478e9ed@github.com> Message-ID: On Mon, 28 Mar 2022 09:56:22 GMT, Xiaohong Gong wrote: >> The current vector `"NEG"` is implemented by subtracting the vector from zero in case the architecture does not support the negation instruction. And to fit the predicate feature for architectures that support it, the masked vector `"NEG"` is implemented with pattern `"v.not(m).add(1, m)"`. They both can be optimized to a single negation instruction for ARM SVE. >> The same holds for the non-masked "NEG" on NEON. Besides, implementing the masked "NEG" with subtraction for architectures that support neither the negation instruction nor the predicate feature also saves several instructions compared with the current pattern.
>> >> To optimize the VectorAPI negation, this patch moves the implementation from Java side to hotspot. The compiler will generate different nodes according to the architecture: >> - Generate the (predicated) negation node if architecture supports it, otherwise, generate "`zero.sub(v)`" pattern for non-masked operation. >> - Generate `"zero.sub(v, m)"` for masked operation if the architecture does not have predicate feature, otherwise generate the original pattern `"v.xor(-1, m).add(1, m)"`. >> >> So with this patch, the following transformations are applied: >> >> For non-masked negation with NEON: >> >> movi v16.4s, #0x0 >> sub v17.4s, v16.4s, v17.4s ==> neg v17.4s, v17.4s >> >> and with SVE: >> >> mov z16.s, #0 >> sub z18.s, z16.s, z17.s ==> neg z16.s, p7/m, z16.s >> >> For masked negation with NEON: >> >> movi v17.4s, #0x1 >> mvn v19.16b, v18.16b >> mov v20.16b, v16.16b ==> neg v18.4s, v17.4s >> bsl v20.16b, v19.16b, v18.16b bsl v19.16b, v18.16b, v17.16b >> add v19.4s, v20.4s, v17.4s >> mov v18.16b, v16.16b >> bsl v18.16b, v19.16b, v20.16b >> >> and with SVE: >> >> mov z16.s, #-1 >> mov z17.s, #1 ==> neg z16.s, p0/m, z16.s >> eor z18.s, p0/m, z18.s, z16.s >> add z18.s, p0/m, z18.s, z17.s >> >> Here are the performance gains for benchmarks (see [1][2]) on ARM and x86 machines(note that the non-masked negation benchmarks do not have any improvement on X86 since no instructions are changed): >> >> NEON: >> Benchmark Gain >> Byte128Vector.NEG 1.029 >> Byte128Vector.NEGMasked 1.757 >> Short128Vector.NEG 1.041 >> Short128Vector.NEGMasked 1.659 >> Int128Vector.NEG 1.005 >> Int128Vector.NEGMasked 1.513 >> Long128Vector.NEG 1.003 >> Long128Vector.NEGMasked 1.878 >> >> SVE with 512-bits: >> Benchmark Gain >> ByteMaxVector.NEG 1.10 >> ByteMaxVector.NEGMasked 1.165 >> ShortMaxVector.NEG 1.056 >> ShortMaxVector.NEGMasked 1.195 >> IntMaxVector.NEG 1.002 >> IntMaxVector.NEGMasked 1.239 >> LongMaxVector.NEG 1.031 >> LongMaxVector.NEGMasked 1.191 >> >> X86 (non AVX-512): >> 
Benchmark Gain >> ByteMaxVector.NEGMasked 1.254 >> ShortMaxVector.NEGMasked 1.359 >> IntMaxVector.NEGMasked 1.431 >> LongMaxVector.NEGMasked 1.989 >> >> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1881 >> [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1896 > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Make "degenerate_vector_integral_negate" to be "NegVI" private An obvious performance improvement has been observed on x86 for integral vector negation, so I think it's good to go. LGTM Thanks. Note: I didn't check the aarch64 code change. ------------- Marked as reviewed by jiefu (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7782 From iklam at openjdk.java.net Tue Mar 29 05:41:19 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 29 Mar 2022 05:41:19 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() [v3] In-Reply-To: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: > - Remove all the complex `sscanf()` calls in `Arguments::parse_argument()` > - Call the appropriate parsing function according to the type of the flag > - Added more test cases for flags of the `double` type. > > As a result of this change, `double` flags can now be specified in more ways, as long as the input is accepted by `strtod()`. However, `NaN` and `INFINITY` values are not allowed because the VM probably cannot handle them. Please see the test case for details. > > Tested with tiers 1-5.
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: moved comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7916/files - new: https://git.openjdk.java.net/jdk/pull/7916/files/0faa4cca..6fb2ccef Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7916&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7916&range=01-02 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7916.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7916/head:pull/7916 PR: https://git.openjdk.java.net/jdk/pull/7916 From dholmes at openjdk.java.net Tue Mar 29 05:53:47 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 29 Mar 2022 05:53:47 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() [v3] In-Reply-To: References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: <8u78sMw_Vr7pqrI_RW4sNgu_ytfygU4GqElq2lzfIgU=.24c5fde7-3f60-4ee6-bf5d-754d351fd908@github.com> On Tue, 29 Mar 2022 05:41:19 GMT, Ioi Lam wrote: >> - Remove all the complex `sscanf()` calls in `Arguments::parse_argument()` >> - Call the appropriate parsing function according to the type of the flag >> - Added more test cases for flags of the `double` type. >> >> As a result of this change, `double` flags can now be specified in more ways, as long as the input is accepted by `strtod()`. However, `NaN` and `INFINITY` values are not allowed because the VM probably cannot handle them. Please see the test case for details. >> >> Tested with tiers 1-5. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > moved comment Marked as reviewed by dholmes (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/7916 From sspitsyn at openjdk.java.net Tue Mar 29 05:58:44 2022 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Tue, 29 Mar 2022 05:58:44 GMT Subject: RFR: 8283597: [REDO] Invalid generic signature for redefined classes In-Reply-To: <36OKI2Q3esz7ktGKR04B_uHpJgrZ1XqP_09L1FMLmbc=.7c60a665-398e-42ed-b8e3-29a568304037@github.com> References: <36OKI2Q3esz7ktGKR04B_uHpJgrZ1XqP_09L1FMLmbc=.7c60a665-398e-42ed-b8e3-29a568304037@github.com> Message-ID: <-GS3BvDDovSm2Er_OaU1O4FMgffAZWiRcfYAZ17Ssjc=.f8caaefe-3271-4adc-afa1-b6f63df928e8@github.com> On Mon, 28 Mar 2022 22:19:49 GMT, Alex Menkov wrote: > After pushing fix for JDK-8282241 (https://github.com/openjdk/jdk/pull/7676) random tests from serviceability/jvmti/RedefineClasses start to fail with > java.lang.NoClassDefFoundError: jdk/test/lib/helpers/ClassFileInstaller$Manifest > This is caused by JTReg classpath directories sharing between tests. > > Research shown that the issue was caused by using > run compile -g RedefineGenericSignatureTest.java > in the test to include additional debug info. > Actually "-g" it's not needed as the test only needs source file data and it's included by default. > > The fix is the same as previous one, the only difference is in the test: > - removed "run compile -g RedefineGenericSignatureTest.java" action; > - removed "-g" option from InMemoryJavaCompiler.compile() call. > > Tested with debug and release builds, 1500 runs without failures. > > @coleenp , @sspitsyn : could you please re-review the fix Looks good to me. It is nice you've found the root cause of this regression! Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/8007 From xgong at openjdk.java.net Tue Mar 29 06:00:43 2022 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Tue, 29 Mar 2022 06:00:43 GMT Subject: RFR: 8282162: [vector] Optimize integral vector negation API [v3] In-Reply-To: References: <-sL2chUr7o8qwgvCm4qS_teHYtUu9rVGUa2GFWb9yC8=.14b0f006-56dc-4245-9746-ff248478e9ed@github.com> Message-ID: <3b3zK9Dbs2d-PFUsnmAJJlmp2OddRhHEDYb5ISm8fvs=.728bf5b9-aff5-45bf-a436-f17fa7be50be@github.com> On Tue, 29 Mar 2022 05:05:43 GMT, Jie Fu wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: >> >> Make "degenerate_vector_integral_negate" to be "NegVI" private > > Note: I didn't check the aarch64 code change. Thanks for the review @DamonFool ! ------------- PR: https://git.openjdk.java.net/jdk/pull/7782 From duke at openjdk.java.net Tue Mar 29 06:59:42 2022 From: duke at openjdk.java.net (KIRIYAMA Takuya) Date: Tue, 29 Mar 2022 06:59:42 GMT Subject: RFR: 8280761: UseCompressedOops should be set after limit_heap_by_allocatable_memory In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 06:52:34 GMT, KIRIYAMA Takuya wrote: > I fixed to set UseCompressedOops flag after limit_heap_by_allocatable_memory(). > So when ulimit -v is called and -XX:MaxRAM is set, UseCompressedOops does not become false. > And all hotspot tier1 test are passed. > Would you please review this fix? Could anyone review this pull request, please? ------------- PR: https://git.openjdk.java.net/jdk/pull/7938 From dholmes at openjdk.java.net Tue Mar 29 07:06:38 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 29 Mar 2022 07:06:38 GMT Subject: RFR: 8280761: UseCompressedOops should be set after limit_heap_by_allocatable_memory In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 06:52:34 GMT, KIRIYAMA Takuya wrote: > I fixed to set UseCompressedOops flag after limit_heap_by_allocatable_memory(). 
> So when ulimit -v is called and -XX:MaxRAM is set, UseCompressedOops does not become false. > And all hotspot tier1 test are passed. > Would you please review this fix? I cc'd hotspot-gc as they should probably be the ones to look at this for you. ------------- PR: https://git.openjdk.java.net/jdk/pull/7938 From stuefe at openjdk.java.net Tue Mar 29 07:26:48 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 29 Mar 2022 07:26:48 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: On Mon, 28 Mar 2022 22:25:01 GMT, David Holmes wrote: >> src/hotspot/share/utilities/elfFile.cpp line 450: >> >>> 448: if (buf == nullptr) { >>> 449: return false; >>> 450: } >> >> I'd move this close to and local to where it is used. >> >> Also, you seem to repeat the same pattern a lot "NEW_RESOURCE_ARRAY(n), if error return something". I'd factor this out to an utility function or utility macro, maybe one where you pass the error return value as macro parameter. > > Thomas's comment caught my attention in the email. NEW_RESOURCE_ARRAY aborts the VM on OOM. Use NEW_RESOURCE_ARRAY_RETURN_NULL if you want to continue. As I wrote in another comment, I'd rather we avoid RA altogether since it relies on Thread::current(), and we want to see callstacks even with Thread::current==NULL. ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From aph at openjdk.java.net Tue Mar 29 10:17:47 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 29 Mar 2022 10:17:47 GMT Subject: RFR: 8281469: aarch64: Improve interpreter stack banging In-Reply-To: References: <7dK0Nt08VRU-Hygxo6QR6URyW9grnMxoW2SJ-jSh3Cc=.c639777e-10b4-495f-9188-30377b6bf9cd@github.com> Message-ID: On Mon, 28 Mar 2022 22:33:08 GMT, Xin Liu wrote: > LGTM. I am not a reviwer. need other reviewer to approve it. > > One subtlety is that sub can only encode uimm24. 
I think it's safe for page size = 4k because it can support up to 2^12 p.

It'll be fine. See `MacroAssembler::sub(Register Rd, Register Rn, RegisterOrConstant decrement)`

-------------

PR: https://git.openjdk.java.net/jdk/pull/8001

From tschatzl at openjdk.java.net Tue Mar 29 12:47:53 2022
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Tue, 29 Mar 2022 12:47:53 GMT
Subject: RFR: 8283494: Factor out calculation of actual number of XMM registers
In-Reply-To:
References:
Message-ID:

On Wed, 23 Mar 2022 08:52:41 GMT, Thomas Schatzl wrote:

> Hi all,
>
> can I have reviews for this change that factors out calculation of the actually available number of XMM registers on a given processor/given command line options into a method and reuse that as much as possible?
>
> This relates to this code snippet:
>
> int xmm_bypass_limit = FrameMap::nof_xmm_regs;
> #ifdef _LP64
> if (UseAVX < 3) {
>   xmm_bypass_limit = xmm_bypass_limit / 2;
> }
> #endif
>
>
> Also, there is already the method `FrameMap::get_num_caller_save_xmms()` that has been updated to use that new method `XMMRegisterImpl::available_xmm_registers()`; further I tried to appropriately use `FrameMap::get_num_caller_save_xmms` in the places where either would work. Please have a look in particular about that.
>
> I did not change strange to me variable names like `xmm_bypass_limit` above as they probably make some sense to somebody as it's used quite often.
>
> This also fixes a compilation error without configuring C1 introduced with [JDK-8283327](https://bugs.openjdk.java.net/browse/JDK-8283327).
>
> Testing: tier1-5 (all but linux-aarch64 done), gha, local compilation with `configure --with-features=-compiler1`.
>
> Thanks,
> Thomas

Ping?
------------- PR: https://git.openjdk.java.net/jdk/pull/7917 From kvn at openjdk.java.net Tue Mar 29 17:34:58 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 29 Mar 2022 17:34:58 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v8] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 13:36:27 GMT, Boris Ulasevich wrote: >> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. Currently only the aarch64 backend is adapted to make use of these changes. >> >> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB. >> >> As a side effect, the performance of some tests is slightly improved: >> ``ArraysFill.testCharFill 10 thrpt 15 170235.720 -> 178477.212 ops/ms`` >> >> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: > > - Merge branch 'openjdk:master' into codecache_segments_order > - review findings: use instruction_size istead of raw constant, strengthen the assert, check alignment, move comments, segments order: profiled - non_method - non_profiled > - rename, adding test > - moving nops out of far_jump > - minor renaming > - review comments. remove far_call limit. undo trampoline-to-farcall. 
add trampoline_needs_far_jump func
> - fix name: is_non_nmethod, adding target_needs_far_branch func
> - change codecache segments order: nonprofiled-nonmethod-profiled
>   increase far jump threshold: sizeof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M

My performance testing shows that there is regression in some startup benchmarks on aarch64.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7517

From simonis at openjdk.java.net Tue Mar 29 17:56:45 2022
From: simonis at openjdk.java.net (Volker Simonis)
Date: Tue, 29 Mar 2022 17:56:45 GMT
Subject: RFR: 8280872: Reorder code cache segments to improve code density [v8]
In-Reply-To:
References:
Message-ID:

On Tue, 29 Mar 2022 17:31:06 GMT, Vladimir Kozlov wrote:

> My performance testing shows that there is regression in some startup benchmarks on aarch64.

Is there a chance to share what you are using as "startup benchmarks"? I think they would definitely be useful for others as well if they don't contain proprietary code.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7517

From psandoz at openjdk.java.net Tue Mar 29 18:08:43 2022
From: psandoz at openjdk.java.net (Paul Sandoz)
Date: Tue, 29 Mar 2022 18:08:43 GMT
Subject: RFR: 8282162: [vector] Optimize integral vector negation API [v3]
In-Reply-To: <-sL2chUr7o8qwgvCm4qS_teHYtUu9rVGUa2GFWb9yC8=.14b0f006-56dc-4245-9746-ff248478e9ed@github.com>
References: <-sL2chUr7o8qwgvCm4qS_teHYtUu9rVGUa2GFWb9yC8=.14b0f006-56dc-4245-9746-ff248478e9ed@github.com>
Message-ID:

On Mon, 28 Mar 2022 09:56:22 GMT, Xiaohong Gong wrote:

>> The current vector `"NEG"` is implemented by subtracting the vector from zero when the architecture does not support a negation instruction. And to fit the predicate feature for architectures that support it, the masked vector `"NEG"` is implemented with pattern `"v.not(m).add(1, m)"`. They both can be optimized to a single negation instruction for ARM SVE.
>> And so does the non-masked "NEG" for NEON.
Besides, implementing the masked "NEG" with subtraction for architectures that support neither a negation instruction nor the predicate feature can also save several instructions compared with the current pattern.
>>
>> To optimize the VectorAPI negation, this patch moves the implementation from Java side to hotspot. The compiler will generate different nodes according to the architecture:
>> - Generate the (predicated) negation node if architecture supports it, otherwise, generate "`zero.sub(v)`" pattern for non-masked operation.
>> - Generate `"zero.sub(v, m)"` for masked operation if the architecture does not have predicate feature, otherwise generate the original pattern `"v.xor(-1, m).add(1, m)"`.
>>
>> So with this patch, the following transformations are applied:
>>
>> For non-masked negation with NEON:
>>
>> movi v16.4s, #0x0
>> sub v17.4s, v16.4s, v17.4s          ==> neg v17.4s, v17.4s
>>
>> and with SVE:
>>
>> mov z16.s, #0
>> sub z18.s, z16.s, z17.s             ==> neg z16.s, p7/m, z16.s
>>
>> For masked negation with NEON:
>>
>> movi v17.4s, #0x1
>> mvn v19.16b, v18.16b
>> mov v20.16b, v16.16b                ==> neg v18.4s, v17.4s
>> bsl v20.16b, v19.16b, v18.16b           bsl v19.16b, v18.16b, v17.16b
>> add v19.4s, v20.4s, v17.4s
>> mov v18.16b, v16.16b
>> bsl v18.16b, v19.16b, v20.16b
>>
>> and with SVE:
>>
>> mov z16.s, #-1
>> mov z17.s, #1                       ==> neg z16.s, p0/m, z16.s
>> eor z18.s, p0/m, z18.s, z16.s
>> add z18.s, p0/m, z18.s, z17.s
>>
>> Here are the performance gains for benchmarks (see [1][2]) on ARM and x86 machines (note that the non-masked negation benchmarks do not have any improvement on X86 since no instructions are changed):
>>
>> NEON:
>> Benchmark                    Gain
>> Byte128Vector.NEG            1.029
>> Byte128Vector.NEGMasked      1.757
>> Short128Vector.NEG           1.041
>> Short128Vector.NEGMasked     1.659
>> Int128Vector.NEG             1.005
>> Int128Vector.NEGMasked       1.513
>> Long128Vector.NEG            1.003
>> Long128Vector.NEGMasked      1.878
>>
>> SVE with 512-bits:
>> Benchmark                    Gain
>> ByteMaxVector.NEG            1.10
>> ByteMaxVector.NEGMasked      1.165
>> ShortMaxVector.NEG           1.056
>> ShortMaxVector.NEGMasked     1.195
>> IntMaxVector.NEG             1.002
>> IntMaxVector.NEGMasked       1.239
>> LongMaxVector.NEG            1.031
>> LongMaxVector.NEGMasked      1.191
>>
>> X86 (non AVX-512):
>> Benchmark                    Gain
>> ByteMaxVector.NEGMasked      1.254
>> ShortMaxVector.NEGMasked     1.359
>> IntMaxVector.NEGMasked       1.431
>> LongMaxVector.NEGMasked      1.989
>>
>> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1881
>> [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1896
>
> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision:
>
>   Make "degenerate_vector_integral_negate" to be "NegVI" private

Java changes are good.

-------------

Marked as reviewed by psandoz (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7782

From mcimadamore at openjdk.java.net Tue Mar 29 18:13:05 2022
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Tue, 29 Mar 2022 18:13:05 GMT
Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v12]
In-Reply-To:
References:
Message-ID:

> This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment.
>
> [1] - https://openjdk.java.net/jeps/424

Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains 27 additional commits since the last revision: - Use thread local storage to optimize attach of async threads - Drop support for Constable from MemoryLayout/FunctionDescriptor - Merge branch 'master' into foreign-preview - Revert changes to RunTests.gmk - Add --enable-preview to micro benchmark java options - Address more review comments - Update src/java.base/share/classes/java/lang/foreign/SymbolLookup.java Co-authored-by: Jorn Vernee - Update src/java.base/share/classes/java/lang/foreign/SymbolLookup.java Co-authored-by: Jorn Vernee - Address review comments - Update src/java.base/share/classes/java/lang/foreign/MemorySegment.java Co-authored-by: Jorn Vernee - ... and 17 more: https://git.openjdk.java.net/jdk/compare/02333d66...55aee872 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/504b564a..55aee872 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=10-11 Stats: 99257 lines in 1550 files changed: 79659 ins; 15544 del; 4054 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Tue Mar 29 18:23:41 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 29 Mar 2022 18:23:41 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v13] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. 
> > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Switch to daemon threads for async upcalls ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/55aee872..43dc6be3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=11-12 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From kvn at openjdk.java.net Tue Mar 29 18:52:59 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 29 Mar 2022 18:52:59 GMT Subject: RFR: 8280872: Reorder code cache segments to improve code density [v8] In-Reply-To: References: Message-ID: On Tue, 29 Mar 2022 17:53:17 GMT, Volker Simonis wrote: > > My performance testing shows that there is regression in some startup benchmarks on aarch64. > > Is there a chance to share what you are using as "startup benchmarks"? I think they would be definitely useful for others as well if they don't contain proprietary code. I would defer this question to @ericcaspole and @cl4es. I don't know how these benchmarks run and which one are open. But I agree that we can share how we run open startup benchmarks with community if we did not do it already. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7517 From kvn at openjdk.java.net Tue Mar 29 20:27:56 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 29 Mar 2022 20:27:56 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v6] In-Reply-To: <4f4q-PLj6psH50mRQCVLAX8bMjzF4XWzAornt_t4PNE=.4331b19d-469b-4e0d-8a1f-d1eeb5aaf9ed@github.com> References: <4f4q-PLj6psH50mRQCVLAX8bMjzF4XWzAornt_t4PNE=.4331b19d-469b-4e0d-8a1f-d1eeb5aaf9ed@github.com> Message-ID: On Mon, 14 Mar 2022 06:13:30 GMT, Pengfei Li wrote: >> ### Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop to a 1-iteration vectorized loop with vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. However, it was disabled soon after an issue >> was found. Due to insufficient maintenance in these years, multiple bugs >> have been accumulated inside. But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable >> post loop vectorization. >> >> ### Changes in this patch >> >> This patch reworks post loop vectorization. The most significant change >> is removing vector mask support in C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. >> >> Previous implementation hard-codes x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions. 
To fix incorrect codegen >> and add vector mask support for more platforms, we turn to add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This >> IR form is exactly the same to those which are used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. >> >> **I) C2 crashes with segmentation fault in strip-mined loops** >> >> Previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. >> In C2's strip mined loops, post loop is not the sibling of the main loop >> in ideal loop tree. Instead, it's the sibling of the main loop's parent. >> This patch fixed a SIGSEGV issue caused by NULL pointer when locating >> post loop from strip-mined main loop. >> >> **II) Incorrect result issues with post loop vectorization** >> >> We have also fixed five incorrect vectorization issues. Some of them are >> hidden deep and can only be reproduced with corner cases. These issues >> have a common cause that it assumes the post loop can be vectorized if >> the vectorization in corresponding main loop is successful. But in many >> cases this assumption is wrong. Below are details. >> >> - **[Issue-1] Incorrect vectorization for partial vectorizable loops** >> >> This issue can be reproduced by below loop where only some operations in >> the loop body are vectorizable. 
>> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword can work well if parts of the operations in >> loop body are not vectorizable since those parts can be unrolled only. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we are doing scalars to vectors >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of cases, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to post loop and compared with post loop >> pack count. Vectorization will be bailed out in post loop if it creates >> more vector packs than in the main loop. >> >> - **[Issue-2] Incorrect result in loops with growing-down vectors** >> >> This issue appears with growing-down vectors, that is, vectors that grow >> to smaller memory address as the loop iterates. It can be reproduced by >> below counting-up loop with negative scale value in array index. >> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> - **[Issue-3] Incorrect result in manually unrolled loops** >> >> This issue can be reproduced by below manually unrolled loop. 
>> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> - **[Issue-4] Incorrect result in loops with mixed vector element sizes** >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. >> >> - **[Issue-5] Incorrect result in loops with potential data dependence** >> >> This issue can be reproduced by below corner case on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. 
But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop has the same array index expression. >> >> ### Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's forced compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantages of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> need to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." 
and "-XX:+WhiteBoxAPI".
>>
>> ### Summary & Future work
>>
>> In this patch, we reworked post loop vectorization. We made it platform
>> independent and fixed several issues inside. We also implemented a new
>> vectorization testing framework with many test cases inside. Meanwhile,
>> we did some code cleanups.
>>
>> This patch only touches C2 code guarded with PostLoopMultiversioning,
>> except a few data structure changes. So, there's no behavior change when
>> experimental VM option PostLoopMultiversioning is off. Also, to reduce
>> risks, we still propose to keep post loop vectorization experimental for
>> now. But if it receives positive feedback, we would like to change it to
>> non-experimental in the future.
>
> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision:
>
>   Update a few comments

Nice work. I have a few comments.

src/hotspot/share/opto/superword.cpp line 2754:

> 2752:       swptrs.append(mem_p);
> 2753:     }
> 2754:     assert(is_java_primitive(bt), "only primitive types are allowed in post loop");

Should this be a bailout check instead of an assert, placed before the `(n->is_Mem())` check?

src/hotspot/share/opto/superword.cpp line 2774:

> 2772:   if (unique_size * vlen != MaxVectorSize) {
> 2773:     return NULL;
> 2774:   }

Why not smaller than MaxVectorSize? I don't think SuperWord guarantees that vectors in the main loop are equal to MaxVectorSize. Or is this simply a restriction to simplify the optimization?

src/hotspot/share/opto/superword.cpp line 2791:

> 2789:   // vector drain loop which is cloned from main loop before super-unrolling
> 2790:   // so the scalar post loop runs at most vlen-1 trips. Hence, this version
> 2791:   // only runs at most 1 iteration after vector mask transformation.

Where is the check that `cl->is_rce_post_loop()` runs only `vlen-1` trips?
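
For readers following the `vlen-1` discussion, here is a minimal scalar model of the idea in Java. It is only a sketch under stated assumptions, not C2's implementation: `VLEN`, `maskGen`, and `multiply` are hypothetical names, the "vector" is simulated with an inner loop over lanes, and the bound on leftover elements holds here only because of how the main loop condition is written, which is exactly the guarantee the review question asks about for the real code.

```java
// Scalar model of a main loop followed by a single masked post-loop iteration.
// All names here are hypothetical; this does not use or mirror C2 internals.
class MaskedPostLoopSketch {
    static final int VLEN = 8; // assumed number of lanes per vector

    // Stand-in for a mask generator: lane k is active iff k < remaining.
    static boolean[] maskGen(int remaining) {
        boolean[] mask = new boolean[VLEN];
        for (int lane = 0; lane < VLEN; lane++) {
            mask[lane] = lane < remaining;
        }
        return mask;
    }

    static int[] multiply(int[] a, int[] b) {
        int n = a.length;
        int[] res = new int[n];
        int i = 0;
        // Main loop: processes full groups of VLEN elements only.
        for (; i + VLEN <= n; i += VLEN) {
            for (int lane = 0; lane < VLEN; lane++) {
                res[i + lane] = a[i + lane] * b[i + lane];
            }
        }
        // At this point 0 <= n - i <= VLEN - 1, so one masked "vector"
        // iteration covers everything that is left.
        boolean[] mask = maskGen(n - i);
        for (int lane = 0; lane < VLEN; lane++) {
            if (mask[lane]) { // inactive lanes never touch memory
                res[i + lane] = a[i + lane] * b[i + lane];
            }
        }
        return res;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
        int[] b = {2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2};
        System.out.println(java.util.Arrays.toString(multiply(a, b)));
    }
}
```

If the leftover count could exceed `VLEN - 1`, a single masked iteration would silently skip elements, which is why the bound needs to be established rather than assumed.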
src/hotspot/share/opto/superword.cpp line 2806: > 2804: Node* offset = new ConvI2LNode(trip_cnt); > 2805: _igvn.register_new_node_with_optimizer(offset); > 2806: Node* vmask = VectorMaskGenNode::make(offset, vmask_bt); I think `offset` should be `length` here. src/hotspot/share/opto/superword.hpp line 249: > 247: // -------------------------VectorLaneSizeStats------------------------- > 248: // Vector lane size statistics for loop vectorization with vector masks > 249: class VectorLaneSizeStats { Can you rename `Lane` to `Element` which what we use in other places in HotSpot? ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From duke at openjdk.java.net Tue Mar 29 22:07:49 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Tue, 29 Mar 2022 22:07:49 GMT Subject: RFR: 8283726: x86 intrinsics for compare method in Integer and Long In-Reply-To: References: Message-ID: On Sun, 27 Mar 2022 06:57:58 GMT, Quan Anh Mai wrote: > This is both complicated and inefficient, I would suggest building the intrinsic in the IR graph so that the compiler can simplify `Integer.compareUnsigned(x, y) < 0` into `x u< y`. Thanks. Thank you for the suggestion! This intrinsic uses 1 cmp instruction instead of two and shows ~10% improvement due to better branch prediction. Even without the intrinsic, the compiler is currently able to reduce it to x u< y but is still generating two cmp (unsigned) instructions as Integer.compareUnsigned(x, y) is implemented as x u< y? -1 : (x ==y ? 0 : 1). ------------- PR: https://git.openjdk.java.net/jdk/pull/7975 From duke at openjdk.java.net Tue Mar 29 22:07:52 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Tue, 29 Mar 2022 22:07:52 GMT Subject: RFR: 8283726: x86 intrinsics for compare method in Integer and Long In-Reply-To: References: Message-ID: On Tue, 29 Mar 2022 02:24:21 GMT, Jatin Bhateja wrote: >> Implements x86 intrinsics for compare() method in java.lang.Integer and java.lang.Long. 
>
> src/hotspot/cpu/x86/x86_64.ad line 12107:
>
>> 12105: instruct compareSignedI_rReg(rRegI dst, rRegI op1, rRegI op2, rRegI tmp, rFlagsReg cr)
>> 12106: %{
>> 12107:   match(Set dst (CompareSignedI op1 op2));
>
> Please also include these patterns in x86_32.ad

Will update x86_32.ad as well.

> src/hotspot/cpu/x86/x86_64.ad line 12125:
>
>> 12123:     __ movl(tmp, 0);
>> 12124:     __ bind(done);
>> 12125:     __ movl(dest, tmp);
>
> Please move this into a macro-assembly routine.

Sure, will refactor it into a macro-assembly routine.

> src/hotspot/cpu/x86/x86_64.ad line 12178:
>
>> 12176:     __ movl(tmp, 0);
>> 12177:     __ bind(done);
>> 12178:     __ movl(dest, tmp);
>
> Please move this into a macro-assembly routine.

Sure, will do that and update it soon.

> src/hotspot/cpu/x86/x86_64.ad line 12204:
>
>> 12202:     __ movl(tmp, 0);
>> 12203:     __ bind(done);
>> 12204:     __ movl(dest, tmp);
>
> Please move this into a macro-assembly routine.

Sure, will do that and update it soon.

> src/hotspot/share/opto/comparenode.hpp line 67:
>
>> 65:   CompareUnsignedLNode(Node* in1, Node* in2) : CompareNode(in1, in2) {}
>> 66:   virtual int Opcode() const;
>> 67: };
>
> Intent here seems to be to enable further auto-vectorization of newly created IR nodes.

Yes, that is the intention.

> test/micro/org/openjdk/bench/java/lang/CompareInteger.java line 78:
>
>> 76:                 input2[i] = tmp;
>> 77:             }
>> 78:         }
>
> Logic re-organization suggestion:-
>
>
> for (int i = 0 ; i < BUFFER_SIZE; i++) {
>     input1[i] = rng.nextLong();
> }
>
> if (mode.equals("equals")) {
>     GRADIANT = 0;
> } else if (mode.equals("greaterThanEquals")) {
>     GRADIANT = 1;
> } else {
>     assert mode.equals("lessThanEqual");
>     GRADIANT = -1;
> }
>
> for(int i = 0 ; i < BUFFER_SIZE; i++) {
>     input2[i] = input1[i] + i*GRADIANT;
> }

The suggested refactoring is definitely elegant but one rare possibility is overflow due to the addition/subtraction. The swap logic doesn't have that problem.
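
The overflow concern can be made concrete with a small Java sketch. This is an illustration under assumptions, not code from the benchmark: `GradientOverflowDemo` and `derive` are hypothetical names, `long` operands are assumed because the suggestion calls `rng.nextLong()`, and the base value is hand-picked to sit near `Long.MAX_VALUE`, which a random generator would produce only rarely.

```java
// Hypothetical demo: deriving the second operand as base + i * gradient
// can wrap around and invert the intended ordering.
class GradientOverflowDemo {
    // Mirrors the shape of "input1[i] + i*GRADIANT" using wrapping long arithmetic.
    static long derive(long base, int i, int gradient) {
        return base + (long) i * gradient;
    }

    public static void main(String[] args) {
        // Typical case: the derived value is greater than the base, as intended.
        long ok = derive(10L, 5, 1);
        System.out.println(Long.compare(ok, 10L) > 0); // prints true

        // Rare case: the base is within i of Long.MAX_VALUE, so the sum wraps
        // to a negative number and the "greater than" relation is lost.
        long base = Long.MAX_VALUE - 1;
        long wrapped = derive(base, 5, 1);
        System.out.println(Long.compare(wrapped, base) > 0); // prints false

        // Math.addExact makes the same overflow visible as an exception.
        try {
            Math.addExact(base, 5L);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

Swapping two independently generated values, as the existing benchmark does, never performs arithmetic on the operands, so it cannot run into this case.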
> test/micro/org/openjdk/bench/java/lang/CompareLong.java line 5: > >> 3: * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >> 4: * >> 5: * This code is free software; you can redistribute it and/or modify it > > We can unify this benchmark along with integer compare micro. Sure, will do the unification. ------------- PR: https://git.openjdk.java.net/jdk/pull/7975 From dlong at openjdk.java.net Tue Mar 29 22:15:43 2022 From: dlong at openjdk.java.net (Dean Long) Date: Tue, 29 Mar 2022 22:15:43 GMT Subject: RFR: 8283494: Factor out calculation of actual number of XMM registers In-Reply-To: References: Message-ID: On Wed, 23 Mar 2022 08:52:41 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that factors out calculation of the actually available number of XMM registers on a given processor/given command line options into a method and reuse that as much as possible? > > This relates to this code snippet: > > int xmm_bypass_limit = FrameMap::nof_xmm_regs; > #ifdef _LP64 > if (UseAVX < 3) { > xmm_bypass_limit = xmm_bypass_limit / 2; > } > #endif > > > Also, there is already the method `FrameMap::get_num_caller_save_xmms()` that has been updated to use that new method `XMMRegisterImpl::available_xmm_registers()`; further I tried to appropriately use `FrameMap::get_num_caller_save_xmms` in the places where either would work. Please have a look in particular about that. > > I did not change strange to me variable names like `xmm_bypass_limit` above as they probably make some sense to somebody as it's used quite often. > > This also fixes a compilation error without configuring C1 introduced with [JDK-8283327](https://bugs.openjdk.java.net/browse/JDK-8283327). > > Testing: tier1-5 (all but linux-aarch64 done), gha, local compilation with `configure --with-features=-compiler1`. > > Thanks, > Thomas LGTM. ------------- Marked as reviewed by dlong (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7917

From kvn at openjdk.java.net Tue Mar 29 23:07:18 2022
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Tue, 29 Mar 2022 23:07:18 GMT
Subject: RFR: 8283494: Factor out calculation of actual number of XMM registers
In-Reply-To:
References:
Message-ID:

On Wed, 23 Mar 2022 08:52:41 GMT, Thomas Schatzl wrote:

> Hi all,
>
> can I have reviews for this change that factors out calculation of the actually available number of XMM registers on a given processor/given command line options into a method and reuse that as much as possible?
>
> This relates to this code snippet:
>
> int xmm_bypass_limit = FrameMap::nof_xmm_regs;
> #ifdef _LP64
> if (UseAVX < 3) {
>   xmm_bypass_limit = xmm_bypass_limit / 2;
> }
> #endif
>
>
> Also, there is already the method `FrameMap::get_num_caller_save_xmms()` that has been updated to use that new method `XMMRegisterImpl::available_xmm_registers()`; further I tried to appropriately use `FrameMap::get_num_caller_save_xmms` in the places where either would work. Please have a look in particular about that.
>
> I did not change strange to me variable names like `xmm_bypass_limit` above as they probably make some sense to somebody as it's used quite often.
>
> This also fixes a compilation error without configuring C1 introduced with [JDK-8283327](https://bugs.openjdk.java.net/browse/JDK-8283327).
>
> Testing: tier1-5 (all but linux-aarch64 done), gha, local compilation with `configure --with-features=-compiler1`.
>
> Thanks,
> Thomas

I thought I had commented on this already, but I can't find a record of it.
Looks good. I have only one comment.

src/hotspot/cpu/x86/register_x86.hpp line 169:

> 167:   // Actually available XMM registers for use, depending on actual CPU capabilities
> 168:   // and flags.
> 169:   static int available_xmm_registers();

Why not define the function's body here?
-------------

PR: https://git.openjdk.java.net/jdk/pull/7917

From duke at openjdk.java.net Tue Mar 29 23:30:38 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Tue, 29 Mar 2022 23:30:38 GMT
Subject: RFR: 8283726: x86 intrinsics for compare method in Integer and Long
In-Reply-To:
References:
Message-ID: <-6o83x73qUUHraAA9swfhdp-G8PGu9xOvVTLGeOcGtI=.b8db34be-79fd-4a7d-94e6-6a44fd2a4892@github.com>

On Tue, 29 Mar 2022 21:56:18 GMT, Vamsi Parasa wrote:

>> This is both complicated and inefficient, I would suggest building the intrinsic in the IR graph so that the compiler can simplify `Integer.compareUnsigned(x, y) < 0` into `x u< y`. Thanks.
>
>> This is both complicated and inefficient, I would suggest building the intrinsic in the IR graph so that the compiler can simplify `Integer.compareUnsigned(x, y) < 0` into `x u< y`. Thanks.
>
> Thank you for the suggestion! This intrinsic uses 1 cmp instruction instead of two and shows ~10% improvement due to better branch prediction. Even without the intrinsic, the compiler is currently able to reduce it to x u< y but is still generating two cmp (unsigned) instructions as Integer.compareUnsigned(x, y) is implemented as x u< y? -1 : (x ==y ? 0 : 1).

@vamsi-parasa But normally the result of the `compare` methods is not used as a raw integer (it is useless to do so since the methods do not make any promise regarding the value of the result, only its sign). The idiom is to compare the result with 0, such as `Integer.compare(x, y) > 0`, which the compiler can reduce to `x > y` (last time I checked it does not do so, but in principle this is possible). Your intrinsic prevents the compiler from doing such transformations, hurting the performance of real programs.

> Even without the intrinsic, the compiler is currently able to reduce it to x u< y

It is because the compiler can recognise the pattern `x + MIN_VALUE < y + MIN_VALUE` and transform it into `x u< y`.
This transformation is fragile, however: if one of the arguments is in the form `x + con`, constant propagation may lead to slight deviations from the recognised patterns and defeat the transformation. As a result, it may be justifiable to have a dedicated intrinsic for that since unsigned comparisons are pretty basic operations that are needed for optimal range check performance. Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7975

From njian at openjdk.java.net Wed Mar 30 01:24:45 2022
From: njian at openjdk.java.net (Ningsheng Jian)
Date: Wed, 30 Mar 2022 01:24:45 GMT
Subject: RFR: 8282162: [vector] Optimize integral vector negation API [v3]
In-Reply-To: <-sL2chUr7o8qwgvCm4qS_teHYtUu9rVGUa2GFWb9yC8=.14b0f006-56dc-4245-9746-ff248478e9ed@github.com>
References: <-sL2chUr7o8qwgvCm4qS_teHYtUu9rVGUa2GFWb9yC8=.14b0f006-56dc-4245-9746-ff248478e9ed@github.com>
Message-ID: <7Vq50g457PgQqxorybsnpP2N4frte8_-AKtJCHY5OdU=.cdf5b05c-fc17-4c63-b522-8866aa3bceb9@github.com>

On Mon, 28 Mar 2022 09:56:22 GMT, Xiaohong Gong wrote:

>> The current vector `"NEG"` is implemented by subtracting the vector from zero in case the architecture does not support the negation instruction. And to fit the predicate feature for architectures that support it, the masked vector `"NEG"` is implemented with pattern `"v.not(m).add(1, m)"`. They both can be optimized to a single negation instruction for ARM SVE.
>> And so can the non-masked "NEG" for NEON. Besides, implementing the masked "NEG" with subtraction for architectures that support neither negation instruction nor predicate feature can also save several instructions compared with the current pattern.
>>
>> To optimize the VectorAPI negation, this patch moves the implementation from Java side to hotspot. The compiler will generate different nodes according to the architecture:
>> - Generate the (predicated) negation node if architecture supports it, otherwise, generate "`zero.sub(v)`" pattern for non-masked operation.
>> - Generate `"zero.sub(v, m)"` for masked operation if the architecture does not have predicate feature, otherwise generate the original pattern `"v.xor(-1, m).add(1, m)"`.
>>
>> So with this patch, the following transformations are applied:
>>
>> For non-masked negation with NEON:
>>
>>     movi    v16.4s, #0x0
>>     sub     v17.4s, v16.4s, v17.4s     ==>  neg     v17.4s, v17.4s
>>
>> and with SVE:
>>
>>     mov     z16.s, #0
>>     sub     z18.s, z16.s, z17.s        ==>  neg     z16.s, p7/m, z16.s
>>
>> For masked negation with NEON:
>>
>>     movi    v17.4s, #0x1
>>     mvn     v19.16b, v18.16b
>>     mov     v20.16b, v16.16b           ==>  neg     v18.4s, v17.4s
>>     bsl     v20.16b, v19.16b, v18.16b       bsl     v19.16b, v18.16b, v17.16b
>>     add     v19.4s, v20.4s, v17.4s
>>     mov     v18.16b, v16.16b
>>     bsl     v18.16b, v19.16b, v20.16b
>>
>> and with SVE:
>>
>>     mov     z16.s, #-1
>>     mov     z17.s, #1                  ==>  neg     z16.s, p0/m, z16.s
>>     eor     z18.s, p0/m, z18.s, z16.s
>>     add     z18.s, p0/m, z18.s, z17.s
>>
>> Here are the performance gains for benchmarks (see [1][2]) on ARM and x86 machines (note that the non-masked negation benchmarks do not have any improvement on X86 since no instructions are changed):
>>
>> NEON:
>> Benchmark                 Gain
>> Byte128Vector.NEG         1.029
>> Byte128Vector.NEGMasked   1.757
>> Short128Vector.NEG        1.041
>> Short128Vector.NEGMasked  1.659
>> Int128Vector.NEG          1.005
>> Int128Vector.NEGMasked    1.513
>> Long128Vector.NEG         1.003
>> Long128Vector.NEGMasked   1.878
>>
>> SVE with 512-bits:
>> Benchmark                 Gain
>> ByteMaxVector.NEG         1.10
>> ByteMaxVector.NEGMasked   1.165
>> ShortMaxVector.NEG        1.056
>> ShortMaxVector.NEGMasked  1.195
>> IntMaxVector.NEG          1.002
>> IntMaxVector.NEGMasked    1.239
>> LongMaxVector.NEG         1.031
>> LongMaxVector.NEGMasked   1.191
>>
>> X86 (non AVX-512):
>> Benchmark                 Gain
>> ByteMaxVector.NEGMasked   1.254
>> ShortMaxVector.NEGMasked  1.359
>> IntMaxVector.NEGMasked    1.431
>> LongMaxVector.NEGMasked   1.989
>>
>> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1881
>> [2]
https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/Byte128Vector.java#L1896
>
> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision:
>
>   Make "degenerate_vector_integral_negate" to be "NegVI" private

AArch64 changes look good to me.

-------------

Marked as reviewed by njian (Committer).

PR: https://git.openjdk.java.net/jdk/pull/7782

From xgong at openjdk.java.net Wed Mar 30 01:31:39 2022
From: xgong at openjdk.java.net (Xiaohong Gong)
Date: Wed, 30 Mar 2022 01:31:39 GMT
Subject: RFR: 8282162: [vector] Optimize integral vector negation API [v3]
In-Reply-To:
References: <-sL2chUr7o8qwgvCm4qS_teHYtUu9rVGUa2GFWb9yC8=.14b0f006-56dc-4245-9746-ff248478e9ed@github.com>
Message-ID:

On Tue, 29 Mar 2022 18:05:56 GMT, Paul Sandoz wrote:

>> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Make "degenerate_vector_integral_negate" to be "NegVI" private
>
> Java changes are good.

Thanks for the review @PaulSandoz @nsjian !

-------------

PR: https://git.openjdk.java.net/jdk/pull/7782

From xgong at openjdk.java.net Wed Mar 30 01:42:41 2022
From: xgong at openjdk.java.net (Xiaohong Gong)
Date: Wed, 30 Mar 2022 01:42:41 GMT
Subject: Integrated: 8282162: [vector] Optimize integral vector negation API
In-Reply-To:
References:
Message-ID: <5MWaq1vZaLt2tM_0lFQ_QJcHxe5iWZ_VTNFYsUmmGm4=.1b73fdb1-2450-4419-ab63-fe851659cbd3@github.com>

On Fri, 11 Mar 2022 06:29:22 GMT, Xiaohong Gong wrote:

> The current vector `"NEG"` is implemented by subtracting the vector from zero in case the architecture does not support the negation instruction. And to fit the predicate feature for architectures that support it, the masked vector `"NEG"` is implemented with pattern `"v.not(m).add(1, m)"`. They both can be optimized to a single negation instruction for ARM SVE.

This pull request has now been integrated.

Changeset: d0668568
Author:    Xiaohong Gong
Committer: Jie Fu
URL:       https://git.openjdk.java.net/jdk/commit/d06685680c17583d56dc3d788d9a2ecea8812bc8
Stats:     325 lines in 15 files changed: 275 ins; 25 del; 25 mod

8282162: [vector] Optimize integral vector negation API

Reviewed-by: jiefu, psandoz, njian

-------------

PR: https://git.openjdk.java.net/jdk/pull/7782

From fyang at openjdk.java.net Wed Mar 30 02:35:06 2022
From: fyang at openjdk.java.net (Fei Yang)
Date: Wed, 30 Mar 2022 02:35:06 GMT
Subject: RFR: 8283907: Fix Huawei copyright in various files
Message-ID:

Please review this trivial fix that adds a missing comma for the company name.
Thanks, Felix ------------- Commit messages: - 8283907: Fix Huawei copyright in various files Changes: https://git.openjdk.java.net/jdk/pull/8029/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8029&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283907 Stats: 12 lines in 12 files changed: 0 ins; 0 del; 12 mod Patch: https://git.openjdk.java.net/jdk/pull/8029.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8029/head:pull/8029 PR: https://git.openjdk.java.net/jdk/pull/8029 From mli at openjdk.java.net Wed Mar 30 02:44:38 2022 From: mli at openjdk.java.net (Hamlin Li) Date: Wed, 30 Mar 2022 02:44:38 GMT Subject: RFR: 8283907: Fix Huawei copyright in various files In-Reply-To: References: Message-ID: On Wed, 30 Mar 2022 02:27:27 GMT, Fei Yang wrote: > Please review this trivial fix that adds a missing comma for the company name. > > Thanks, > Felix looks good and trivial ------------- Marked as reviewed by mli (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/8029 From stuefe at openjdk.java.net Wed Mar 30 04:46:50 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 30 Mar 2022 04:46:50 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: On Mon, 28 Feb 2022 16:22:25 GMT, Christian Hagedorn wrote: >> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: >> >> Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 >> V [libjvm.so+0x82f067] 
CompileBroker::compiler_thread_loop()+0x3df >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f >> >> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. >> >> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): >> >> Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) >> >> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. 
This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. >> >> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf >> I added references to the corresponding sections throughout the code. However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. >> >> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. >> >> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. >> >> **Testing:** >> Apart from manual testing, I've added two kinds of tests: >> - A JTreg test: Spawns new VMs to let them crash in various ways. 
The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. >> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. >> >> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. >> >> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. >> >> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). 
>> >> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 54 commits: > > - Updating some comments > - Cleanup loading dwarf file and add summary > - Review comments of first pass by Thomas except dwarf file loading > - Merge branch 'master' into JDK-8242181 > - Make dwarf tag NOT_PRODUCT > - Change log_* to log_develop_* and log_warning to log_develop_info > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Better formatting of trace output > - some code move and more cleanups > - ... and 44 more: https://git.openjdk.java.net/jdk/compare/efd3967b...5bea4841 test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java line 118: > 116: runAndCheck(new Flags(TestDwarf.class.getCanonicalName(), "nativeDereferenceNull"), > 117: new DwarfConstraint(0, "dereference_null", "libTestDwarfHelper.h", 44)); > 118: } Can you please pass `-XX:-CreateCoredumpOnCrash` on spawned sub processes? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From stuefe at openjdk.java.net Wed Mar 30 05:13:48 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 30 Mar 2022 05:13:48 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: On Mon, 28 Feb 2022 16:22:25 GMT, Christian Hagedorn wrote: >> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: >> >> Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f >> >> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. 
>> >> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): >> >> Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) >> >> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. >> >> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf >> I added references to the corresponding sections throughout the code. 
However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. >> >> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. >> >> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. >> >> **Testing:** >> Apart from manual testing, I've added two kinds of tests: >> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. 
>> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. >> >> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. >> >> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. >> >> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). >> >> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 54 commits: > > - Updating some comments > - Cleanup loading dwarf file and add summary > - Review comments of first pass by Thomas except dwarf file loading > - Merge branch 'master' into JDK-8242181 > - Make dwarf tag NOT_PRODUCT > - Change log_* to log_develop_* and log_warning to log_develop_info > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Better formatting of trace output > - some code move and more cleanups > - ... and 44 more: https://git.openjdk.java.net/jdk/compare/efd3967b...5bea4841 We see test errors on Linux ppcle and x64 in gtests: [ RUN ] os_linux.decoder_get_source_info_valid_vm ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From chagedorn at openjdk.java.net Wed Mar 30 06:41:36 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Wed, 30 Mar 2022 06:41:36 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5] In-Reply-To: References: Message-ID: <7N3Qtv4nbf3PYJYI5ywgDWIUHB_8EGGTgCTc4tJxua4=.183f3c4a-1437-4e8f-9c02-dd00c123ac29@github.com> On Mon, 28 Mar 2022 13:14:50 GMT, Thomas Stuefe wrote: > this is impressive work. It's a big change, and I had a look at part of it. I'll continue tomorrow. Thanks a lot Thomas for your careful review! I'm in the process of working through your comments and will come back with an update today or later this week. > In general, I'm concerned with the use of both UL and ResourceArea in this code. I know the use of UL has been discussed, but still. 
I agree that it is problematic but I think it would be good to keep some logging around when later coming back to the parser code (and that's the only reason I think that you ever want to turn these logs on). I can currently think of two options:

- Leave UL in and just guard it with an additional new develop flag to exclude the logs from unfiltered UL logging. This would allow us to kinda accept the risks for debugging purposes. That's not really a good design, but we could keep the log levels with their time stamps.
- Replace all UL calls with `tty` and also guard them with a new develop flag and play around with `Verbose` and `WizardMode` to keep the different log levels. That's not great either but I think it's safer to use and we only want the logs on rare occasions anyways - so it might be acceptable to use these verbose flags even though we should generally get away from them.

> Use of RA will prevent us from getting useful callstacks if we crash and Thread::current is NULL or invalid. I'd feel better if we were to consistently rely on an outside scratch buffer (like we usually do in error reporting). Even raw ::malloc would be better IMHO.

The idea of a scratch buffer sounds good. I'll check if I can replace all the `NEW_RESOURCE_ARRAY` usages with it.

> Another concern was safety, since this is a potential attack vector with manipulated Dwarf files, if someone manages to provoke a crash. Maybe far fetched, but still. Would be good to get SonarCloud readings for this code e.g.

I was also concerned about that and I'm very thankful that you've spotted some issues already! I think minimizing the risk of a potential attack should be a top priority. We should definitely add some more checks. What do you think about the usage of `_JVM_DWARF_PATH` to load a DWARF file? I'm not sure how safe it is. I originally had it enabled for debug builds only.
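As one example of the kind of check that hardens a parser against manipulated input: DWARF encodes many fields as LEB128, and a reader for it can refuse to run past the end of the buffer or to accept over-long encodings. The sketch below is a generic, hypothetical helper; `read_uleb128` and its signature are assumptions for illustration and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reads one unsigned LEB128 value from buf[*pos .. len).
// On success stores the value in *out, advances *pos past it and returns true.
// On truncated or over-long input returns false and leaves *pos and *out
// untouched, so the caller can bail out gracefully instead of asserting.
inline bool read_uleb128(const uint8_t* buf, size_t len, size_t* pos, uint64_t* out) {
  uint64_t result = 0;
  unsigned shift = 0;
  size_t p = *pos;
  while (p < len) {
    const uint8_t byte = buf[p++];
    if (shift >= 64) {
      return false;  // more bytes than a 64-bit value can hold
    }
    if (shift == 63 && (byte & 0x7e) != 0) {
      return false;  // payload bits would be shifted out of the 64-bit result
    }
    result |= static_cast<uint64_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) {  // high bit clear: this was the last byte
      *pos = p;
      *out = result;
      return true;
    }
    shift += 7;
  }
  return false;  // ran off the end of the buffer before the value terminated
}
```

Returning a status instead of asserting fits the stated goal of omitting source information for a frame rather than aborting the hs_err output.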
> We see test errors on Linux ppcle and x64 in gtests:

Could you try running it with `-Xlog:dwarf=info/debug` in order to find out why it failed? It might not have found the symbols. Is the JTreg test `TestDwarf.java` working? But there is now another problem: since using GCC 11.2 (change done for Oracle builds with [JDK-8283057](https://bugs.openjdk.java.net/browse/JDK-8283057)), it emits unsupported DWARF 5 for some DWARF sections, at least on my machine, which is unfortunate. Maybe that's also the reason you see the failures if you use GCC 11.2. Maybe we can mitigate this problem by forcing GCC to use DWARF 4 for now. Could that be done by using the `-gdwarf` GCC flag? @erikj79

> We also see Problems in runtime/ErrorHandling and in jfr/jvm/TestDumpOnCrash. Mostly, these tests now have much longer runtimes (about factor 2). With TestDumpOnCrash, both the error file writer and the test itself timed out on some of our slower machines.

Are these timeouts on ppcle and x64? We could also try to add `-Xlog:dwarf=info/debug` to the runs to get some rough idea of the time required to parse DWARF. I'll have a look at these tests.

Thanks,
Christian

-------------

PR: https://git.openjdk.java.net/jdk/pull/7126

From chagedorn at openjdk.java.net Wed Mar 30 06:41:36 2022
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Wed, 30 Mar 2022 06:41:36 GMT
Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v5]
In-Reply-To: <7N3Qtv4nbf3PYJYI5ywgDWIUHB_8EGGTgCTc4tJxua4=.183f3c4a-1437-4e8f-9c02-dd00c123ac29@github.com>
References: <7N3Qtv4nbf3PYJYI5ywgDWIUHB_8EGGTgCTc4tJxua4=.183f3c4a-1437-4e8f-9c02-dd00c123ac29@github.com>
Message-ID:

On Wed, 30 Mar 2022 06:35:27 GMT, Christian Hagedorn wrote:

>> Hi Christian,
>>
>> this is impressive work. It's a big change, and I had a look at part of it. I'll continue tomorrow.
>>
>> In general, I'm concerned with the use of both UL and ResourceArea in this code.
I know the use of UL has been discussed, but still. Use of RA will prevent us from getting useful callstacks if we crash and Thread::current is NULL or invalid. I'd feel better if we were to consistently rely on an outside scratch buffer (like we usually do in error reporting). Even raw ::malloc would be better IMHO. >> >> Another concern was safety, since this is a potential attack vector with manipulated Dwarf files, if someone manages to provove a crash. Maybe far fetched, but still. Would be good to get SonarCloud readings for this code e.g. >> >> More remarks inline. >> >> Cheers, Thomas > >> this is impressive work. It's a big change, and I had a look at part of it. I'll continue tomorrow. > > Thanks a lot Thomas for your careful review! I'm in the process of working through your comments and will come back with an update today or later this week. > >> In general, I'm concerned with the use of both UL and ResourceArea in this code. I know the use of UL has been discussed, but still. > > I agree that it is problematic but I think it would be good to keep some logging around when later coming back to the parser code (and that's the only reason I think that you ever want to turn these logs on). I can currently think of two options: > > - Leave UL in and just guard it with an additional new develop flag to exclude the logs from unfiltered UL logging. This would allow us to kinda accept the risks for debugging purposes. That's not really a good design though but we could keep the log levels with their time stamps. > - Replace all UL calls with `tty` and also guard them with a new develop flag and play around with `Verbose` and `WizardMode` to keep the different log levels. That's not great either but I think it's safer to use and we only want the logs on rare occasions anyways - so it might be acceptable to use these verbose flags even though we should generally get away from them. 
> >> Use of RA will prevent us from getting useful callstacks if we crash and Thread::current is NULL or invalid. I'd feel better if we were to consistently rely on an outside scratch buffer (like we usually do in error reporting). Even raw ::malloc would be better IMHO. > > The idea of a scratch buffer sounds good. I'll check if I can replace all the `NEW_RESOURCE_ARRAY` usages with it. > >> Another concern was safety, since this is a potential attack vector with manipulated Dwarf files, if someone manages to provoke a crash. Maybe far fetched, but still. Would be good to get SonarCloud readings for this code e.g. > > I was also concerned about that and I'm very thankful that you've spotted some issues already! I think minimizing the risk of a potential attack should be a top priority. We should definitely add some more checks. What do you think about the usage of `_JVM_DWARF_PATH` to load a DWARF file? I'm not sure how safe it is. I originally had it enabled for debug builds only. > >> We see test errors on Linux ppcle and x64 in gtests: > > Could you try running it with `-Xlog:dwarf=info/debug` in order to find out why it failed? It might not have found the symbols. Is the JTreg test `TestDwarf.java` working? But there is now another problem: since the switch to GCC 11.2 (change done for Oracle builds with [JDK-8283057](https://bugs.openjdk.java.net/browse/JDK-8283057)), the compiler emits unsupported DWARF 5 for some DWARF sections, at least on my machine, which is unfortunate. Maybe that's also the reason you see the failures if you use GCC 11.2. Maybe we can mitigate this problem by forcing GCC to use DWARF 4 for now. Could that be done by using the `-gdwarf` GCC flag? @erikj79 > >> We also see Problems in runtime/ErrorHandling and in jfr/jvm/TestDumpOnCrash. Mostly, these tests now have much longer runtimes (about factor 2). With TestDumpOnCrash, both the error file writer and the test itself timed out on some of our slower machines.
> > Are these timeouts on ppcle and x64? We could also try to add `-Xlog:dwarf=info/debug` to the runs to get some rough idea of the time required to parse DWARF. I'll have a look at these tests. > > Thanks, > Christian > @chhagedorn > > > There is still some code that could be shared though like opening a DWARF file with its checks or reading an LEB 128 etc. It might be worth investigating further if the two implementations can be merged/reused to some extent. But I propose to file a separate RFE for that. What do you think? > > Yeah, let's investigate it in another RFE. > > IMHO we can share some code about DWARF between HotSpot and SA, and also we might need a DWARF-based call frame parser in HotSpot because some 3rd-party native libraries don't use base pointer (RBP) to store SP due to optimization. In SA side, it would be useful if we can check native source file and line number in mixed jstack with your change. I see, then it makes sense to unify these parsers later. > > So I want to unify DWARF parser (processor) between HotSpot and SA, but it might be a long journey... thus I agree with you to file it as another RFE. Sounds good, I'll file an RFE and link it to this RFE. ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From fjiang at openjdk.java.net Wed Mar 30 07:09:59 2022 From: fjiang at openjdk.java.net (Feilong Jiang) Date: Wed, 30 Mar 2022 07:09:59 GMT Subject: RFR: JDK-8283865: riscv: Break down -XX:+UseRVB into seperate options for each bitmanip extension Message-ID: Currently openjdk riscv supports RISC-V bitmanip extension as a bundle while spec provides four individual extensions: Zb[abcs][1]. According to the spec, we need to break down `UseRVB` into two individual options `UseZba` and `UseZbb` to enable or disable Zba and Zbb respectively (openjdk riscv only supports Zba and Zbb for now).
Since multi-letter extensions representation in the ISA bitmap is still not determined [2][3], availability for those extensions could not be queried from HWCAP. Feature detection of Zba and Zbb was removed temporarily. Linux RISCV64 release hotspot/jdk tier1 tests are passed on QEMU with following options: - [x] +UseZba && +UseZbb - [x] +UseZba && -UseZbb - [x] -UseZba && +UseZbb [1]: https://github.com/riscv/riscv-bitmanip/releases/download/1.0.0/bitmanip-1.0.0-38-g865e7a7.pdf [2]: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010250.html [3]: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010252.html ------------- Commit messages: - riscv: Break down -XX:+UseRVB into seperate options for each bitmanip extension Changes: https://git.openjdk.java.net/jdk/pull/8032/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8032&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283865 Stats: 132 lines in 8 files changed: 6 ins; 12 del; 114 mod Patch: https://git.openjdk.java.net/jdk/pull/8032.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8032/head:pull/8032 PR: https://git.openjdk.java.net/jdk/pull/8032 From thartmann at openjdk.java.net Wed Mar 30 08:20:47 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Wed, 30 Mar 2022 08:20:47 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals [v3] In-Reply-To: <0sQRhgsM2oRxLZwttYTCstpSGkO1kDeQAKTwsbVYsLA=.7eecf031-00e1-43ce-bc60-f6e0d2a6052e@github.com> References: <0sQRhgsM2oRxLZwttYTCstpSGkO1kDeQAKTwsbVYsLA=.7eecf031-00e1-43ce-bc60-f6e0d2a6052e@github.com> Message-ID: <11dadXE8iivpZJVk8rbI_MCkHO0OBUE4t_dsrx7Fo_E=.b6ebd65e-2131-4e3d-995e-f3fb49cfaf5a@github.com> On Fri, 2 Jul 2021 09:55:59 GMT, Wang Huang wrote: >> Wang Huang has updated the pull request incrementally with one additional commit since the last revision: >> >> unroll when small string sizes > >> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on 
[hotspot-dev](mailto:hotspot-dev at mail.openjdk.java.net):_ >> >> I had to make some changes to the benchmark to get accurate timing, because >> it is swamped by JMH overhead for very small strings. >> >> It should be clear from my patch what I did. The most important part is >> to run the test code in a loop, or you won't see small effects. We're >> trying to measure something that only takes a few nanoseconds. >> >> This is what I see, Apple M1, two equal strings:
>>
>> Old:
>>
>> StringEquals.equal 8 avgt 5 0.948 ± 0.001 us/op
>> StringEquals.equal 11 avgt 5 0.948 ± 0.004 us/op
>> StringEquals.equal 16 avgt 5 0.948 ± 0.001 us/op
>> StringEquals.equal 22 avgt 5 1.260 ± 0.002 us/op
>> StringEquals.equal 32 avgt 5 1.886 ± 0.001 us/op
>> StringEquals.equal 45 avgt 5 2.514 ± 0.001 us/op
>> StringEquals.equal 64 avgt 5 3.141 ± 0.003 us/op
>> StringEquals.equal 91 avgt 5 4.395 ± 0.002 us/op
>> StringEquals.equal 121 avgt 5 5.653 ± 0.014 us/op
>> StringEquals.equal 181 avgt 5 8.011 ± 0.010 us/op
>> StringEquals.equal 256 avgt 5 11.433 ± 0.014 us/op
>> StringEquals.equal 512 avgt 5 23.005 ± 0.124 us/op
>> StringEquals.equal 1024 avgt 5 49.185 ± 0.032 us/op
>>
>> Your patch:
>>
>> Benchmark (size) Mode Cnt Score Error Units
>> StringEquals.equal 8 avgt 5 1.574 ± 0.001 us/op
>> StringEquals.equal 11 avgt 5 1.734 ± 0.004 us/op
>> StringEquals.equal 16 avgt 5 1.888 ± 0.002 us/op
>> StringEquals.equal 22 avgt 5 1.892 ± 0.003 us/op
>> StringEquals.equal 32 avgt 5 2.517 ± 0.003 us/op
>> StringEquals.equal 45 avgt 5 2.988 ± 0.002 us/op
>> StringEquals.equal 64 avgt 5 2.517 ± 0.003 us/op
>> StringEquals.equal 91 avgt 5 8.659 ± 0.007 us/op
>> StringEquals.equal 121 avgt 5 5.649 ± 0.007 us/op
>> StringEquals.equal 181 avgt 5 6.050 ± 0.009 us/op
>> StringEquals.equal 256 avgt 5 7.088 ± 0.016 us/op
>> StringEquals.equal 512 avgt 5 14.163 ± 0.018 us/op
>> StringEquals.equal 1024 avgt 5 29.998 ±
0.052 us/op >> >> As you can see, we're looking at regressions all the way up to size=45, >> with something very odd happening at size=91. Finally the vectorized >> code starts to pull ahead at size=181. >> >> A few things: >> >> You should never be executing the TAIL unless the string is really >> short. Just do one pair of unaligned loads at the end to finish. >> >> Please don't use aliases for rscratch1 and rscratch2. Calling them tmp1 >> and tmp2 doesn't help the reader. >> >> So: please make sure the smaller strings are at least as good as >> they are now. Remember strings are usually short, so we can tolerate >> no regressions with the smaller sizes. >> >> I don't think that Neon does any good here. This is what I get by rewriting >> (just) the stub with scalar registers, in the attached patch:
>>
>> Benchmark (size) Mode Cnt Score Error Units
>> StringEquals.equal 8 avgt 5 1.574 ± 0.004 us/op
>> StringEquals.equal 11 avgt 5 1.734 ± 0.003 us/op
>> StringEquals.equal 16 avgt 5 1.888 ± 0.002 us/op
>> StringEquals.equal 22 avgt 5 1.891 ± 0.003 us/op
>> StringEquals.equal 32 avgt 5 2.517 ± 0.001 us/op
>> StringEquals.equal 45 avgt 5 2.988 ± 0.002 us/op
>> StringEquals.equal 64 avgt 5 2.595 ± 0.004 us/op
>> StringEquals.equal 91 avgt 5 4.083 ± 0.006 us/op
>> StringEquals.equal 121 avgt 5 5.432 ± 0.006 us/op
>> StringEquals.equal 181 avgt 5 6.292 ± 0.009 us/op
>> StringEquals.equal 256 avgt 5 7.232 ± 0.008 us/op
>> StringEquals.equal 512 avgt 5 13.304 ± 0.012 us/op
>> StringEquals.equal 1024 avgt 5 25.537 ± 0.012 us/op
>> >> I use an editor with automatic indentation, as do many people, so >> I inserted brackets in the right places in the assembly code. >> >> -- >> Andrew Haley (he/him) >> Java Platform Lead Engineer >> Red Hat UK Ltd. >> https://keybase.io/andrewhaley >> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 >> -------------- next part -------------- >> A non-text attachment was scrubbed...
>> Name: 8268229.patch >> Type: text/x-patch >> Size: 12464 bytes >> Desc: not available >> URL: > > @theRealAph Thank you for your suggestion. It's my fault that the JMH I used is not accurate. I changed my code and re-tested under your JMH:
>
> Before opt:
> Benchmark |(size)| Mode| Cnt | Score| Error |Units
> -------------------|------|-----|-----|-------|---------|-----
> StringEquals.equal | 8| avgt| 5 | 2.334|± 0.012 |us/op
> StringEquals.equal | 11| avgt| 5 | 2.335|± 0.012 |us/op
> StringEquals.equal | 16| avgt| 5 | 2.334|± 0.011 |us/op
> StringEquals.equal | 22| avgt| 5 | 3.414|± 0.422 |us/op
> StringEquals.equal | 32| avgt| 5 | 3.890|± 0.004 |us/op
> StringEquals.equal | 45| avgt| 5 | 5.610|± 0.023 |us/op
> StringEquals.equal | 64| avgt| 5 | 7.215|± 0.009 |us/op
> StringEquals.equal | 91| avgt| 5 | 12.305|± 1.716 |us/op
> StringEquals.equal | 121| avgt| 5 | 14.891|± 0.085 |us/op
> StringEquals.equal | 181| avgt| 5 | 21.502|± 0.050 |us/op
> StringEquals.equal | 256| avgt| 5 | 29.968|± 0.155 |us/op
> StringEquals.equal | 512| avgt| 5 | 59.414|± 2.341 |us/op
> StringEquals.equal | 1024| avgt| 5 |118.365|± 20.794 |us/op
>
> After opt:
> Benchmark |(size)| Mode| Cnt | Score| Error| Units
> -------------------|------|-----|-----|------|-------|------
> StringEquals.equal | 8| avgt| 5 | 2.333|± 0.003| us/op
> StringEquals.equal | 11| avgt| 5 | 2.333|± 0.001| us/op
> StringEquals.equal | 16| avgt| 5 | 2.332|± 0.002| us/op
> StringEquals.equal | 22| avgt| 5 | 3.265|± 0.404| us/op
> StringEquals.equal | 32| avgt| 5 | 3.875|± 0.002| us/op
> StringEquals.equal | 45| avgt| 5 | 5.793|± 0.331| us/op
> StringEquals.equal | 64| avgt| 5 | 6.730|± 0.054| us/op
> StringEquals.equal | 91| avgt| 5 | 8.611|± 0.075| us/op
> StringEquals.equal | 121| avgt| 5 |10.041|± 0.042| us/op
> StringEquals.equal | 181| avgt| 5 |13.968|± 0.653| us/op
> StringEquals.equal | 256| avgt| 5 |19.199|± 1.227| us/op
> StringEquals.equal | 512| avgt| 5 |39.508|±
1.784| us/op
> StringEquals.equal | 1024| avgt| 5 |77.883|± 1.290| us/op
@Wanghuang-Huawei any plans to re-open and fix this? ------------- PR: https://git.openjdk.java.net/jdk/pull/4423 From thartmann at openjdk.java.net Wed Mar 30 08:21:39 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Wed, 30 Mar 2022 08:21:39 GMT Subject: RFR: 8265768 [aarch64] Use glibc libm impl for dlog,dlog10,dexp iff 2.29 or greater on AArch64. In-Reply-To: References: Message-ID: On Tue, 25 May 2021 15:32:40 GMT, gregcawthorne wrote: >> Glibc 2.29 onwards provides optimised versions of log,log10,exp. >> These functions have an accuracy of 0.9ulp or better in glibc >> 2.29. >> >> Therefore this patch adds code to parse, store and check >> the runtime glibc version in os_linux.cpp/hpp. >> This is then used to select the glibc implementation of >> log, log10, exp at runtime for c1 and c2, iff we have >> glibc 2.29 or greater. >> >> This will ensure OpenJDK can benefit from future improvements >> to glibc. >> >> Glibc adheres to the ieee754 standard, unless stated otherwise >> in its spec. >> >> As there are no stated exceptions in the current glibc spec >> for dlog, dlog10 and dexp, we can assume they currently follow >> ieee754 (which testing confirms). As such, future versions of >> glibc are unlikely to lose this compliance with ieee754. >> >> W.r.t performance this patch sees ~15-30% performance improvements for >> log and log10, with ~50-80% performance improvements for exp for the >> common input ranges (which output real numbers). However for the NaN >> and inf output ranges we see a slow down of up to a factor of 2 for >> some functions and architectures. >> >> Due to this being the uncommon case we assert that this is a >> worthwhile tradeoff. > > greg.cawthorne at arm.com > > Should work @gregcawthorne any plans to re-open and fix this?
------------- PR: https://git.openjdk.java.net/jdk/pull/3510 From tschatzl at openjdk.java.net Wed Mar 30 08:51:24 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 30 Mar 2022 08:51:24 GMT Subject: RFR: 8283494: Factor out calculation of actual number of XMM registers [v2] In-Reply-To: References: Message-ID: <6bTFyyxs88dyy8zMP0VDc0DRQilIL1Epv5lh6lOaer0=.7155cc98-2c22-41ce-ac38-b71aef78bae2@github.com> > Hi all, > > can I have reviews for this change that factors out calculation of the actually available number of XMM registers on a given processor/given command line options into a method and reuse that as much as possible? > > This relates to this code snippet: > > int xmm_bypass_limit = FrameMap::nof_xmm_regs; > #ifdef _LP64 > if (UseAVX < 3) { > xmm_bypass_limit = xmm_bypass_limit / 2; > } > #endif > > > Also, there is already the method `FrameMap::get_num_caller_save_xmms()` that has been updated to use that new method `XMMRegisterImpl::available_xmm_registers()`; further I tried to appropriately use `FrameMap::get_num_caller_save_xmms` in the places where either would work. Please have a look in particular about that. > > I did not change strange to me variable names like `xmm_bypass_limit` above as they probably make some sense to somebody as it's used quite often. > > This also fixes a compilation error without configuring C1 introduced with [JDK-8283327](https://bugs.openjdk.java.net/browse/JDK-8283327). > > Testing: tier1-5 (all but linux-aarch64 done), gha, local compilation with `configure --with-features=-compiler1`. 
> > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Put method implementation in hpp file ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7917/files - new: https://git.openjdk.java.net/jdk/pull/7917/files/60c99edf..ad11cee6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7917&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7917&range=00-01 Stats: 21 lines in 2 files changed: 9 ins; 11 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7917.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7917/head:pull/7917 PR: https://git.openjdk.java.net/jdk/pull/7917 From aph at openjdk.java.net Wed Mar 30 09:05:41 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 30 Mar 2022 09:05:41 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals [v3] In-Reply-To: <0sQRhgsM2oRxLZwttYTCstpSGkO1kDeQAKTwsbVYsLA=.7eecf031-00e1-43ce-bc60-f6e0d2a6052e@github.com> References: <0sQRhgsM2oRxLZwttYTCstpSGkO1kDeQAKTwsbVYsLA=.7eecf031-00e1-43ce-bc60-f6e0d2a6052e@github.com> Message-ID: On Fri, 2 Jul 2021 09:55:59 GMT, Wang Huang wrote: >> Wang Huang has updated the pull request incrementally with one additional commit since the last revision: >> >> unroll when small string sizes > >> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-dev](mailto:hotspot-dev at mail.openjdk.java.net):_ >> >> I had to make some changes to the benchmark to get accurate timing, because >> it is swamped by JMH overhead for very small strings. >> >> It should be clear from my patch what I did. The most important part is >> to run the test code in a loop, or you won't see small effects. We're >> trying to measure something that only takes a few nanoseconds. >> >> This is what I see, Apple M1, two equal strings:
>>
>> Old:
>>
>> StringEquals.equal 8 avgt 5 0.948 ± 0.001 us/op
>> StringEquals.equal 11 avgt 5 0.948 ±
0.004 us/op
>> StringEquals.equal 16 avgt 5 0.948 ± 0.001 us/op
>> StringEquals.equal 22 avgt 5 1.260 ± 0.002 us/op
>> StringEquals.equal 32 avgt 5 1.886 ± 0.001 us/op
>> StringEquals.equal 45 avgt 5 2.514 ± 0.001 us/op
>> StringEquals.equal 64 avgt 5 3.141 ± 0.003 us/op
>> StringEquals.equal 91 avgt 5 4.395 ± 0.002 us/op
>> StringEquals.equal 121 avgt 5 5.653 ± 0.014 us/op
>> StringEquals.equal 181 avgt 5 8.011 ± 0.010 us/op
>> StringEquals.equal 256 avgt 5 11.433 ± 0.014 us/op
>> StringEquals.equal 512 avgt 5 23.005 ± 0.124 us/op
>> StringEquals.equal 1024 avgt 5 49.185 ± 0.032 us/op
>>
>> Your patch:
>>
>> Benchmark (size) Mode Cnt Score Error Units
>> StringEquals.equal 8 avgt 5 1.574 ± 0.001 us/op
>> StringEquals.equal 11 avgt 5 1.734 ± 0.004 us/op
>> StringEquals.equal 16 avgt 5 1.888 ± 0.002 us/op
>> StringEquals.equal 22 avgt 5 1.892 ± 0.003 us/op
>> StringEquals.equal 32 avgt 5 2.517 ± 0.003 us/op
>> StringEquals.equal 45 avgt 5 2.988 ± 0.002 us/op
>> StringEquals.equal 64 avgt 5 2.517 ± 0.003 us/op
>> StringEquals.equal 91 avgt 5 8.659 ± 0.007 us/op
>> StringEquals.equal 121 avgt 5 5.649 ± 0.007 us/op
>> StringEquals.equal 181 avgt 5 6.050 ± 0.009 us/op
>> StringEquals.equal 256 avgt 5 7.088 ± 0.016 us/op
>> StringEquals.equal 512 avgt 5 14.163 ± 0.018 us/op
>> StringEquals.equal 1024 avgt 5 29.998 ± 0.052 us/op
>> >> As you can see, we're looking at regressions all the way up to size=45, >> with something very odd happening at size=91. Finally the vectorized >> code starts to pull ahead at size=181. >> >> A few things: >> >> You should never be executing the TAIL unless the string is really >> short. Just do one pair of unaligned loads at the end to finish. >> >> Please don't use aliases for rscratch1 and rscratch2. Calling them tmp1 >> and tmp2 doesn't help the reader. >> >> So: please make sure the smaller strings are at least as good as >> they are now. Remember strings are usually short, so we can tolerate >> no regressions with the smaller sizes.
>> >> I don't think that Neon does any good here. This is what I get by rewriting >> (just) the stub with scalar registers, in the attached patch:
>>
>> Benchmark (size) Mode Cnt Score Error Units
>> StringEquals.equal 8 avgt 5 1.574 ± 0.004 us/op
>> StringEquals.equal 11 avgt 5 1.734 ± 0.003 us/op
>> StringEquals.equal 16 avgt 5 1.888 ± 0.002 us/op
>> StringEquals.equal 22 avgt 5 1.891 ± 0.003 us/op
>> StringEquals.equal 32 avgt 5 2.517 ± 0.001 us/op
>> StringEquals.equal 45 avgt 5 2.988 ± 0.002 us/op
>> StringEquals.equal 64 avgt 5 2.595 ± 0.004 us/op
>> StringEquals.equal 91 avgt 5 4.083 ± 0.006 us/op
>> StringEquals.equal 121 avgt 5 5.432 ± 0.006 us/op
>> StringEquals.equal 181 avgt 5 6.292 ± 0.009 us/op
>> StringEquals.equal 256 avgt 5 7.232 ± 0.008 us/op
>> StringEquals.equal 512 avgt 5 13.304 ± 0.012 us/op
>> StringEquals.equal 1024 avgt 5 25.537 ± 0.012 us/op
>> >> I use an editor with automatic indentation, as do many people, so >> I inserted brackets in the right places in the assembly code. >> >> -- >> Andrew Haley (he/him) >> Java Platform Lead Engineer >> Red Hat UK Ltd. >> https://keybase.io/andrewhaley >> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 >> -------------- next part -------------- >> A non-text attachment was scrubbed... >> Name: 8268229.patch >> Type: text/x-patch >> Size: 12464 bytes >> Desc: not available >> URL: > > @theRealAph Thank you for your suggestion. It's my fault that the JMH I used is not accurate. I changed my code and re-tested under your JMH:
>
> Before opt:
> Benchmark |(size)| Mode| Cnt | Score| Error |Units
> -------------------|------|-----|-----|-------|---------|-----
> StringEquals.equal | 8| avgt| 5 | 2.334|± 0.012 |us/op
> StringEquals.equal | 11| avgt| 5 | 2.335|± 0.012 |us/op
> StringEquals.equal | 16| avgt| 5 | 2.334|± 0.011 |us/op
> StringEquals.equal | 22| avgt| 5 | 3.414|± 0.422 |us/op
> StringEquals.equal | 32| avgt| 5 | 3.890|± 0.004 |us/op
> StringEquals.equal | 45| avgt| 5 | 5.610|±
0.023 |us/op
> StringEquals.equal | 64| avgt| 5 | 7.215|± 0.009 |us/op
> StringEquals.equal | 91| avgt| 5 | 12.305|± 1.716 |us/op
> StringEquals.equal | 121| avgt| 5 | 14.891|± 0.085 |us/op
> StringEquals.equal | 181| avgt| 5 | 21.502|± 0.050 |us/op
> StringEquals.equal | 256| avgt| 5 | 29.968|± 0.155 |us/op
> StringEquals.equal | 512| avgt| 5 | 59.414|± 2.341 |us/op
> StringEquals.equal | 1024| avgt| 5 |118.365|± 20.794 |us/op
>
> After opt:
> Benchmark |(size)| Mode| Cnt | Score| Error| Units
> -------------------|------|-----|-----|------|-------|------
> StringEquals.equal | 8| avgt| 5 | 2.333|± 0.003| us/op
> StringEquals.equal | 11| avgt| 5 | 2.333|± 0.001| us/op
> StringEquals.equal | 16| avgt| 5 | 2.332|± 0.002| us/op
> StringEquals.equal | 22| avgt| 5 | 3.265|± 0.404| us/op
> StringEquals.equal | 32| avgt| 5 | 3.875|± 0.002| us/op
> StringEquals.equal | 45| avgt| 5 | 5.793|± 0.331| us/op
> StringEquals.equal | 64| avgt| 5 | 6.730|± 0.054| us/op
> StringEquals.equal | 91| avgt| 5 | 8.611|± 0.075| us/op
> StringEquals.equal | 121| avgt| 5 |10.041|± 0.042| us/op
> StringEquals.equal | 181| avgt| 5 |13.968|± 0.653| us/op
> StringEquals.equal | 256| avgt| 5 |19.199|± 1.227| us/op
> StringEquals.equal | 512| avgt| 5 |39.508|± 1.784| us/op
> StringEquals.equal | 1024| avgt| 5 |77.883|± 1.290| us/op
> @Wanghuang-Huawei any plans to re-open and fix this? I hope not: it looks like a regression for common cases.
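The quoted advice to finish with one pair of unaligned loads, rather than a byte-by-byte tail, can be sketched in plain Java. This is only an illustration under stated assumptions, not the AArch64 stub from the PR: the class and method names are hypothetical, and the sketch reads 8 bytes at a time through `MethodHandles.byteArrayViewVarHandle`, whose plain `get` access tolerates unaligned offsets.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

public final class OverlappingTailEquals {
    // View a byte[] as 8-byte words; plain get() works at any byte offset.
    private static final VarHandle LONG_VIEW =
        MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.nativeOrder());

    /** Hypothetical helper: compares two byte arrays 8 bytes at a time. */
    static boolean equal(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return false;
        }
        int n = a.length;
        if (n < 8) {
            // Only really short inputs take the byte-wise path.
            for (int i = 0; i < n; i++) {
                if (a[i] != b[i]) {
                    return false;
                }
            }
            return true;
        }
        int i = 0;
        for (; i + 8 <= n; i += 8) {
            if ((long) LONG_VIEW.get(a, i) != (long) LONG_VIEW.get(b, i)) {
                return false;
            }
        }
        if (i == n) {
            return true; // length was a multiple of 8, nothing left over
        }
        // One final pair of loads covering the last 8 bytes. It overlaps bytes
        // that were already compared, which is harmless and avoids a tail loop.
        return (long) LONG_VIEW.get(a, n - 8) == (long) LONG_VIEW.get(b, n - 8);
    }

    public static void main(String[] args) {
        byte[] x = "hello, hotspot".getBytes();
        byte[] y = "hello, hotspot".getBytes();
        System.out.println(equal(x, y));
    }
}
```

The overlap trades a few redundant byte comparisons for the removal of a data-dependent loop, which is the design point the quoted review comment makes for the stub.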
------------- PR: https://git.openjdk.java.net/jdk/pull/4423 From tschatzl at openjdk.java.net Wed Mar 30 09:28:42 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 30 Mar 2022 09:28:42 GMT Subject: RFR: 8283494: Factor out calculation of actual number of XMM registers [v2] In-Reply-To: References: Message-ID: On Tue, 29 Mar 2022 22:34:13 GMT, Vladimir Kozlov wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> Put method implementation in hpp file > > src/hotspot/cpu/x86/register_x86.hpp line 169: > >> 167: // Actually available XMM registers for use, depending on actual CPU capabilities >> 168: // and flags. >> 169: static int available_xmm_registers(); > > Why not define function's body here? I do not think its use is performance critical considering its use in the surrounding code then that goes on and does something per xmm register. There is no strong opinion from me about this, so I moved it to the hpp file. Build times do not significantly change afaict. ------------- PR: https://git.openjdk.java.net/jdk/pull/7917 From tobias.hartmann at oracle.com Wed Mar 30 09:31:29 2022 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 30 Mar 2022 11:31:29 +0200 Subject: Result: New HotSpot Group Member: Dean Long Message-ID: The vote for Dean Long [1] is now closed. Yes: 18 Veto: 0 Abstain: 0 According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. Best regards, Tobias [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2022-March/058543.html From tobias.hartmann at oracle.com Wed Mar 30 09:31:27 2022 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 30 Mar 2022 11:31:27 +0200 Subject: Result: New HotSpot Group Member: Vladimir Ivanov Message-ID: The vote for Vladimir Ivanov [1] is now closed. 
Yes: 18 Veto: 0 Abstain: 0 According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. Best regards, Tobias [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2022-March/058542.html From xgong at openjdk.java.net Wed Mar 30 10:38:19 2022 From: xgong at openjdk.java.net (Xiaohong Gong) Date: Wed, 30 Mar 2022 10:38:19 GMT Subject: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature Message-ID: Currently, the masked vector load is implemented with pure Java scalar code when the given index falls outside the array boundary, to avoid the IOOBE (IndexOutOfBoundsException). This is necessary for architectures that do not support the predicate feature, because there the masked load is implemented with a full vector load and a vector blend applied to it, and a full vector load will definitely cause an IOOBE, which is not valid. However, for architectures that support the predicate feature like SVE/AVX-512/RVV, it can be vectorized with the predicated load instruction as long as the indexes of the masked lanes are within the bounds of the array. For these architectures, loading with unmasked lanes does not raise an exception. This patch adds the vectorization support for the masked load with IOOBE part. Please see the original Java implementation (FIXME: optimize):

    @ForceInline
    public static
    ByteVector fromArray(VectorSpecies<Byte> species,
                         byte[] a, int offset,
                         VectorMask<Byte> m) {
        ByteSpecies vsp = (ByteSpecies) species;
        if (offset >= 0 && offset <= (a.length - species.length())) {
            return vsp.dummyVector().fromArray0(a, offset, m);
        }

        // FIXME: optimize
        checkMaskFromIndexSize(offset, vsp, m, 1, a.length);
        return vsp.vOp(m, i -> a[offset + i]);
    }

Since this path can only be vectorized with the predicated load, HotSpot must check whether the current backend supports it and fall back to the Java scalar version if not.
This differs from the normal masked vector load, for which the compiler generates a full vector load and a vector blend if the predicated load is not supported. To make the compiler take the expected action, an additional flag (i.e. `usePred`) is added to the existing "loadMasked" intrinsic, with the value "true" for the IOOBE part and "false" for the normal load. The compiler will fail to intrinsify if the flag is "true" and the predicated load is not supported by the backend, which means that the normal Java path will be executed. The patch also adds the same vectorization support for masked:
- fromByteArray/fromByteBuffer
- fromBooleanArray
- fromCharArray

The performance of the newly added benchmarks improves by about `1.88x ~ 30.26x` on the x86 AVX-512 system:

Benchmark                                           Before     After     Units
LoadMaskedIOOBEBenchmark.byteLoadArrayMaskIOOBE     737.542    1387.069  ops/ms
LoadMaskedIOOBEBenchmark.doubleLoadArrayMaskIOOBE   118.366    330.776   ops/ms
LoadMaskedIOOBEBenchmark.floatLoadArrayMaskIOOBE    233.832    6125.026  ops/ms
LoadMaskedIOOBEBenchmark.intLoadArrayMaskIOOBE      233.816    7075.923  ops/ms
LoadMaskedIOOBEBenchmark.longLoadArrayMaskIOOBE     119.771    330.587   ops/ms
LoadMaskedIOOBEBenchmark.shortLoadArrayMaskIOOBE    431.961    939.301   ops/ms

A similar performance gain can also be observed on a 512-bit SVE system.
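The semantics that a predicated load has to preserve can be pictured without the Vector API at all. The following is a minimal sketch in plain Java, not the JDK implementation: `maskedLoad` and its `boolean[]` mask are hypothetical stand-ins for `fromArray` and `VectorMask`, and the bounds check only approximates what `checkMaskFromIndexSize` does. It shows why only the lanes whose mask bit is set need to be in bounds.

```java
import java.util.Arrays;

public final class MaskedLoadSketch {
    /**
     * Loads mask.length lanes from a, starting at offset. Lanes whose mask bit
     * is clear are never read and stay zero, so they may lie outside the array.
     * A set lane that falls outside the array throws.
     */
    static byte[] maskedLoad(byte[] a, int offset, boolean[] mask) {
        byte[] lanes = new byte[mask.length];
        for (int i = 0; i < mask.length; i++) {
            if (!mask[i]) {
                continue; // inactive lane: no memory access, result stays 0
            }
            int index = offset + i;
            if (index < 0 || index >= a.length) {
                throw new IndexOutOfBoundsException(
                    "lane " + i + " reads index " + index);
            }
            lanes[i] = a[index];
        }
        return lanes;
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3, 4, 5};
        // A 4-lane load starting at index 3: lanes 2 and 3 lie past the end
        // of the array, but they are masked off, so the load is legal.
        boolean[] tailMask = {true, true, false, false};
        System.out.println(Arrays.toString(maskedLoad(a, 3, tailMask)));
        // prints [4, 5, 0, 0]
    }
}
```

A full-width load followed by a blend would touch indexes 5 and 6 in this example, which is the access the scalar fallback (and a hardware predicated load) avoids.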
------------- Commit messages: - 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature Changes: https://git.openjdk.java.net/jdk/pull/8035/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8035&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8283667 Stats: 821 lines in 43 files changed: 314 ins; 117 del; 390 mod Patch: https://git.openjdk.java.net/jdk/pull/8035.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8035/head:pull/8035 PR: https://git.openjdk.java.net/jdk/pull/8035 From duke at openjdk.java.net Wed Mar 30 11:38:38 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Wed, 30 Mar 2022 11:38:38 GMT Subject: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature In-Reply-To: References: Message-ID: On Wed, 30 Mar 2022 10:31:59 GMT, Xiaohong Gong wrote: > Currently, the masked vector load is implemented with pure Java scalar code when the given index falls outside the array boundary, to avoid the IOOBE (IndexOutOfBoundsException). This is necessary for architectures that do not support the predicate feature, because there the masked load is implemented with a full vector load and a vector blend applied to it, and a full vector load will definitely cause an IOOBE, which is not valid. However, for architectures that support the predicate feature like SVE/AVX-512/RVV, it can be vectorized with the predicated load instruction as long as the indexes of the masked lanes are within the bounds of the array. For these architectures, loading with unmasked lanes does not raise an exception. > > This patch adds the vectorization support for the masked load with IOOBE part.
Please see the original Java implementation (FIXME: optimize):
>
> @ForceInline
> public static
> ByteVector fromArray(VectorSpecies<Byte> species,
>                      byte[] a, int offset,
>                      VectorMask<Byte> m) {
>     ByteSpecies vsp = (ByteSpecies) species;
>     if (offset >= 0 && offset <= (a.length - species.length())) {
>         return vsp.dummyVector().fromArray0(a, offset, m);
>     }
>
>     // FIXME: optimize
>     checkMaskFromIndexSize(offset, vsp, m, 1, a.length);
>     return vsp.vOp(m, i -> a[offset + i]);
> }
>
> Since this path can only be vectorized with the predicated load, HotSpot must check whether the current backend supports it and fall back to the Java scalar version if not. This differs from the normal masked vector load, for which the compiler generates a full vector load and a vector blend if the predicated load is not supported. To make the compiler take the expected action, an additional flag (i.e. `usePred`) is added to the existing "loadMasked" intrinsic, with the value "true" for the IOOBE part and "false" for the normal load. The compiler will fail to intrinsify if the flag is "true" and the predicated load is not supported by the backend, which means that the normal Java path will be executed.
>
> The patch also adds the same vectorization support for masked:
> - fromByteArray/fromByteBuffer
> - fromBooleanArray
> - fromCharArray
>
> The performance of the newly added benchmarks improves by about `1.88x ~ 30.26x` on the x86 AVX-512 system:
>
> Benchmark                                           Before     After     Units
> LoadMaskedIOOBEBenchmark.byteLoadArrayMaskIOOBE     737.542    1387.069  ops/ms
> LoadMaskedIOOBEBenchmark.doubleLoadArrayMaskIOOBE   118.366    330.776   ops/ms
> LoadMaskedIOOBEBenchmark.floatLoadArrayMaskIOOBE    233.832    6125.026  ops/ms
> LoadMaskedIOOBEBenchmark.intLoadArrayMaskIOOBE      233.816    7075.923  ops/ms
> LoadMaskedIOOBEBenchmark.longLoadArrayMaskIOOBE     119.771    330.587   ops/ms
> LoadMaskedIOOBEBenchmark.shortLoadArrayMaskIOOBE    431.961    939.301   ops/ms
>
> A similar performance gain can also be observed on a 512-bit SVE system.
AVX has `vmaskmovpd` and `vmaskmovps` for masked loads and stores, which do not require predicate vectors. I think the implementation should make it possible to take advantage of these instructions. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/8035 From fyang at openjdk.java.net Wed Mar 30 12:30:38 2022 From: fyang at openjdk.java.net (Fei Yang) Date: Wed, 30 Mar 2022 12:30:38 GMT Subject: RFR: JDK-8283865: riscv: Break down -XX:+UseRVB into seperate options for each bitmanip extension In-Reply-To: References: Message-ID: <4sGs352_X10iUZxRUCj5JnMI6sZzjUIjCTU0L3uUvts=.cf99c3ce-20a2-4fe0-866b-8852790bbba9@github.com> On Wed, 30 Mar 2022 07:02:06 GMT, Feilong Jiang wrote: > Currently openjdk riscv supports RISC-V bitmanip extension as a bundle while spec provides four individual extensions: Zb[abcs][1]. > > According to the spec, we need to break down `UseRVB` into two individual options `UseZba` and `UseZbb` to enable or disable Zba and Zbb respectively (openjdk riscv only supports Zba and Zbb for now). > > Since multi-letter extensions representation in the ISA bitmap is still not determined [2][3], availability for those extensions could not be queried from HWCAP. Feature detection of Zba and Zbb was removed temporarily.
> > Linux RISCV64 release hotspot/jdk tier1 tests pass on QEMU with the following options: > - [x] +UseZba && +UseZbb > - [x] +UseZba && -UseZbb > - [x] -UseZba && +UseZbb > > [1]: https://github.com/riscv/riscv-bitmanip/releases/download/1.0.0/bitmanip-1.0.0-38-g865e7a7.pdf > [2]: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010250.html > [3]: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010252.html src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1524: > 1522: // Rd[31:0] = Rs[23:16] Rs[31:24] Rs[7:0] Rs[15:8] (zero-extend to 64 bits) > 1523: void MacroAssembler::revb_h_w_u(Register Rd, Register Rs, Register tmp1, Register tmp2) { > 1524: if (UseZba && UseZbb) { Requiring both ISA extensions here might not be a good idea with respect to performance. We should distinguish more finely and use the instructions of each ISA extension individually when possible. ------------- PR: https://git.openjdk.java.net/jdk/pull/8032 From ccheung at openjdk.java.net Wed Mar 30 15:51:39 2022 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Wed, 30 Mar 2022 15:51:39 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() [v3] In-Reply-To: References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: On Tue, 29 Mar 2022 05:41:19 GMT, Ioi Lam wrote: >> - Remove all the complex `sscanf()` calls in `Arguments::parse_argument()` >> - Call the appropriate parsing function according to the type of the flag >> - Added more test cases for flags of the `double` type. >> >> As a result of this change, `double` flags can now be specified in more ways, as long as the input is accepted by `strtod()`. However, `NaN` and `INFINITY` values are not allowed because the VM probably cannot handle them. Please see the test case for details. >> >> Tested with tiers 1-5.
> > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > moved comment Looks good. I have a question. src/hotspot/share/runtime/arguments.cpp line 1069: > 1067: > 1068: size_t name_len = size_t(arg - name); > 1069: JVMFlag* flag = find_jvm_flag(name, name_len); Consider -XX:@blah I think the name_len could be 0 if the preceding while loop did not increment the arg pointer. But I think it is ok because find_jvm_flag would return NULL on a zero name_len. Is that correct? ------------- Marked as reviewed by ccheung (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7916 From duke at openjdk.java.net Wed Mar 30 16:17:38 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Wed, 30 Mar 2022 16:17:38 GMT Subject: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature In-Reply-To: References: Message-ID: On Wed, 30 Mar 2022 10:31:59 GMT, Xiaohong Gong wrote: > Currently, a masked vector load whose index range falls partly outside the array bounds is implemented with pure Java scalar code to avoid the IOOBE (IndexOutOfBoundsException). This is necessary for architectures that do not support the predicate feature, because there the masked load is implemented as a full vector load with a vector blend applied to it, and a full vector load would raise the IOOBE, which is not valid. However, for architectures that support the predicate feature, such as SVE/AVX-512/RVV, it can be vectorized with the predicated load instruction as long as the indexes of the masked lanes are within the bounds of the array. For these architectures, loading with unmasked lanes does not raise an exception. > > This patch adds the vectorization support for the masked load with IOOBE part.
Please see the original Java implementation (FIXME: optimize):
>
>     @ForceInline
>     public static
>     ByteVector fromArray(VectorSpecies<Byte> species,
>                          byte[] a, int offset,
>                          VectorMask<Byte> m) {
>         ByteSpecies vsp = (ByteSpecies) species;
>         if (offset >= 0 && offset <= (a.length - species.length())) {
>             return vsp.dummyVector().fromArray0(a, offset, m);
>         }
>
>         // FIXME: optimize
>         checkMaskFromIndexSize(offset, vsp, m, 1, a.length);
>         return vsp.vOp(m, i -> a[offset + i]);
>     }
>
> Since it can only be vectorized with a predicated load, HotSpot must check whether the current backend supports it and fall back to the Java scalar version if not. This differs from the normal masked vector load, for which the compiler generates a full vector load and a vector blend if the predicated load is not supported. So that the compiler takes the expected action, an additional flag (i.e. `usePred`) is added to the existing "loadMasked" intrinsic, with the value "true" for the IOOBE part and "false" for the normal load. The compiler will fail to intrinsify if the flag is "true" and the predicated load is not supported by the backend, which means that the normal Java path will be executed.
>
> This patch also adds the same vectorization support for masked:
> - fromByteArray/fromByteBuffer
> - fromBooleanArray
> - fromCharArray
>
> The performance of the newly added benchmarks improves by about `1.88x ~ 30.26x` on the x86 AVX-512 system:
>
> Benchmark                                          before    After     Units
> LoadMaskedIOOBEBenchmark.byteLoadArrayMaskIOOBE    737.542   1387.069  ops/ms
> LoadMaskedIOOBEBenchmark.doubleLoadArrayMaskIOOBE  118.366   330.776   ops/ms
> LoadMaskedIOOBEBenchmark.floatLoadArrayMaskIOOBE   233.832   6125.026  ops/ms
> LoadMaskedIOOBEBenchmark.intLoadArrayMaskIOOBE     233.816   7075.923  ops/ms
> LoadMaskedIOOBEBenchmark.longLoadArrayMaskIOOBE    119.771   330.587   ops/ms
> LoadMaskedIOOBEBenchmark.shortLoadArrayMaskIOOBE   431.961   939.301   ops/ms
>
> A similar performance gain can also be observed on a 512-bit SVE system.
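
For readers following the quoted description, the semantics that the scalar fallback has to preserve can be modeled roughly as below. This is only a sketch under stated assumptions: `MaskedLoadSketch` and `maskedLoad` are hypothetical names, a `boolean[]` stands in for `VectorMask<Byte>`, and nothing here is the Vector API or the HotSpot intrinsic itself. It shows that only lanes whose mask bit is set are bounds-checked and read, so a load that overlaps the end of the array is legal as long as the out-of-range lanes are unset.

```java
import java.util.Arrays;

// Hypothetical scalar model of a masked load; not part of jdk.incubator.vector.
final class MaskedLoadSketch {

    // Loads mask.length lanes starting at a[offset]. Lanes whose mask bit is
    // unset are never read and stay zero, so they cannot trigger an exception.
    static byte[] maskedLoad(byte[] a, int offset, boolean[] mask) {
        int lanes = mask.length;
        byte[] result = new byte[lanes];
        // Fast-path condition from the quoted code: the whole vector is in bounds.
        boolean fullyInBounds = offset >= 0 && offset <= a.length - lanes;
        for (int i = 0; i < lanes; i++) {
            if (!mask[i]) {
                continue; // unset lane: no memory access at all
            }
            int index = offset + i;
            // Slow path: each set lane must individually lie inside the array.
            if (!fullyInBounds && (index < 0 || index >= a.length)) {
                throw new IndexOutOfBoundsException(
                        "lane " + i + " reads index " + index);
            }
            result[i] = a[index];
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3, 4, 5};
        // Four lanes starting at index 3 overlap the end of the array,
        // but the two out-of-range lanes are masked off.
        boolean[] tail = {true, true, false, false};
        System.out.println(Arrays.toString(maskedLoad(a, 3, tail)));
    }
}
```

A backend with predicated loads can perform the same operation in one instruction because the hardware suppresses accesses for unset lanes; a backend limited to load plus blend would have to read all four lanes and therefore cannot take this path.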
src/hotspot/share/opto/vectorIntrinsics.cpp line 1234: > 1232: bool use_predicate = false; > 1233: if (is_store) { > 1234: // Masked vector store always uses the predicated store. Can masked store be implemented as Load + Blend + Store? ------------- PR: https://git.openjdk.java.net/jdk/pull/8035 From kvn at openjdk.java.net Wed Mar 30 17:31:41 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 30 Mar 2022 17:31:41 GMT Subject: RFR: 8283494: Factor out calculation of actual number of XMM registers [v2] In-Reply-To: <6bTFyyxs88dyy8zMP0VDc0DRQilIL1Epv5lh6lOaer0=.7155cc98-2c22-41ce-ac38-b71aef78bae2@github.com> References: <6bTFyyxs88dyy8zMP0VDc0DRQilIL1Epv5lh6lOaer0=.7155cc98-2c22-41ce-ac38-b71aef78bae2@github.com> Message-ID: On Wed, 30 Mar 2022 08:51:24 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change that factors out calculation of the actually available number of XMM registers on a given processor/given command line options into a method and reuse that as much as possible? >> >> This relates to this code snippet: >> >> int xmm_bypass_limit = FrameMap::nof_xmm_regs; >> #ifdef _LP64 >> if (UseAVX < 3) { >> xmm_bypass_limit = xmm_bypass_limit / 2; >> } >> #endif >> >> >> Also, there is already the method `FrameMap::get_num_caller_save_xmms()` that has been updated to use that new method `XMMRegisterImpl::available_xmm_registers()`; further I tried to appropriately use `FrameMap::get_num_caller_save_xmms` in the places where either would work. Please have a look in particular about that. >> >> I did not change strange to me variable names like `xmm_bypass_limit` above as they probably make some sense to somebody as it's used quite often. >> >> This also fixes a compilation error without configuring C1 introduced with [JDK-8283327](https://bugs.openjdk.java.net/browse/JDK-8283327). >> >> Testing: tier1-5 (all but linux-aarch64 done), gha, local compilation with `configure --with-features=-compiler1`. 
>> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Put method implementation in hpp file Good. In a 32-bit VM it would be just one instruction when inlined. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7917 From mcimadamore at openjdk.java.net Wed Mar 30 18:06:28 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Wed, 30 Mar 2022 18:06:28 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v14] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Fix bad usage of `@link` with primitive array types ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/43dc6be3..0bcc8664 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=13 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=12-13 Stats: 7 lines in 1 file changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From tschatzl at openjdk.java.net Wed Mar 30 18:17:36 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 30 Mar 2022 18:17:36 GMT Subject: RFR: 8283494: Factor out calculation of actual number of XMM registers [v2] In-Reply-To: References: Message-ID: On Tue, 29 Mar 2022 22:12:30 GMT, Dean Long wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:
>> >> Put method implementation in hpp file > > LGTM. Thanks @dean-long @vnkozlov for your review ------------- PR: https://git.openjdk.java.net/jdk/pull/7917 From tschatzl at openjdk.java.net Wed Mar 30 18:17:37 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 30 Mar 2022 18:17:37 GMT Subject: Integrated: 8283494: Factor out calculation of actual number of XMM registers In-Reply-To: References: Message-ID: On Wed, 23 Mar 2022 08:52:41 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that factors out calculation of the actually available number of XMM registers on a given processor/given command line options into a method and reuse that as much as possible? > > This relates to this code snippet: > > int xmm_bypass_limit = FrameMap::nof_xmm_regs; > #ifdef _LP64 > if (UseAVX < 3) { > xmm_bypass_limit = xmm_bypass_limit / 2; > } > #endif > > > Also, there is already the method `FrameMap::get_num_caller_save_xmms()` that has been updated to use that new method `XMMRegisterImpl::available_xmm_registers()`; further I tried to appropriately use `FrameMap::get_num_caller_save_xmms` in the places where either would work. Please have a look in particular about that. > > I did not change strange to me variable names like `xmm_bypass_limit` above as they probably make some sense to somebody as it's used quite often. > > This also fixes a compilation error without configuring C1 introduced with [JDK-8283327](https://bugs.openjdk.java.net/browse/JDK-8283327). > > Testing: tier1-5 (all but linux-aarch64 done), gha, local compilation with `configure --with-features=-compiler1`. > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: ce27d9dd Author: Thomas Schatzl URL: https://git.openjdk.java.net/jdk/commit/ce27d9dd5e1899c74ca2120e3e70420973eb241c Stats: 67 lines in 8 files changed: 16 ins; 37 del; 14 mod 8283494: Factor out calculation of actual number of XMM registers Reviewed-by: dlong, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/7917 From iklam at openjdk.java.net Wed Mar 30 20:39:16 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 30 Mar 2022 20:39:16 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() [v4] In-Reply-To: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: > - Remove all the complex `sscanf()` calls in `Arguments::parse_argument()` > - Call the appropriate parsing function according to the type of the flag > - Added more test cases for flags of the `double` type. > > As a result of this change, `double` flags can now be specified in more ways, as long as the input is accepted by `strtod()`. However, `NaN` and `INFINITY` values are not allowed because the VM probably cannot handle them. Please see the test case for details. > > Tested with tiers 1-5. 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @calvinccheung comment: check for zero-length flag name ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7916/files - new: https://git.openjdk.java.net/jdk/pull/7916/files/6fb2ccef..3f1f28f7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7916&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7916&range=02-03 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7916.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7916/head:pull/7916 PR: https://git.openjdk.java.net/jdk/pull/7916 From iklam at openjdk.java.net Wed Mar 30 20:39:19 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 30 Mar 2022 20:39:19 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() [v3] In-Reply-To: References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: On Wed, 30 Mar 2022 15:47:23 GMT, Calvin Cheung wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> moved comment > > src/hotspot/share/runtime/arguments.cpp line 1069: > >> 1067: >> 1068: size_t name_len = size_t(arg - name); >> 1069: JVMFlag* flag = find_jvm_flag(name, name_len); > > Consider -XX:@blah > I think the name_len could be 0 if the preceding while loop did not increment the arg pointer. > But I think it is ok because find_jvm_flag would return NULL on a zero name_len. > Is it correct? The code does handle the case where `name_len==0`, but I added a new check anyway to make the code easier to understand. 
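
The zero-length case discussed above can be illustrated with a small model of the name-scanning step. This is a sketch only, written in Java rather than the C++ of `arguments.cpp`; `FlagNameSketch`, `parseFlagName` and `isNameChar` are hypothetical names, and the assumption that a flag name consists of ASCII letters, digits and underscores is mine, not taken from the patch. It shows why `-XX:@blah` or `-XX:=0x10000000` yields an empty name that can be rejected before any flag lookup.

```java
// Hypothetical model of scanning a flag name out of the text after "-XX:".
final class FlagNameSketch {

    // Assumed character set for flag names (not confirmed by the source).
    static boolean isNameChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '_';
    }

    // Returns the flag name at the start of arg (after an optional '+' or '-'),
    // or null when the scan did not advance, i.e. the name length is zero.
    static String parseFlagName(String arg) {
        int start = 0;
        if (!arg.isEmpty() && (arg.charAt(0) == '+' || arg.charAt(0) == '-')) {
            start = 1; // boolean flags are written as +Name or -Name
        }
        int end = start;
        while (end < arg.length() && isNameChar(arg.charAt(end))) {
            end++;
        }
        int nameLen = end - start;
        if (nameLen == 0) {
            return null; // explicit zero-length check, mirroring the added guard
        }
        return arg.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(parseFlagName("SharedBaseAddress=0x10000000"));
        System.out.println(parseFlagName("@SharedBaseAddress=0x10000000"));
        System.out.println(parseFlagName("=0x10000000"));
    }
}
```

Returning early on an empty name does not change which options are rejected; it only makes the rejection visible at the point where the name is computed instead of relying on the lookup to fail.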
------------- PR: https://git.openjdk.java.net/jdk/pull/7916 From iklam at openjdk.java.net Wed Mar 30 20:39:19 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 30 Mar 2022 20:39:19 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() [v3] In-Reply-To: References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: <0c5wFFbT3YhEr2DrqAPpSEYsDoYI6hTwT0Buao2gRz0=.d4720481-8e39-4365-893c-8494eecdd8f3@github.com> On Wed, 30 Mar 2022 20:34:57 GMT, Ioi Lam wrote: >> src/hotspot/share/runtime/arguments.cpp line 1069: >> >>> 1067: >>> 1068: size_t name_len = size_t(arg - name); >>> 1069: JVMFlag* flag = find_jvm_flag(name, name_len); >> >> Consider -XX:@blah >> I think the name_len could be 0 if the preceding while loop did not increment the arg pointer. >> But I think it is ok because find_jvm_flag would return NULL on a zero name_len. >> Is it correct? > > The code does handle the case where `name_len==0`, but I added a new check anyway to make the code easier to understand. I tested a few cases and the results are the same before/after adding the `name_len` check. $ java -XX:SharedBaseAddress=0x10000000 --version java 19-internal 2022-09-20 Java(TM) SE Runtime Environment (slowdebug build 19-internal-adhoc.iklam.ken) Java HotSpot(TM) 64-Bit Server VM (slowdebug build 19-internal-adhoc.iklam.ken, mixed mode) $ java -XX:XSharedBaseAddress=0x10000000 --version Unrecognized VM option 'XSharedBaseAddress=0x10000000' Did you mean 'SharedBaseAddress='? Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. $ java -XX:@SharedBaseAddress=0x10000000 --version Unrecognized VM option '@SharedBaseAddress=0x10000000' Did you mean 'SharedBaseAddress='? Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. 
$ java -XX:=0x10000000 --version Unrecognized VM option '=0x10000000' Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. $ java -XX:@=0x10000000 --version Unrecognized VM option '@=0x10000000' Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. ------------- PR: https://git.openjdk.java.net/jdk/pull/7916 From ccheung at openjdk.java.net Wed Mar 30 20:56:40 2022 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Wed, 30 Mar 2022 20:56:40 GMT Subject: RFR: 8283013: Simplify Arguments::parse_argument() [v4] In-Reply-To: References: <1ZNBw1bE2iANq4sAaTJCYC1pBdwNbysY_X_X193ZK6o=.f4a0bbea-779e-4f13-9d4a-08f34083f2e6@github.com> Message-ID: On Wed, 30 Mar 2022 20:39:16 GMT, Ioi Lam wrote: >> - Remove all the complex `sscanf()` calls in `Arguments::parse_argument()` >> - Call the appropriate parsing function according to the type of the flag >> - Added more test cases for flags of the `double` type. >> >> As a result of this change, `double` flags can now be specified in more ways, as long as the input is accepted by `strtod()`. However, `NaN` and `INFINITY` values are not allowed because the VM probably cannot handle them. Please see the test case for details. >> >> Tested with tiers 1-5. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @calvinccheung comment: check for zero-length flag name Marked as reviewed by ccheung (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7916 From mcimadamore at openjdk.java.net Wed Mar 30 20:59:34 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Wed, 30 Mar 2022 20:59:34 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v15] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. 
A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Tweak FunctionDescriptor::argumentLayouts to return an immutable list ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7888/files - new: https://git.openjdk.java.net/jdk/pull/7888/files/0bcc8664..af41a76c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=14 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=13-14 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From mcimadamore at openjdk.java.net Wed Mar 30 21:51:16 2022 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Wed, 30 Mar 2022 21:51:16 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v16] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/424 Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 31 commits: - Merge branch 'master' into foreign-preview - Tweak FunctionDescriptor::argumentLayouts to return an immutable list - Fix bad usage of `@link` with primitive array types - Switch to daemon threads for async upcalls - Use thread local storage to optimize attach of async threads - Drop support for Constable from MemoryLayout/FunctionDescriptor - Merge branch 'master' into foreign-preview - Revert changes to RunTests.gmk - Add --enable-preview to micro benchmark java options - Address more review comments - ... and 21 more: https://git.openjdk.java.net/jdk/compare/ce27d9dd...247e5eb5 ------------- Changes: https://git.openjdk.java.net/jdk/pull/7888/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7888&range=15 Stats: 64862 lines in 366 files changed: 43028 ins; 19321 del; 2513 mod Patch: https://git.openjdk.java.net/jdk/pull/7888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7888/head:pull/7888 PR: https://git.openjdk.java.net/jdk/pull/7888 From duke at openjdk.java.net Wed Mar 30 22:25:44 2022 From: duke at openjdk.java.net (ExE Boss) Date: Wed, 30 Mar 2022 22:25:44 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v15] In-Reply-To: References: Message-ID: On Wed, 30 Mar 2022 20:59:34 GMT, Maurizio Cimadamore wrote: >> This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. 
>> >> [1] - https://openjdk.java.net/jeps/424 > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Tweak FunctionDescriptor::argumentLayouts to return an immutable list src/java.base/share/classes/java/lang/foreign/FunctionDescriptor.java line 73: > 71: */ > 72: public List<MemoryLayout> argumentLayouts() { > 73: return Collections.unmodifiableList(argLayouts); This change doesn't seem to be necessary, as `FunctionDescriptor` is already created with `List.of(…)` (or `Stream.toList()` in the case of `FunctionDescriptor.VariadicFunction`), which produce immutable lists (although `Stream.toList()` permits `null`s, which `Stream.collect(Collectors.toImmutableList())` and `List.of(…)` doesn't). ------------- PR: https://git.openjdk.java.net/jdk/pull/7888 From psandoz at openjdk.java.net Wed Mar 30 22:44:42 2022 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 30 Mar 2022 22:44:42 GMT Subject: RFR: 8282191: Implementation of Foreign Function & Memory API (Preview) [v16] In-Reply-To: References: Message-ID: On Wed, 30 Mar 2022 21:51:16 GMT, Maurizio Cimadamore wrote: >> This PR contains the API and implementation changes for JEP-424 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. >> >> [1] - https://openjdk.java.net/jeps/424 > > Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 31 commits: > > - Merge branch 'master' into foreign-preview > - Tweak FunctionDescriptor::argumentLayouts to return an immutable list > - Fix bad usage of `@link` with primitive array types > - Switch to daemon threads for async upcalls > - Use thread local storage to optimize attach of async threads > - Drop support for Constable from MemoryLayout/FunctionDescriptor > - Merge branch 'master' into foreign-preview > - Revert changes to RunTests.gmk > - Add --enable-preview to micro benchmark java options > - Address more review comments > - ... and 21 more: https://git.openjdk.java.net/jdk/compare/ce27d9dd...247e5eb5 Java code looks good (I did not go through the tests). As is common, no comments, since the code was reviewed in smaller steps in the panama-foreign repository. ------------- Marked as reviewed by psandoz (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7888 From duke at openjdk.java.net Wed Mar 30 23:11:22 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Wed, 30 Mar 2022 23:11:22 GMT Subject: RFR: 8283726: x86 intrinsics for compare method in Integer and Long [v2] In-Reply-To: References: Message-ID: > Implements x86 intrinsics for compare() method in java.lang.Integer and java.lang.Long. Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains four additional commits since the last revision: - refactored x86_64.ad code to macro assembly routines - Merge branch 'master' of https://git.openjdk.java.net/jdk into cmp - add JMH benchmarks - 8283726: x86 intrinsics for compare method in Integer and Long ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7975/files - new: https://git.openjdk.java.net/jdk/pull/7975/files/b0c3314d..79e4aa50 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7975&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7975&range=00-01 Stats: 5452 lines in 218 files changed: 3925 ins; 856 del; 671 mod Patch: https://git.openjdk.java.net/jdk/pull/7975.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7975/head:pull/7975 PR: https://git.openjdk.java.net/jdk/pull/7975 From kim.barrett at oracle.com Thu Mar 31 21:16:58 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 31 Mar 2022 21:16:58 +0000 Subject: Result: New HotSpot Group Member: Sangheon Kim Message-ID: <74B7A7C3-E578-4E72-8947-162610489726@oracle.com> The vote for Sangheon Kim [1] is now closed. Yes: 13 Veto: 0 Abstain: 0 According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. Kim Barrett [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2022-March/058562.html From kim.barrett at oracle.com Thu Mar 31 21:18:53 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 31 Mar 2022 21:18:53 +0000 Subject: Result: New HotSpot Group Member: Ivan Walulya Message-ID: <5A892087-9421-4935-8719-C4AECEF1D796@oracle.com> The vote for Ivan Walulya [1] is now closed. Yes: 11 Veto: 0 Abstain: 0 According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. 
Kim Barrett [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2022-March/058564.html From kim.barrett at oracle.com Thu Mar 31 21:25:03 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 31 Mar 2022 21:25:03 +0000 Subject: Result: New HotSpot Group Member: Leo Korinth Message-ID: <13F180E9-1C4B-46EB-A6ED-91056945B5D0@oracle.com> The vote for Leo Korinth [1] is now closed. Yes: 15 Veto: 0 Abstain: 0 According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. Kim Barrett [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2022-March/058565.html From kim.barrett at oracle.com Thu Mar 31 21:27:49 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 31 Mar 2022 21:27:49 +0000 Subject: Result: New HotSpot Group Member: Albert Mingkun Yang Message-ID: The vote for Albert Mingkun Yang [1] is now closed. Yes: 11 Veto: 0 Abstain: 0 According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. Kim Barrett [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2022-March/058568.html